The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: May 28, 2002
XML Articles and Papers. April - June 2001.

XML General Articles and Papers: Surveys, Overviews, Presentations, Introductions, Announcements

The following list of articles and papers on XML represents a mixed collection of references: articles in professional journals, slide sets from presentations, press releases, articles in trade magazines, Usenet News postings, etc. Some are from experts and some are not; some are refereed and others are not; some are semi-technical and others are popular; some contain errors and others don't. Discretion is strongly advised. The articles are listed approximately in the reverse chronological order of their appearance. Publications covering specific XML applications may be referenced in the dedicated sections rather than in the following listing.

June 2001

  • [June 29, 2001] "XML for Data: Using XML Schema Archetypes. Adding Archetypal Forms to Your XML Schemas." By Kevin Williams (Chief XML architect, Equient - a division of Veridian). From IBM developerWorks. June 2001. ['In the first installment of his new column, Kevin Williams describes the benefits of using archetypes in XML Schema designs for data and provides some concrete examples. He discusses both simple and complex types, and some advantages of using each. Code samples in XML Schema are provided.'] "In my turn on the Soapbox, I mentioned in passing how archetypes can be used in XML Schema designs for data to significantly minimize the coding and maintenance effort required for a project, and to reduce the likelihood of cut-and-paste errors. In this column, I'm going to give you some examples of the use of archetypes in XML schemas for data, and show just where the benefits lie. What are archetypes? Archetypes are common definitions that can be shared across different elements in your XML schemas. In earlier versions of the XML Schema specification, archetypes had their own declarations; in the released version, however, 'archetypes' are implemented using the simpleType and complexType elements. Let's take a look at some examples of each. Simple archetypes are created by extending the built-in datatypes provided by XML Schema. The allowable values for the type may be constrained by so-called facets, a fancy term for the different parameters that may be set for each built-in datatype. It's also possible to create a simple type by defining a union of two other datatypes or by creating a list of values that correspond to some other datatype. For our purposes, however, the restrictive declaration of simple types is the most interesting. Let's take a look at some examples... This installment has taken a look at the use of archetypes in the design of XML schemas. 
You've seen that judicious use of archetypes, together with smart naming conventions, can make schemas shorter and easier to maintain. There's an additional benefit to using archetypes -- a little trick to ensure consistent styling of your information..." Note the reference to the author's book Professional XML Schemas [ISBN: 1861005474], from Wrox Press; released now/soon. For schema description and references, see "XML Schemas."
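The idea of a shared simple "archetype" can be illustrated with a minimal sketch. The schema below is a hypothetical example written for this note, not taken from the article: the names `ShortString`, `firstName`, and `lastName` are assumptions. It defines one named `simpleType` restricted by a `maxLength` facet, reuses it from two elements, and uses Python's standard library only to inspect that structure (it does not validate instance documents).

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema: one named simple type (the "archetype") restricted
# by a facet and shared by two element declarations.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="ShortString">
    <xs:restriction base="xs:string">
      <xs:maxLength value="20"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="firstName" type="ShortString"/>
  <xs:element name="lastName" type="ShortString"/>
</xs:schema>"""

root = ET.fromstring(SCHEMA)
simple_type = root.find(f"{XS}simpleType")
restriction = simple_type.find(f"{XS}restriction")

# Collect the facets declared on the restriction (facet name -> value).
facets = {child.tag.replace(XS, ""): child.get("value") for child in restriction}

# List the elements that reuse the shared type by name.
users = [
    e.get("name")
    for e in root.findall(f"{XS}element")
    if e.get("type") == simple_type.get("name")
]

print(facets)
print(users)
```

Changing the facet value in the single `ShortString` definition would change the constraint for every element that references it, which is the maintenance benefit the entry describes.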

  • [June 26, 2001] "DAML Processing in Jess (DAMLJessKB)." From Joe Kopena. 2001-06-26. "This software is intended to facilitate reading DAML files, interpreting the information as per the DAML language, and allowing the user to query on that information. In this software we leverage the existing RDF API (SiRPAC) to read in the DAML file as a collection of RDF triples. We use Jess (Java Expert System Shell) as a forward chaining production system which carries out the rules of the DAML language. The core Jess language is compatible with CLIPS and this work might be portable to that system. A similar approach is taken by DAML API, they also hook RDF API into Jess. However the bridge they use between the two is a little different and at the moment less complete in at least the publicly available version. The basic flow of this library is as follows: (1) Read in Jess rules and facts representing the DAML language; (2) Have RDF API read in the DAML file and create SVO triples; (3) Take triples and assert into Jess' rete network in VSO form, with some slight escaping of literals and translation; (4) Have Jess apply the rules of the language to the data; (5) Apply the agent's rules, queries, etc. The bridge between RDF API and Jess is very simple: each triple is inserted more or less as-is into the knowledge base. A not insignificant help in this is Jess' relatively loose syntax constraints, very few characters need to be escaped to be valid. In Jess these are referred to as ordered slots. An alternative would be to build Jess' unordered (named) slots. This would require more preprocessing of the triples to determine relations. It might be more efficient but also might break down due to the cumulative nature of DAML/RDF -- facts about an object can be asserted at any time and don't necessarily follow the template. In DAML/RDF it is ok to assert an arbitrary relation about an object at any time unless specifically stated otherwise. 
This might not mesh well with Jess' templating mechanism. We generally follow the methodology of the DAML/RDF/RDF-S KIF Axiomatization in building our rules. Each fact is asserted as the sentence (PropertyValue <predicate> <subject> <object>). This is sufficient to assert any RDF/DAML information, since all constructs boil down to an underlying set of triples..." Note from Joe Kopena on 'www-rdf-interest@w3.org', 2001-06-26: "At the moment I'm working on a project using DAML to exchange information between units (arguably agents). I'm using it to encode my data and for the ontologies which express what the data means. Recently I've been doing some work on taking in DAML through RDF API and feeding it into Jess (Java Expert System Shell) to be processed. The result is that the data gets treated as DAML as opposed to just RDF triples. . . I'm using DAML in very simple ways at the moment (not even comparable in a number of ways to RDF Schema), but the number of constructs processed is growing as I need them and the system seems fairly useful already. Comments, suggestions, questions, discussion are all welcome..." See "DARPA Agent Mark Up Language (DAML)."
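The flow described in this entry (reorder each triple into a predicate-first fact, then let forward-chaining rules derive new facts) can be illustrated with a minimal Python sketch. It is not DAMLJessKB or Jess code: the `forward_chain` function, the two rules, and the sample vocabulary are assumptions chosen for illustration. Only the `(PropertyValue <predicate> <subject> <object>)` fact shape comes from the entry.

```python
# Hypothetical illustration: facts are tuples shaped like
# (PropertyValue <predicate> <subject> <object>), as in the entry above.
# The two rules are a tiny illustrative subset, not the DAML rule set.

def forward_chain(facts):
    """Apply two simple rules repeatedly until no new facts appear."""
    known = set(facts)
    while True:
        new = set()
        for (_, p1, s1, o1) in known:
            for (_, p2, s2, o2) in known:
                # Rule 1: subClassOf is transitive.
                if p1 == "subClassOf" and p2 == "subClassOf" and o1 == s2:
                    new.add(("PropertyValue", "subClassOf", s1, o2))
                # Rule 2: an instance of a class is an instance of its superclasses.
                if p1 == "type" and p2 == "subClassOf" and o1 == s2:
                    new.add(("PropertyValue", "type", s1, o2))
        if new <= known:
            return known
        known |= new

# (subject, predicate, object) triples, as an RDF parser might emit them.
triples = [
    ("Rex", "type", "Dog"),
    ("Dog", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
]

# Reorder each triple into the predicate-first fact form before asserting it.
facts = [("PropertyValue", p, s, o) for (s, p, o) in triples]
closure = forward_chain(facts)

print(("PropertyValue", "type", "Rex", "Animal") in closure)
```

Because every fact has the same flat shape, a new relation about any subject can be asserted at any time without fitting a predeclared template, which is the trade-off the entry raises against named slots.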

  • [June 26, 2001] "An Axiomatic Semantics for RDF, RDF-S, and DAML+OIL." By Richard Fikes and Deborah L. McGuinness. (Knowledge Systems Laboratory, Computer Science Department, Stanford University). March 1, 2001. "This document provides an axiomatization for the Resource Description Framework (RDF), RDF Schema (RDF-S), and DAML+OIL by specifying a mapping of a set of descriptions in any one of these languages into a logical theory expressed in first-order predicate calculus. The basic claim of this paper is that the logical theory produced by the mapping specified herein of a set of such descriptions is logically equivalent to the intended meaning of that set of descriptions. Providing a means of translating RDF, RDF-S, and DAML+OIL descriptions into a first-order predicate calculus logical theory not only specifies the intended meaning of the descriptions, but also produces a representation of the descriptions from which inferences can automatically be made using traditional automatic theorem provers and problem solvers. For example, the DAML+OIL axioms enable a reasoner to infer from the two statements 'Class Male and class Female are disjointWith.' and 'John is type Male.' that the statement 'John is type Female.' is false. The mapping into predicate calculus consists of a simple rule for translating RDF statements into first-order relational sentences and a set of first-order logic axioms that restrict the allowable interpretations of the non-logical symbols (i.e., relations, functions, and constants) in each language. Since RDF-S and DAML+OIL are both vocabularies of non-logical symbols added to RDF, the translation of RDF statements is sufficient for translating RDF-S and DAML+OIL as well. The axioms are written in ANSI Knowledge Interchange Format (KIF), which is a proposed ANSI standard. 
The axioms use standard first-order logic constructs plus KIF-specific relations and functions dealing with lists.[1] Lists as objects in the domain of discourse are needed in order to axiomatize RDF containers and the DAML+OIL properties dealing with cardinality..." See "DARPA Agent Mark Up Language (DAML)."
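The disjointness example quoted in this entry can be illustrated with a minimal Python sketch. This is not the KIF axiomatization and not a theorem prover: the `contradicts` function and the triple layout are assumptions made for illustration, covering only the single inference pattern mentioned in the abstract.

```python
# Hypothetical sketch of the disjointness inference quoted above.
# Statements are (subject, predicate, object) triples; the function name
# and data layout are assumptions, not part of the axiomatization.

def contradicts(statements, candidate):
    """Return True if `candidate` (a type statement) conflicts with a
    known type statement through a disjointWith declaration."""
    subject, predicate, cls = candidate
    if predicate != "type":
        return False
    disjoint = set()
    for s, p, o in statements:
        if p == "disjointWith":
            # Disjointness is symmetric, so record both directions.
            disjoint.add((s, o))
            disjoint.add((o, s))
    known_types = {o for s, p, o in statements if s == subject and p == "type"}
    return any((known, cls) in disjoint for known in known_types)

statements = [
    ("Male", "disjointWith", "Female"),
    ("John", "type", "Male"),
]

print(contradicts(statements, ("John", "type", "Female")))
print(contradicts(statements, ("John", "type", "Male")))
```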

  • [June 26, 2001] "Department of Defense Adopts StarOffice." By Peter Galli. In eWEEK (June 25, 2001). "In a significant win for open source desktop productivity suites, Sun Microsystems Inc. today announced that the U.S. Defense Information Systems Agency (DISA) would implement up to 25,000 units of its StarOffice 5.2 software. StarOffice, Sun's open source productivity application suite that includes word processing, spreadsheets, presentations, and database applications for the Solaris, Windows and Linux platforms, would replace Applix on more than 10,000 of DISA's Unix workstations at 600 client organizations worldwide, said Susan Grabau, the product line manager for StarOffice. DISA has already begun implementing StarOffice as the automation Unix desktop solution for its Global Command and Control System, she said. The deal had not cost DISA anything as there was no license fee associated with StarOffice, and the federal government already had extensive support contracts with Sun which would cover this implementation, she said...Sun is also on track to release StarOffice 6 later this year. Iyer Venkatesan, the senior product manager for StarOffice, told eWeek in late April that StarOffice 6 would include the recently finalized XML file format specifications, which would make file sharing far easier. 'Files will now be able to be saved in either an XML format or in the current binary format. This lets users easily share information across applications, and will simplify the importing and exporting of files from different programs while greatly improving file sharing and readability,' he said." See references in "StarOffice XML File Format."

  • [June 26, 2001] "An Introduction to XQuery. A look at the W3C's proposed standard for an XML query language." By Howard Katz (Fatdog Software). From IBM developerWorks. June 2001. ['Howard Katz introduces the W3C's XQuery specification, currently winding its way toward Recommendation status after emerging from a long incubation period behind closed doors. The complex specification consists of six separate working drafts, with more to come. This article provides some background history, a road map into the documentation, and an overview of some of the technical issues involved in the specification. A sidebar takes a quick look at some key features of XQuery's surface syntax. Code samples demonstrate the difference between XQuery and XQueryX and show examples of the surface syntax.'] "The W3C's XQuery specification has been in the works for a long time. The initial query language workshop that kicked things off was hosted by the W3C in Boston in December 1998. Invited representatives from industry, academia, and the research community at the workshop had an opportunity to present their views on the features and requirements they considered important in a query language for XML. The 66 presentations, which are all available online, came mainly from members of two very distinct constituencies: those working primarily in the domain of XML as-document (largely reflecting XML's original roots in SGML), and those working with XML as-data -- the latter largely reflecting XML's ever-increasing presence in the middleware realm, front-ending traditional relational databases. The working group is large by W3C standards (I'm told that only the Protocol Working Group has a larger membership). Its composition of some 30-odd member companies reflects the views of both constituencies. What's now starting to coalesce into final form is an XML query language standard that very ably manages to represent the needs and perspectives of both communities. 
The key component of XQuery that will be most familiar to XML users is XPath, itself a W3C specification. A solitary XPath location path standing on its own (//book/editor meaning 'find all book editors in the current collection') is perfectly valid XQuery. On the data side, XQuery's SQL-like appearance and capabilities will be both welcome and familiar to those coming in from the relational side of the world..." See references in (1) "XML Syntax for XQuery 1.0 (XQueryX) Published as W3C Working Draft" and (2) "XML and Query Languages."
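The location path `//book/editor` mentioned in this entry can be tried with a minimal Python sketch. It uses the limited XPath subset in Python's standard `xml.etree.ElementTree` module rather than an XQuery engine, so it illustrates only the path expression. The sample document and its names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample collection; only some books have an editor.
SAMPLE = """
<bib>
  <book><title>First</title><editor>A. Editor</editor></book>
  <book><title>Second</title><author>B. Author</author></book>
  <shelf><book><title>Third</title><editor>C. Editor</editor></book></shelf>
</bib>
"""

root = ET.fromstring(SAMPLE.strip())

# ".//book/editor" is ElementTree's spelling of the path //book/editor
# evaluated relative to the root element: every editor child of any book.
editors = [e.text for e in root.findall(".//book/editor")]

print(editors)
```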

  • [June 26, 2001] "Users Seek Web Services Clarity." By Jack McCarthy, Tom Sullivan, Eugene Grygo, and Cathleen Moore. In InfoWorld (June 22, 2001). "While industry vendors climb over one another to get to the top of the Web services heap, users are opting for caution until critical technology and business issues are resolved. Concerns about hazy pricing and potential interoperability problems have surfaced as vendors dash to differentiate themselves in the standards race. But far and away the biggest question looms over security... Stumbling blocks or not, major vendors this month plugged Web services and plowed full-steam ahead with initiatives. This week, Microsoft heralded the second beta of Visual Studio.NET and the .NET Framework, its tools for building Web services. Microsoft Chairman Bill Gates described Visual Studio.NET as the centerpiece development product of the .NET strategy. Sun Microsystems recently unveiled Sun Open Net Environment and this week teamed with Oracle to offer a kit for moving Windows code, data, and applications to Java 2 Enterprise Edition (J2EE). Lotus Development embraced the model by unveiling Workflow 3.0 to offer a graphical system for managing business processes that integrate with standards-compliant Web-based applications. Debuting at this week's DevCon show, the Workflow upgrade includes support for Java APIs, XML, and other standards, allowing developers to easily build Internet-based workflow applications. Available this fall, Workflow will also offer Lotus Sametime instant messaging and support for Linux. Lotus parent company IBM and Hewlett-Packard are also on board with the WebSphere application server and Core Services Framework, respectively. Analysts say that momentum is building but that users have time to sort through the hype and discover how Web services can benefit them... 
Behind the growing interest in Web services are the promises of cost savings in application development as well as more powerful e-business interactions when business processes are exposed. The model has already attracted many enterprises to set up limited systems as they wait for Web services to evolve. Ahead of the curve, Dollar Rent A Car Systems, based in Tulsa, Okla., has been one of the early adopters of Web services. The company set up a link from Southwest Airlines' Web site to Dollar's reservation system using Microsoft's SOAP (Simple Object Access Protocol) Toolkit and a Windows 2000 Server. Visitors can now rent a car from Dollar without leaving Southwest's site... The standards debate remains another unresolved Web services issue. XML, UDDI (Universal Description, Discovery, and Integration), SOAP, and WSDL (Web Services Description Language) 'are the Four Horsemen of Web services; everybody loves them,' said Dana Gardner, an analyst at Aberdeen Group in Boston. But the evolution of the standards will parallel what has occurred in other technologies in that 'there will be less agreement as people look for differentiation,' Gardner added."

  • [June 25, 2001] "E-book Project Highlights Role of DOI in Selling Digital Content." By Mark Walter. In Seybold Report: Analyzing Publishing Technology [ISSN: 1533-9211] Volume 1, Number 6 (June 18, 2001), pages 8-12. ['As standards for numbering and metadata come into focus, the "bar code for digital content" will be grease for the e-commerce distribution chain driving sales of digital goods.'] "The work of proving the viability of the DOI in a commercial e-book setting is the task of the DOI-EB initiative, and the first fruits of this pioneering project were unveiled late last month at a panel at the Book Expo America show in Chicago... The importance of the DOI-EB project is that it creates a public demo that helps educate the community about how DOIs can operate in the context of e-books. At the same time, it fleshes out the improvements that must be made to its system for all parties -- publishers, retailers and distributors -- to adopt the DOI. The project also has implications for fledgling efforts to get cross-vendor compatibility in digital rights management (DRM). Of the three areas where DRM vendors agree standardization can occur in the near term -- numbering, metadata and rights language -- the DOI-EB project identifies answers to two of those. Bob Bolick, the director of new business development at McGraw-Hill Professional and a leading participant in several of the DRM working groups, pointed out the importance of the project in that context: 'Because an identifier standard and metadata standard are key to achieving some form of interoperability among digital rights management systems and e-book formats, this DOI-EB work strikes me as one of the most important standards efforts occurring in our industry this year.' The DOI Foundation has also announced its support for efforts to extend the original Indecs work to rights-related terms... 
it's important to look at how the DOI dovetails with other standards efforts and technology developments taking place on the Net. Here, too, we see encouraging signs. Norman Paskin, the IDF's director, has been an active ambassador for the DOI, acting as a liaison with rights-management committees in the Internet Engineering Task Force, WorldWide Web Consortium and MPEG. DOIs already can be expressed as Universal Resource Names (URNs), the IETF's syntax for generic resources, and the DOI is compatible with OpenURL, a proposed syntax for embedding parameters -- such as identifiers and metadata -- into hyperlinks. Paskin said he expects the DOI to shortly issue the DOI Namespace, its data dictionary for about 800 metadata elements spanning e-books, journals and audio and video material. The DOI has also started a working group to develop a services definition interface that would make DOI services available to a variety of Web-enabled systems. In short, although DOI was initiated by commercial publishers to help them sell intellectual property, its implementation is being carefully crafted to complement other standards and technology developments being developed for the Web and Internet at large." See also: "XML and Digital Rights Management (DRM)."

  • [June 25, 2001] "Government Data Standards Catalogue. Volume 1 - General Principles." By [UK] e-Government Interoperability Framework (e-GIF). Issue: 0.5 For Public Consultation, 15/05/01. "The [UK] e-Government Interoperability Framework (e-GIF) mandates the adoption of XML and the development of XML schemas as the cornerstone of the government interoperability and integration strategy. A key element in the development of XML schemas is an agreed set of data standards. The Government Data Standards Catalogue sets out the rationale, approach and rules for setting and agreeing the set of Government Data Standards (GDS) to be used in the schemas and other interchange processes. It also contains the standards agreed to date. These standards are also recommended for data storage at the business level. The Catalogue comprises 3 volumes: Volume 1 sets out the general principles, i.e., the rationale, approach and rules for setting standards; Volume 2 sets out the Data Types standards; Volume 3 sets out the Data Items standards." References: see "e-Government Interoperability Framework (e-GIF)."

  • [June 25, 2001] "Government Data Standards Catalogue. Volume 2 - Data Types Standards." By [UK] e-Government Interoperability Framework (e-GIF). Issue: 0.5 For Public Consultation, 15/05/01. See previous entry for description. Data Types Examples: "Amount Sterling, BS7666 Address, Date, E-Mail Address, Forename, Individual Full Name, International Postal Address, Name Suffix, Postcode, Requested Name, Surname, Time, Title, UK Postal Address, UK Telephone Number." Volume 3 is not yet published (2001-06-25). References: see "e-Government Interoperability Framework (e-GIF)."

  • [June 25, 2001] "OASIS Security Services Technical Committee Glossary." Reference: 'draft-sstc-ftf3-glossary-00' (incorporates draft-sstc-glossary-00). 20-June-2001. 23 pages. "This document comprises an overall glossary for the OASIS Security Services Technical Committee (SSTC) and its subgroups. Individual SSTC documents and/or subgroup documents may either reference this document and/or import select subsets of terms. The sources for the terms and definitions herein are referenced in Appendix A. Please refer to those sources for definitions of terms not explicitly defined here. Where possible and convenient, hypertext links directly to definitions within the aforementioned sources are included. Some definitions are quoted directly from the sources, some are modified to fit the context of the OASIS SSTC (aka SAML) effort..." See (1) "Draft Documents Available for the Security Assertion Markup Language (SAML)" and (2) "Security Assertion Markup Language (SAML)."

  • [June 23, 2001] "Requirements for a Rights Data Dictionary and Rights Expression Language." In response to ISO/IEC JTC1/SC29/WG11 N4044: 'Reissue of the Call for Requirements for a Rights Data Dictionary and a Rights Expression Language -- MPEG-21, March 2001.' By [David Parrott] Reuters. 1 June 2001. Version 1.0. 62 pages. "This document describes Reuters requirements for a Rights Expression Language and Rights Data Dictionary (RDD-REL) in response to the call for requirements made by the MPEG-21 Requirements Committee... Digital Rights Management has for some time been closely linked with the technique of encrypting data files and managing the distribution and application of cryptographic keys in order to limit who can access the content and the manner in which access can take place. That technique is more appropriately labelled 'Digital Rights Enforcement' since it is more about enforcing rights than specifying and managing them. Moreover, even when enforcement is the goal, one might consider a whole array of implementation techniques which may or may not rely on encryption technology. In truth, the management of rights in the digital domain is far wider than the rather restrictive case outlined above. Rights (and obligations) management touches on numerous areas close to the hearts of many companies dealing in intellectual property (IP). Laying enforcement issues to one side, the value cannot be understated of simply being able to describe in a machine readable, standard format, the requirements of an IP owner on all other participants in the value chain. Those requirements can be described, broadly, as Rights and Obligations...A basic requirement for Rights and Obligations management systems to be successful is the ability to communicate Rights and Obligations in a standard form. Machine-readability is key to the dynamic specification of electronic contracts which is, in turn, critical to the dynamic construction of value-chains. 
A single Rights Expression Language should be common to all aspects of commercial activity. In that way alone, straight through rules processing is made possible. Rights and obligations can be created by different participants in the value-chain and layered upon each other. Data from different sources can be mixed freely without compromising the IP Rights of any of the rights holders. At the same time, the rights of individuals and downstream recipients of content must be protected..." Document source: see the posting from David Parrott (Reuters Limited) of 21-June-2001 to the XACML TC list, and the .ZIP file. Note relevant to the XACML TC discussion: "... I am forwarding FYI Reuters response to MPEG-21's call for requirements for their Rights Data Dictionary and Rights Expression Language. A key point to note is that the response describes a number of features to be included in MPEG's rights expression language that overlap with many of the "differences" I recently heard Simon list between XACML and DRM. These include: (1) fine granularity; (2) the use of rights expressions as policies to drive all manner of enforcement implementations (e.g., file system access, database access, services such as CORBA access, etc); (3) dynamically changing rights (not limited to static objects); (4) predicating rights of access on complex contextual information. There are many others. It would be useful to get people's thoughts on just how close the XACML and MPEG-21 activities are likely to become..." For background, see: (1) "MPEG Rights Expression Language (REL)"; (2) "Extensible Access Control Markup Language (XACML)"; and (3) "XML and Digital Rights Management (DRM)."

  • [June 23, 2001] "Digital Rights Management (DRM) Architectures." By Renato Iannella (Chief Scientist, IPR Systems). In D-Lib Magazine [ISSN: 1082-9873] Volume 7, Number 6 (June 2001). "Digital Rights Management poses one of the greatest challenges for content communities in this digital age. Traditional rights management of physical materials benefited from the materials' physicality as this provided some barrier to unauthorized exploitation of content. However, today we already see serious breaches of copyright law because of the ease with which digital files can be copied and transmitted. Previously, Digital Rights Management (DRM) focused on security and encryption as a means of solving the issue of unauthorized copying, that is, lock the content and limit its distribution to only those who pay. This was the first-generation of DRM, and it represented a substantial narrowing of the real and broader capabilities of DRM. The second-generation of DRM covers the description, identification, trading, protection, monitoring and tracking of all forms of rights usages over both tangible and intangible assets including management of rights holders relationships. Additionally, it is important to note that DRM is the 'digital management of rights' and not the 'management of digital rights'. That is, DRM manages all rights, not only the rights applicable to permissions over digital content. In designing and implementing DRM systems, there are two critical architectures to consider. The first is the Functional Architecture, which covers the high-level modules or components of the DRM system that together provide an end-to-end management of rights. The second critical architecture is the Information Architecture, which covers the modeling of the entities within a DRM system as well as their relationships. 
(There are many other architectural layers that also need to be considered, such as the Conceptual, Module, Execution, and Code layers, but these architectures will not be discussed in this article.) This article discusses the Functional and Information Architecture domains and provides a summary of the current state of DRM technologies and information architectures... For an example of a rights language, see the Open Digital Rights Language. ODRL lists the many potential terms for permissions, constraints, and obligations as well as the rights holder agreements. As such terms may vary across sectors, rights languages should be modeled to allow the terms to be managed via a Data Dictionary and expressed via the language... Second generation DRM software is now providing some of the Architectures described in this article in deployed solutions. A typical example from the E-book sector is the OzAuthors online ebook store. OzAuthors is a service provided by the Australian Society of Authors in a joint venture with IPR Systems. Their goal is to provide an easy way for Society members (including Authors and Publishers) to provide their content (ebooks) to the market place at low cost and with maximum royalties to content owners [example]... All of this information is encoded in XML using the ODRL rights language. This encoding will enable the exchange of information with other ebook vendors who support the same language semantics, and will set the stage for complete and automatic interoperability... DRM standardization is now occurring in a number of open organizations. The OpenEBook Forum and the MPEG group are leading the charge for the ebook and multimedia sectors. The Internet Engineering Task Force [IETF] has also commenced work on lower level DRM issues, and the World Wide Web Consortium held a DRM workshop recently. 
Their work will be important for the entire DRM sector, and it is also important that all communities be heard during these standardization processes in industry and sector-neutral standards organizations." See: "XML and Digital Rights Management (DRM)."

  • [June 23, 2001] "A Digital Object Approach to Interoperable Rights Management: Finely-grained Policy Enforcement Enabled by a Digital Object Infrastructure." By John S. Erickson (Hewlett-Packard Laboratories). In D-Lib Magazine [ISSN: 1082-9873] Volume 7, Number 6 (June 2001). "This article builds upon previous work in the areas of access control for digital information objects; models for cross-organizational authentication and access control; DOI-based applications and services; and ongoing efforts to establish interoperability mechanisms for digital rights management (DRM) technologies (e.g., eBooks). It also serves as a follow-up to my April 2001 D-Lib Magazine article, where I argued that the introduction of additional levels of abstraction (or logical descriptions) above the current generation of DRM technologies could facilitate various levels of interoperability and new service capabilities. Here I advocate encapsulating data structures of heterogeneous information items as digital objects and providing them with a uniform service interface. I suggest adopting a generic information object services layer on top of existing, interoperable protocol stacks. I also argue that a uniform digital object services layer properly rests above existing layers for remote method invocation, including IIOP, XML-RPC or SOAP. Many of the components suggested within this article are not new. What I believe is new is the call for an identifiable information object services layer, the identification of an application layer above it, and the clear mapping of an acceptable cross-organizational authentication and access control model onto digital object services... One aspect of the previously missing infrastructure was an object serialization, or structured storage, model that could be readily adopted across applications and platforms. We now have that model with the emergence of XML. 
In general, an advantage that data models with explicit structure have is that they naturally accommodate mechanisms for binding policy expressions to structural sub-trees within the information object hierarchies they represent. My focus here is on fine-grained policy expression and enforcement. Or, perhaps more accurately, policy expression at an appropriate level of granularity, since it is clear that not all object behaviors may require uniquely expressed policies. Generally, policy expression concerns the creation of tuples relating subjects, objects and actions, where in this context a 'subject' can be (loosely) thought of as a requestor for a service, an 'object' as a specific service (or behavior) of an information object, and an 'action' as some permissible action..." See: "XML and Digital Rights Management (DRM)."
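The notion of policy expression as tuples relating subjects, objects, and actions can be illustrated with a minimal Python sketch. It is not drawn from the article's own model: the `Policy` type, the `is_permitted` function, and the sample subjects and service names are assumptions made for illustration.

```python
from typing import NamedTuple

class Policy(NamedTuple):
    """One hypothetical policy tuple, following the entry's terminology."""
    subject: str  # the requestor for a service
    obj: str      # a specific service (behavior) of an information object
    action: str   # a permissible action

# Policies bound to different sub-parts of one information object, to show
# the fine-grained binding to structural sub-trees described above.
POLICIES = {
    Policy("subscriber", "report/summary", "view"),
    Policy("subscriber", "report/full-text", "view"),
    Policy("editor", "report/full-text", "annotate"),
}

def is_permitted(subject, obj, action, policies=POLICIES):
    """Permit a request only if a matching tuple has been expressed."""
    return Policy(subject, obj, action) in policies

print(is_permitted("editor", "report/full-text", "annotate"))
print(is_permitted("subscriber", "report/full-text", "annotate"))
```

A deny-by-default lookup is one design choice among several; a real system would also need the authentication and cross-organizational layers the article discusses.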

  • [June 22, 2001] "CTO Forum: Ballmer Pushes .NET, XML for Web Services." By Matt Berger. In InfoWorld (June 22, 2001). "Speaking to a room full of chief technology officers and other industry executives about Microsoft's new vision for building and delivering its software, Chief Executive Officer Steve Ballmer attempted to explain why the company's .NET initiative and all the software products built around it will enable the next generation of business and the Internet. Software will no longer be packaged and sold to customers on a CD, and applications will no longer be static programs that sit on a desktop or run off of a server, Ballmer said during a speech Thursday at the InfoWorld CTO Forum here. Instead, he said, they will be delivered over the Internet as services that allow customers to interact with them dynamically... Using many of the same phrases from earlier presentations on the subject, Ballmer called XML the 'lingua franca of the Internet,' saying it will drive the evolution of the Internet and Web services. 'This is the XML Revolution,' he said. 'I think this will be as big or even bigger than any revolution that preceded it.' This is why XML lies at the heart of Microsoft's .NET initiative, Ballmer said, adding that Microsoft has begun to incorporate support for XML in every part of its product line, from servers to desktop software to development tools, and the company is trying to convince partners, customers, and developers to do the same. It signals a new strategy from Microsoft that it is betting all of its chips on XML as the standard for developing its software to deliver new applications and Web services, said Steve Jurvetson, managing director of Silicon Valley venture firm Draper Fisher Jurvetson, who attended the event.
Microsoft's decision to embrace XML, as well as support from other parts of the software industry, will pay off in the long run, said Tim Bray, the co-inventor of XML, who attended the CTO Forum as a representative of his new company Antarcti.ca Systems... XML is built into Microsoft's forthcoming Windows XP operating system, for example. The latest release of its Office productivity suite, Office XP, also incorporates hints of how Microsoft plans to use XML, such as its Smart Tags function, which delivers information from the Web via hyperlinks within applications. The company has also made XML an integral part of its Visual Studio.NET developer products and the .NET Framework. Microsoft delivered beta 2 versions of both of those products to developers this week at its TechEd conference in Atlanta..."

  • [June 22, 2001] Security Assertions Markup Language. Core Assertion Architecture. Version 09. 20-June-2001. Edited by P. Hallam-Baker. Contributions by Phillip Hallam-Baker, Tim Moses, Bob Morgan, Carlisle Adams, Charles Knouse, David Orchard, Eve Maler, Irving Reid, Jeff Hodges, Marlena Erdos, Nigel Edwards, and Prateek Mishra. From the OASIS SSTC and SAML work. "This document contains two sections. Section 1 contains the text proposed by the Core Assertions and Protocol group for the Core Assertions section of the SAML. Section 2 contains references to the material cited in the text. SAML specifies several different types of assertion for different purposes, these are: (1) Authentication Assertion: An authentication assertion asserts that the issuer has authenticated the specified subject. (2) Attribute Assertion: An attribute assertion asserts that the specified subject has the specified attribute(s). Attributes may be specified by means of a URI or through an extension schema that defines structured attributes. (3) Decision Assertion: A decision assertion reports the result of the specified authorization request. (4) Authorization Assertion: An authorization assertion asserts that a subject has been granted specific permissions to access one or more resources. The different types of SAML assertion are encoded in a common XML package, which at a minimum consists of: (1) Basic Information: Each assertion must specify a unique identifier that serves as a name for the assertion. In addition an assertion may specify the date and time of issue and the time interval for which the assertion is valid. (2) Claims: The claims made by the assertion. This document describes the use of assertions to make claims for Authorization and Key Delegation applications. In addition an assertion may contain the following additional elements. 
An SAML client is not required to support processing of any element contained in an additional element with the sole exception that an SAML client must reject any assertion containing a 'Conditions' element that is not supported. (3) Conditions: The assertion status may be subject to conditions. The status of the assertion might be dependent on additional information from a validation service. The assertion may be dependent on other assertions being valid. The assertion may only be valid if the relying party is a member of a particular audience. (4) Advice: Assertions may contain additional information as advice. The advice element may be used to specify the assertions that were used to make a policy decision. The SAML assertion package is designed to facilitate reuse in other specifications. For this reason XML elements specific to the management of authentication and authorization data are expressed as claims. Possible additional applications of the assertion package format include management of embedded trust roots [XTASS] and authorization policy information [XACML]..." See: "Security Assertion Markup Language (SAML)."
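
The processing rule quoted above (optional elements may be ignored, except that an unsupported condition forces rejection) can be sketched as follows. This is a minimal sketch under stated assumptions: the `Assertion` class, its field names, and the condition names are hypothetical and are not taken from the SAML schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Assertion:
    """Hypothetical in-memory form of the assertion package described above."""
    assertion_id: str                      # basic information: unique identifier
    claims: List[str]                      # the claims made by the assertion
    issue_instant: Optional[str] = None    # optional date and time of issue
    conditions: List[str] = field(default_factory=list)  # names of condition types
    advice: List[str] = field(default_factory=list)      # optional, may be ignored


# Assumption: this client only knows how to evaluate an audience restriction.
SUPPORTED_CONDITIONS = {"Audience"}


def client_accepts(assertion, supported=SUPPORTED_CONDITIONS):
    """Reject any assertion carrying a condition the client does not support.

    Advice is deliberately not inspected: the client may ignore it.
    """
    unsupported = [c for c in assertion.conditions if c not in supported]
    return not unsupported
```

The asymmetry is the point of the rule: ignoring advice is harmless, whereas ignoring a condition could make the client rely on an assertion that its issuer considered valid only in a narrower context.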

  • [June 22, 2001] "Shibboleth Specification." DRAFT v1.0. Shibboleth Working Group Specification Document. 'draft-internet2-shibboleth-specification-00'. May 25, 2001. "This document provides the specifications for the Shibboleth system, including interfaces, message specifications, etc. This document should define Shibboleth in sufficient detail that (1) someone can implement the system without having to guess or interpret what was intended, and (2) separate but compliant implementations are guaranteed to interoperate.... The Shibboleth Model differs from the SAML model in several key ways. It can be described as: (1) The SHIRE uses the WAYF Service to locate the User Home Organization. The WAYF produces a BLAH. (2) The SHIRE will send an Attribute Query Handle Request to the Handle Service (HS) to obtain a reference to the user. The HS will use the local web authentication mechanism to authenticate the browser user. However, instead of generating a Name Assertion, the HS will generate an attribute query handle (AQH - an opaque user handle), and return it in an Attribute Query Handle Response. Only the Attribute Authority will be able to map the AQH to a specific user. (3) The SHAR will send an Attribute Query Message to the Attribute Authority. The SHAR cannot ask for specific attributes; rather, the query should be understood to mean 'give me all the attributes you can for this user for this target'. The Attribute Authority will return an Attribute Query Response, containing assertions for all of the attributes it is authorized to release for this target. The Attribute Authority will likely obtain the attributes from the origin site's pre-existing Attribute Repository (e.g., Directory). (4) The Resource Manager will make an access decision, based on the supplied attributes, the target resource, and the requested operation. It will then either grant or deny access. It will not produce an Authorization Decision Assertion..."
See the "Definition and explanation of SHAR/AA attribute request and response messages" with W3C XML Schema: "This document describes possible XML message formats for Shibboleth attribute request and response messages passed directly or indirectly between the SHAR and AA components of the Shibboleth architecture. The formats are expressed in the XML Schema Definition language." ["Shibboleth, a project of MACE (Middleware Architecture Committee for Education), is investigating technology to support inter-institutional authentication and authorization for access to Web pages. Our intent is to support, as much as possible, the heterogeneous security systems in use on campuses today, rather than mandating use of particular schemes like Kerberos or X.509-based PKI. The project will produce an architectural analysis of the issues involved in providing such inter-institutional services, given current campus realities; it will also produce a pilot implementation to demonstrate the concepts."]

  • [June 22, 2001] Network Data Management - Usage (NDM-U) For IP-Based Services. Version 2.5. April 12, 2001. 62 pages. Chief Editor: Steve Cotton (Cotton Management Consulting). "This document, in conjunction with the referenced Service Definition documents, is intended to specify technical information that is sufficient for practical implementations of interchange of usage data among service elements participating in the delivery of IP-based services, either within a single enterprise or across multiple enterprises. The IPDR organization intends to submit this specification to selected accredited organizations for consideration as an approved standard. This specification is divided into three major chapters: (1) IPDR Reference Model - a definition of the abstract and operational relationships between entities involved in the generation, recording, storage, transport, and processing of usage attributes. (2) Business Requirements - a definition of business requirements to be addressed by the protocol specification and specific scenarios for the major process flows anticipated in actual application. (3) Protocol - the notation, data unit syntax, and dynamic procedures involved in the operation of the interfaces specified in the reference model. IPDR stands for the Internet Protocol Detail Record; the name comes from the traditional telecom term CDR (Call Detail Record), used to record information about usage activity within the telecom infrastructure (such as a call completion). NDM-U stands for Network Data Management - Usage. It refers to a functional operation within the Telecom Management Forum's Telecom Operations Map. The NDM function collects data from devices and services in a service provider's network. Usage refers to the type of data which is the focus of this document. Introduced in NDM-U 2.0, Service Specifications define the fields that should be present in IPDRDocs for each class of service.
For example, the usage data captured for a Voice over IP call is very different from a query made to a Content-hosting Application Service Provider, so each requires its own Service Specification. The formal definition language for Service Specifications is XML DTDs [and XML Schema]. Service Specifications are updated or inaugurated to reflect changes in industry practice and new-generation capabilities that can roll out every month in the Internet world. Version 2 summary: "This revision introduces a major upgrade of the syntax notation of the protocol, namely XML Schema versus XML 1.0. This upgrade has been introduced to allow the protocol to specify strong typing of the usage attributes, thus conforming to the business requirements for data integrity. In addition, the dynamic operation of IPDR document transport has been specified, using the consensus choice for best conforming to business requirements, Simple Object Access Protocol (SOAP). Finally, the usage attributes for each of the services defined in the Business Requirements chapter are now formally specified, using the XML Schema definition supplied in the Protocol chapter." See also the [extracted] XML schema, perhaps also online as a separate document. References: "IPDR.org Network Data Management Usage Specification."

  • [June 22, 2001] "Soapbox: Why XML Schema beats DTDs hands-down for data. A look at some data features of XML Schema" By Kevin Williams (Chief XML Architect, Equient - a division of Veridian). From IBM developerWorks. June 2001. ['In his turn on the Soapbox, info-management developer and author Kevin Williams tells why he's sold on XML Schema for the structural definition of XML documents for data. He looks at four features of XML Schema that are particularly suited to data representation, and he shows some examples of each. Code samples include XSD schemas and schema fragments.'] "As you're no doubt aware, the W3C recently promoted the XML Schema specification to Recommendation status, making that spec the XML structural definition language of choice. While most people find the specifications a little hard to read, the jargon conceals a very strong set of features, especially for those of us who are designing XML structures for data. I'd like to take a look at a few of those features. Strong typing is probably the biggest advantage XML Schema has over DTDs, and it is the aspect of XML Schema you've heard the most about. In a DTD, you don't have a whole lot of choices for constraining the allowable content of your elements and attributes... [Conclusion:] I've taken a brief look at some aspects of XML Schema that make schemas much better than DTDs for the definition of XML structures for data. While DTDs are likely to be around for a while yet (there are plenty of legacy documents that still rely on them for their structural definition), support for XML Schema is quickly being implemented for all the major XML software offerings. In the following months, I'll take a look at some of the ideas I've laid out here in greater depth in my forthcoming column." Article also in PDF format. For schema description and references, see "XML Schemas."

  • [June 21, 2001] "Progressing the UN/CEFACT e-Business Standards Development Strategy." From United Nations Centre for Trade Facilitation And Electronic Business (UN/CEFACT). UN/CEFACT Steering Group (CSG) E-Business Team. General CSG eBTeam/2001/EBT0001 16-June-2001. "The UN/CEFACT Plenary endorsed the proposed strategy for achieving its e-Business vision at its March 2001 meeting. Subsequently, the UN/CEFACT Steering Group (CSG) and OASIS announced the successful completion of the development stage of ebXML and reached an agreement for the allocation of responsibility for maintenance and further development of ebXML specifications. Under the agreement, UN/CEFACT will be responsible for Business Processes and Core Components. OASIS will be responsible for maintaining and advancing a series of technical specifications. Jointly, UN/CEFACT and OASIS will be responsible for marketing and developing the technical architecture specification. The CSG believe the most effective way forward is to bring together the expertise and resources of the UN/EDIFACT Working Group (EWG), the Business Process Analysis Working Group (BPAWG), the Codes Working Group (CDWG), and the Business Process and Core Component work from the ebXML initiative. The result is the consolidation of all these efforts into a new Working Group, the e-Business Working Group, that will be able to address the needs of all its users. This initiative will require considerable planning and consultation if it is to achieve its objectives within the projected time scale. To lead this process, the CSG has established a special e-Business Team to undertake the initial coordination and development work. This paper is the first deliverable of the e-Business Team. It provides the description and responsibilities of the new e-Business Working Group. ['This paper is intended to provide a notional description of the proposed e-Business Working Group and likely responsibilities.
It is by no means fully inclusive of all requirements that will eventually be identified. It is intended to establish a baseline and context within which meaningful discussion and alternative proposals can be developed. All aspects of the organisation as well as the various duties will be confirmed through the approval of Mandates and Terms of Reference for the e-Business Working Group and each subgroup.'] See the communiqué from Ray Walker, "UN/CEFACT's Proposal for a New Electronic Business Working Group." References: "Electronic Business XML Initiative (ebXML)."

  • [June 21, 2001] "Augmented Metadata in XHTML." Sun Microsystems Working Draft 21-June-2001. Edited by Murray Altheim (Sun Microsystems) and Sean B. Palmer. Draft version for feedback ('work in progress'). Abstract: "This specification describes several minor syntax modifications to XHTML (the XML transformation of HTML) which provide much of the essential functionality required to augment Web pages with metadata as found in published descriptions of the Semantic Web. This augmentation allows Dublin Core metadata, a highly popular standard developed by the library community to be incorporated in Web pages in a way that is compatible with today's Web browsers, and describes a generalized mechanism by which other popular schemas can be used in similar fashion. The metadata can be associated with any XHTML or XML document or document fragment (actually, any addressable resource), internal or external to the document." Detail: "This specification describes three minor modifications to XHTML 1.1 which provide much of the essential functionality required to augment Web pages with schema-characterized metadata, as according to the need expressed in published descriptions of the Semantic Web. Using the extensibility provided by the W3C Recommendation Modularization of XHTML, this specification includes an 'XHTML Augmented Metadata 1.0 DTD' that implements these features. The first two modifications are relatively trivial, in terms of implementation: (1) allow the <meta> element to appear within any block element as metadata about its parent (i.e., any major document component); (2) add an optional href attribute to the <meta> element to allow it to point to any addressable resource. The third modification is to: (3) add a Dublin Core module to XHTML, modifying the content model of the <meta> element to contain its content. 
[From the post to 'www-rdf-interest@w3.org': "I've been hesitant to announce this since it's not quite finished, but since you asked, here's a specification in the works that describes how to incorporate Dublin Core metadata within XHTML, so that Web pages can be harvested for their subject, author, etc. content. How this might occur is described in section 5.5.3. You'll note that this doesn't put RDF of any flavour into a Web page. That couldn't be validated, which is one of the requirements of the project, and in terms of being globally useful, allowing every author in the world to create their own flavour of metadata isn't a particularly compelling need; we all need to agree on using the same "carrier" with a small number of controlled vocabularies. Dublin Core fits this bill as a very popular way of capturing a subset of the kinds of metadata described in things I've read about the Semantic Web. There's also a section on how to work this with topic maps..."] Related references in (1) "Dublin Core Metadata Initiative (DCMI)" and in (2) "XHTML and 'XML-Based' HTML Modules."
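
To illustrate the first two modifications described in this entry (a `<meta>` element inside a block element describing its parent, and an optional `href` pointing at another resource), here is a rough sketch of how a harvester might collect such metadata. The sample markup, the `DC.`-prefixed `name` values, and the `harvest` function are assumptions made for illustration; the draft's actual DTD and harvesting procedure (its section 5.5.3) are not reproduced here.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Assumed sample page: meta elements appear in <head> and inside block elements.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example</title>
    <meta name="DC.creator" content="A. Author"/>
  </head>
  <body>
    <div id="intro">
      <meta name="DC.subject" content="metadata"/>
      <p>Introductory text.</p>
    </div>
    <div>
      <meta name="DC.title" content="An external resource"
            href="http://example.org/report"/>
    </div>
  </body>
</html>"""


def harvest(xhtml_text):
    """Return (about, property, value) triples for every DC-prefixed meta element."""
    root = ET.fromstring(xhtml_text)
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    records = []
    for meta in root.iter("{%s}meta" % XHTML_NS):
        name = meta.get("name", "")
        if not name.startswith("DC."):
            continue
        parent = parent_of[meta]
        if meta.get("href"):
            about = meta.get("href")           # metadata about an addressable resource
        elif parent.get("id"):
            about = "#" + parent.get("id")     # metadata about the enclosing block
        else:
            about = parent.tag.split("}")[-1]  # fall back to the parent's element name
        records.append((about, name, meta.get("content")))
    return records
```

Because the metadata stays in ordinary elements and attributes, the page can still be validated against a DTD, which is the requirement the quoted post contrasts with embedding RDF.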

  • [June 21, 2001] "DAML-S: Semantic Markup For Web Services." By David Martin. 2001-05-23 or later. "The Semantic Web should enable greater access not only to content but also to services on the Web. Users and software agents should be able to discover, invoke, compose, and monitor Web resources offering particular services and having particular properties. As part of the DARPA Agent Markup Language program, we have begun to develop an ontology of services, called DAML-S, that will make these functionalities possible. This white paper describes the overall structure of the ontology, the service profile for advertising services, and the process model for the detailed description of the operation of services. We also compare DAML-S with several industry efforts to define standards for characterizing services on the Web... DAML-S is an attempt to provide an ontology, within the framework of the DARPA Agent Markup Language, for describing Web services. It will enable users and software agents to automatically discover, invoke, compose, and monitor Web resources offering services, under specified constraints. We have released an initial version of DAML-S. It can be found at the URL: http://www.daml.org/services/daml-s. We expect to enhance it in the future in ways that we have indicated in the paper, and in response to users' experience with it. We believe it will help make the Semantic Web a place where people can not only find out information but also get things done." See the document DAML-S 0.5 Draft Release (May 2001): "This directory contains a draft version of the DAML-S language under development by a group of DAML researchers. We encourage feedback from interested parties. DAML-S is a DAML-based Web service ontology, which supplies Web service providers with a core set of markup language constructs for describing the properties and capabilities of their Web services in unambiguous, computer-interpretable form.
DAML-S markup of Web services will facilitate the automation of Web service tasks including automated Web service discovery, execution, composition and interoperation. Following the layered approach to markup language development, the current version of DAML-S builds on top of DAML+OIL (March 2001), and subsequent versions will likely build on top of DAML-L." See "DARPA Agent Mark Up Language (DAML)."

  • [June 20, 2001] "XML Blueberry Requirements." W3C Working Draft 20-June-2001. Edited by John Cowan (Reuters). Latest version URL: http://www.w3.org/TR/xml-blueberry-req. Abstract: "This document lists the design principles and requirements for the Blueberry revision of the XML Recommendation, a limited revision of XML 1.0 being developed by the World Wide Web Consortium's XML Core Working Group solely to address character set issues." Detail: "The W3C's XML 1.0 Recommendation was first issued in 1998, and despite the issuance of many errata culminating in a Second Edition of 2001, has remained (by intention) unchanged with respect to what is well-formed XML and what is not. This stability has been extremely useful for interoperability. However, the Unicode Standard on which XML 1.0 relies has not remained static, evolving from version 2.0 to version 3.1. Characters present in Unicode 3.1 but not in Unicode 2.0 may be used in XML character data, but are not allowed in XML names such as element type names, attribute names, processing instruction targets, and so on. In addition, some characters that should have been permitted in XML names were not, due to oversights and inconsistencies in Unicode 2.0. As a result, fully native-language XML markup is not possible in at least the following languages: Amharic, Burmese, Canadian aboriginal languages, Cantonese (Bopomofo script), Cherokee, Dhivehi, Khmer, Mongolian (traditional script), Oromo, Syriac, Tigre, Yi. In addition, Chinese, Japanese, Korean (Hangul script), and Vietnamese can make use of only a limited subset of their complete character repertoires. In addition, XML 1.0 attempts to adapt to the line-end conventions of various modern operating systems, but discriminates against the convention used on IBM and IBM-compatible mainframes. 
XML 1.0 documents generated on mainframes must either violate the local line-end conventions, or employ otherwise unnecessary translation phases before and after XML parsing and generation. A new XML version, rather than a set of errata to XML 1.0, is being created because the change affects the definition of well-formed documents: XML 1.0 processors must continue to reject documents that contain new characters in XML names or new line-end conventions. It is presumed that the distinction between XML 1.0 and XML Blueberry will be indicated by the XML declaration..." See the 'www-xml-blueberry-comments' mailing list archives and related references in "XML and Unicode."
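
The line-end issue in this entry can be made concrete with a short sketch. The first function follows the XML 1.0 rule that CR LF pairs and lone CRs are normalized to LF. The second is hypothetical: it assumes a revised rule that also accepts NEL (U+0085), the character to which the mainframe newline is usually mapped. The Requirements draft excerpted above does not itself spell out such a rule, so treat both the function and the choice of NEL as illustrative assumptions.

```python
import re


def normalize_xml10(text):
    """XML 1.0 end-of-line handling: CR LF and lone CR both become LF."""
    return re.sub("\r\n?", "\n", text)


def normalize_with_nel(text):
    """Hypothetical extended rule: additionally treat NEL (U+0085) as a line end.

    CR followed by LF or NEL is collapsed to a single LF; the two-character
    alternatives are listed first so they win over the lone-CR branch.
    """
    return re.sub("\r\n|\r\u0085|\r|\u0085", "\n", text)


# Assumed sample: one NEL line end followed by one CR LF line end.
mainframe_text = "<a>\u0085<b/>\r\n</a>"
```

Under the XML 1.0 rule the NEL in `mainframe_text` survives as ordinary character data, which is why mainframe-generated documents need the extra translation phase the draft complains about.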

  • [June 20, 2001] "Microsoft's Ballmer: .NET is About Integration." By Michael Vizard and Mark Jones. In InfoWorld Issue 25 (June 18, 2001), pages 20-22. "As part of an ambitious effort to create an architecture that fosters data and application integration, Microsoft has laid out a broad foundation based on XML technologies that will be marketed under the name of Microsoft.NET. In an interview with InfoWorld Editor in Chief Michael Vizard and West Coast News Editor Mark Jones, Microsoft CEO Steve Ballmer, who will be a keynote speaker at the InfoWorld CTO Forum this week, talks about how he sees this 'bet-the-company' strategy paying off for Microsoft customers and its industry allies. [Q: Why should corporations pay any attention to Microsoft.NET today?] Ballmer: There's a ton of information that is essentially locked in back-office systems today. We want to help [companies] bring that information together in new applications. We want to help them expose the information to the consumer. The way we would propose doing that is to essentially wrap it via XML and then build next-generation applications that pull things together using the XML infrastructure. This is about enterprise application integration. This is about business-to-business. This is about unlocking, getting knowledge of back-office systems to front office. [Q: What's the core business model behind Microsoft.NET?] Ballmer: We will build software, servers, and tools that have .NET- and XML-platform capability built-in, and we will sell those as we sell software today. We will also have a set of services that you should think of as sort of customer-facing as opposed to developer-facing. These will be advanced services for consumers and knowledge workers that use an XML data store that the user has running on the Internet. These additional services on top of that somebody might subscribe to as part of Windows or on top of Windows or on top of Office, etc.
We will also charge developers some [sort of] fixed fee to use our services per year because there's real operational costs in serving a developer. But we don't have any model under consideration that calls for transaction fees and that sort of thing... [Q: What are the major points of difference between you and Sun about the role of XML?] Ballmer: XML is a message format, but it also implies a programming model. I don't send you a Java program that you run. I send you an XML message and you send me back an XML message. Yes, it's a data exchange format, but it is also the backbone for the way you write loosely coupled applications that extend one another and complement one another and work together. I don't think Sun gets that, frankly. Or maybe they do get it but strategically it is inopportune for them to get it..."

  • [June 20, 2001] "Microsoft Fires .NET Arrows at Java." By Tom Sullivan and Ed Scannell. In InfoWorld Issue 25 (June 18, 2001), pages 17-20. "Just two weeks after rival Sun Microsystems and its Java partners ballyhooed Web services at JavaOne, Microsoft will fire back this week with its own salvo. The company will use its TechEd developers conference in Atlanta to extol the advantages of integrating enterprise systems with Web services. Bill Gates, Microsoft chairman and chief software architect, will announce the availability of Visual Studio.NET beta 2. The final version, planned to ship later this year, allows developers to create components with native XML interfaces that can interoperate with other Web services. Microsoft will bolster its heavy bet on XML -- via its .NET initiative -- with a demonstration of Yukon, the forthcoming SQL Server version. Yukon's Web services-applicable features include XML processing and the ability to store XML natively in the database. When it ships, Yukon will support multiple languages within the database via the Common Language Runtime (CLRT) so any language that supports CLRT can be stored in the database... [MS'] Flessner will kick off the conference Monday, explaining how XML Web services solve the enterprise integration problem, whether that is system-to-system integration or feeding data from one application to a variety of Internet access devices. "There is important business value that can be derived from connecting directly to your partners and customers," said Barry Goffe, group manager for Microsoft .NET. The company is determined to outgun rivals IBM and Sun by providing a better implementation of XML-based Web services standards in its .NET tools and servers. With its CLRT woven tightly into Visual Studio.NET, Microsoft believes its technology will have broader appeal than Sun's because developers can write in any language they choose, not just Java. 
Microsoft CEO Steve Ballmer said Microsoft is betting on XML against just one language. 'Java is inadequate and the way that applications will be extended will be by responding to XML messages. It won't be by sending somebody a Java program,' he said. Microsoft's rivals expect it to distort the open standards, notably UDDI (Universal Description, Discovery, and Integration), SOAP (Simple Object Access Protocol), and WSDL (Web Services Description Language)... Other companies are not convinced that the XML-based Web services that any of the vendors are selling offer the best means to build Web services-like functionality. Ameritrade, an online brokerage in Omaha, Neb., is using BEA's Tuxedo at the middleware layer and Java to deliver its brokerage services. 'We are putting components in at the middleware level, as opposed to doing it at the XML level,' said CIO Jim Ditmore..."

  • [June 20, 2001] "RealNetworks Pushes Copyright Initiative." By Melanie Austria Farmer and Jim Hu. In CNET News.com (June 20, 2001). "Streaming-media giant RealNetworks on Wednesday unveiled new technology intended to promote the legal use of copyrighted material over the Web. The company is aiming the software in its RealSystem Media Commerce Suite at media companies and retailers that want to deliver music, movies and other copyrighted material securely over the Web. The software can be tied into existing systems for delivery of digital content. RealNetworks also introduced an initiative to provide a common, open standard--called XMCL, for Extensible Media Commerce Language--that would enable the content to be played on systems from different providers of digital entertainment. Supporters include media and technology notables such as IBM, Napster, InterTrust, Metro-Goldwyn-Mayer, Sony Pictures Digital Entertainment and Sun Microsystems. The moves are likely to heighten the already intense competition between RealNetworks and Microsoft, both of which distribute technology that allows consumers to watch videos or listen to music over the Web... Microsoft countered Wednesday with its own set of announcements. The Redmond, Wash.-based software giant unveiled Microsoft Producer, a system that lets people incorporate Windows Media audio and video technology into their business presentations. In addition, the company said it will begin highlighting how media and entertainment companies such as EMI Recorded Music, Viacom's CBS NewsPath and Lions Gate Entertainment are using its Windows Media digital rights management system... The control of copyrighted materials online falls into the realm of digital rights management, which will play an increasingly important role as online music becomes more popular with consumers. Content producers such as record labels and movie studios have generally acknowledged the Internet as a new way to sell and distribute their works. 
But the lack of safeguards preventing the unwanted dissemination of their works has made content providers more conscious of copyright abuses on the Internet. Thus, many content companies have proceeded slowly, waiting for a sufficient way to secure their works... The XMCL proposal envisions a way for digital content to be played independently of rights management systems and codecs. Codecs are the mathematical codes that compress large audio files into smaller, more usable packages that can be streamed or downloaded over the Web." See: "Extensible Media Commerce Language (XMCL)."

  • [June 20, 2001] "RealNetworks Unveils Digital Rights Standard, Products." By 'Reuters'. In InternetWeek (June 18, 2001). "Media software maker RealNetworks Inc. on Wednesday launched a new product it says will help entertainment conglomerates manage and track the use of their copyrighted material on new online services they plan to soon roll out. RealNetworks also unveiled an initiative to standardize the delivery of content via the Web in a way that is secure and profitable, marking what analysts said was the Seattle-based company's boldest move yet to tackle a main strength of competing technology by cross-town rival Microsoft Corp... Although the more technical of the announcements, Real's standardization initiative could pose a bigger threat to Microsoft, analysts said. Real's proposed standard is called the eXtensible Media Commerce Language, or XMCL, a media-oriented version of the XML (eXtensible Markup Language) standard that companies like Microsoft are betting heavily on to enable a new generation of Web-based services. Just as XML describes different types of data so different computer systems can talk to each other, XMCL would be a common language for describing the rights and rules for a piece of media like a song or a film, Albertson said... The other pillar of Real's strategy is a product called the RealSystem Media Commerce Suite, which will let online music and video stores easily package, sell and deliver their wares to customers, Albertson said..." See: "Extensible Media Commerce Language (XMCL)."

  • [June 20, 2001] "Big Guns Take Aim at Digital Copyright Management." By Sumner Lemon and Stephen Lawson. In InfoWorld (June 20, 2001). "Backed by some of the biggest names in the online entertainment industry, RealNetworks on Wednesday announced the formation of the XMCL (Extensible Media Commerce Language) Initiative. The company said the initiative will define an open XML-based framework for managing rights to digital media, including applications such as purchase, rental, video-on-demand, and subscription services. The list of companies that are backing the XMCL Initiative includes media-industry heavyweights such as Bertelsmann, EMI Group, Metro-Goldwyn-Mayer Studios (MGM), and AOL Time Warner. But Microsoft, which has its own digital-rights management framework built around the Windows Media Format 7 file format, is conspicuously absent from the list. Digital rights management technologies allow copyright holders to control how movies and songs are used and distributed online. Also, the technologies can restrict the number of times a user can play a certain file, or prevent a file from being copied and passed on to other users. XMCL will simplify rights management by letting content providers define business rules in a standard way, RealNetworks said in a statement. Specific details of how XMCL would be implemented were not made available... RealNetworks announced the XMCL Initiative at the same time it launched its RealSystem Media Commerce Suite, a suite of multimedia content applications. The software will eventually support XMCL and give users the ability to choose from a variety of back-end platforms, the company said. RealSystem Media Commerce Suite can be integrated with third-party digital rights management applications, such as flexible rights management software from InterTrust Technologies, the statement said. 
InterTrust, which has filed a patent infringement suit against Microsoft over digital-rights management in Windows Media Player, is a member of the XMCL Initiative..." See the announcement, and XMCL main reference page.

  • [June 20, 2001] "Making RDF Syntax Clear. Proposal of a DTD and minor syntax enhancement to RDF, to overcome many of the current practical difficulties." By Rick Jelliffe (Topologi Pty. Ltd.) 2001-06-20. "The current RDF Recommendation is almost impossible to implement because the discipline of a DTD was not used. Consequently, RDF implementations lack exchangeability, and most people coming to the RDF Spec (from outside the 'RDF Community') expecting clear description of syntax must go away disappointed. Furthermore, the advent of RDFS raises compatibility issues, in that certain elements are used in RDFS, but are only general names in RDF. This proposal suggests that the situation could be improved by: [1] creating a normative DTD for RDF; [2] stating clearly that this DTD (and DTDs that use it) embodies the current RDF exchange XML; [3] reconciling the use of namespaces in RDF with XML Schemas; [4] clarifying RDF's current syntax with standard concepts such as "architectures". I propose that this DTD should be included as a normative part of the RDF specification, and the BNF sections removed or reworded to fit in with it." From the 2001-06-20 posting to 'www-rdf-interest@w3.org': "I have posted to the RDF comments list a proposal for clarifying RDF syntax. This proposal features a new DTD, used to map between RDF documents and a notional XML Schemas schema using xsi:type. I have been working through RDF specifications and examples again recently, and I am even more convinced than ever that getting the basic discipline of the transfer syntax clear is a prerequisite for RDF becoming useful..." See "Resource Description Framework (RDF)."

  • [June 20, 2001] "Simplified XML Syntax for RDF." By Jonathan Borden (Tufts University School of Medicine, The Open Healthcare Group). June 17, 2001 or later. "A simplified XML syntax for RDF is proposed. Its major differences with RDF 1.0 are: (1) namespace = http://www.openhealth.org/RDF/RDFSurfaceSyntax; (2) defined as tree regular expression; (3) attribute aboutQ="ex:name" accepts QName as value indicating subject; (4) attribute resourceQ="ex:value" accepts QName as value indicating object; (5) rdf:parseType="Resource" is default; (6) The subject or object of a statement may be either a URI reference, a qualified name, a quantified variable, another statement or a collection of statements; (7) ?x defines a quantified variable. XML Syntax: The XML syntax for RDF 1.0 can be described in terms of a tree regular expression. This form can be thought of as expressing constraints on the XML Infoset which arises when parsing an RDF document. The advantage of expressing the syntax in this form over EBNF, is that a tree regular expression (e.g., RELAXNG/TREX schema http://relaxng.org) already takes into account the rules of XML syntax + XML namespaces, e.g., correctly handles namespace prefixes, empty elements, mixed content, whitespace, attribute ordering etc. Such schemata are also described as 'hedge regular expressions' or 'hedge automata' [http://www.oasis-open.org/cover/hedgeAutomata.html]. The tree regular expression schema for RDF 1.0 is available [online]. This schema handles several proposed updates such as the requirement that the "rdf:about" and "rdf:ID" attributes be prefixed/qualified. A tree regular expression for the proposed syntax is available [online]..." See: "Resource Description Framework (RDF)" and "RELAX NG."

  • [June 20, 2001] "RELAX NG schema for W3C XML Schema." Prepared by Jeni Tennison. Posted to 'relax-ng-comment@lists.oasis-open.org' on 20-Jun-2001. Comments: "I think that the XML Schema vocabulary is quite a neat showcase for RELAX NG because there are so many co-dependencies between attributes and between attributes and elements. This RELAX NG schema follows the XML Schema for XML Schema to a certain extent (using the same kind of naming scheme) to facilitate comparison between the two. I have also added comments about the ease with which the two handle different aspects of the vocabulary. I've tested it with Jing against various XML Schemas, and it seems to be working, though obviously if anyone spots any bugs please get in touch..." See: "RELAX NG." [cache 2001-06-20]

  • [June 19, 2001] "XML Training Wheels. An XSLT and Java-based tool for producing tutorials -- custom-built for developerWorks but ready to adapt to your own use." By Doug Tidwell (Cyber evangelist, developerWorks). From IBM developerWorks. June 2001. ['See how developerWorks produced a custom XSLT application with Java-based open-source tools that automates the tedious work of producing the developerWorks HTML-based tutorials. Known as the Toot-O-Matic, the tool now is available for any developer either to inspect as an XSLT exemplar or to tailor to your own training needs. Doug Tidwell explains the design goals and the XML document design. He also describes how the 13 code samples demonstrate the techniques used in generating a truckload of HTML panels full of custom graphics, a ZIP file, and two PDF files from a single XML source document.'] "Here at developerWorks, we're pleased to release the source of the Toot-O-Matic, the XML-based tool we use to create our tutorials. In this article, we'll discuss the design decisions we made when we built the tool, talk about how you can use it to write your very own tutorials, and talk a little bit about how the source code is structured. We hope you find the tool useful, and that it will give you some ideas about how to use XML and XSLT style sheets to manipulate structured data in a variety of useful ways... In achieving the final goal of seeing how much we could do with XSLT, the Toot-O-Matic exercises all of the advanced capabilities of XSLT, including multiple input files, multiple output files, and extension functions. 
Through the style sheets, it converts a single XML document into: (1) A web of interlinked HTML documents; (2) A menu for the entire tutorial; (3) A table of contents for each section of the tutorial; (4) JPEG graphics containing the title text of all sections and the tutorial itself; (5) A letter-sized PDF file; (6) An A4-sized PDF file; (7) A ZIP file containing everything a user needs to run the tutorial on their machine... This discussion of the Toot-O-Matic tool illustrates the full range of outputs that you can generate from a single XML file. The structure of our original XML documents enables us to convert flat textual information into a number of different formats, all of which work together to deliver a single piece of content in a variety of interesting and useful ways. Using this tool, we have shortened and streamlined our development process, making it easier, faster, and cheaper to produce our tutorials. Best of all, everything we've described here is based on open standards and works on any Java-enabled platform. The Toot-O-Matic tool shows how a simple, inexpensive development project can deliver significant results." Also available in PDF format. For related resources, see "Extensible Stylesheet Language (XSL/XSLT)." [cache]

  • [June 19, 2001] "Ferrets and Topic Maps: Knowledge Engineering for an Analytical Engine." By James David Mason Ph.D. Reference: Y/WPP-011. Paper presented at XML Europe 2001 (Paris). "The 'Ferret' analytical engine, developed originally by the Y-12 National Security Complex of the U.S. Department of Energy to seek classified data and associations in documents and present its findings in the light of formal rules, requires a structured information base that represents not just individual facts but a set of implications and a collection of rules. The fundamental knowledge base is evolving towards forms that enhance flexibility and portability. The developers early realized that the knowledge base can be captured in XML by a series of trees that represent taxonomies, analytical structures, and specific indicative facts, but over this a topic map is needed to express links across the trees. Above this, the classification rules could form another topic map that points into the lower layers. In its latest form, however, the knowledge base has come to be entirely represented in a topic map. The 'Ferret' engine combines sophisticated searching with rule-driven analysis and reporting. In its original application, the Ferret engine performs the equivalent of 5,000 simultaneous searches while reading documents at several thousand words per second. The analysis traces implications of concepts discovered in searching and applies the rules for interpreting implications and the actions to be taken when a significant piece of information is found. Because the topic maps that represent this knowledge can be switched easily, Ferret can be reprogrammed to many tasks, including selection and categorization, scanning of e-mail and newsfeeds, diagnostics, and query expansion, in addition to the original classification application..." [From the Conclusion:] "When we began work on the Ferret system, our goal was simply to construct a tool to help the ADCs review documents. . . 
The first knowledge base was actually based on one derived from the slow prototype we had studied. We realized that design was not maintainable and moved from it to our earliest XML representation. We eventually realized we needed to divorce the knowledge base from any connection to legacy technologies and to concern ourselves only with capturing the intellectual relationships among its components. By treating the Ferret engine as a black box and building the knowledge base using the XTM model, we have achieved a form in which the base will be both portable and maintainable, as well as potentially usable for more than simply controlling the Ferret engine. Even as the knowledge base has evolved, we have been rethinking the uses of the Ferret technology. Besides using it for its original purpose as an ADC's assistant, we have already used it for categorization projects and for scanning e-mail. We believe that with appropriate knowledge bases, Ferret could serve as a diagnostic tool or a mechanism for expanding queries. We are considering extending the reporting mechanism to write out new topic maps as the engine analyzes documents. The new topic maps might assist us in representing analytical results in processes like classification, or they could serve as indexes for searching the documents that have been analyzed. If we are able to merge generated topic maps with those already in a knowledge base, we believe that we will have created an engine that is self-training within certain domains. As the topic-map technology gains acceptance and support, topic-map tools from other sources may appear that we can integrate with the Ferret engine, creating even more interesting tools. Conversion of the knowledge base structure from its original form to topic maps is, I believe, the key to future growth of uses for our analytical engine..." 
Noted in JMason's trip report (Report of Official Foreign Travel to Germany 17 May-1 June 2001): "I presented a paper on the use of topic maps for building the knowledge base for the Ferret classification engine developed by Y-12. I had previously presented a preliminary approach to an XML knowledge base at an August 2000 GCA conference in Montréal. The current approach represents the entire knowledge base in the XTM application; the paper was well received." See: "(XML) Topic Maps." [cache]

  • [June 19, 2001] "Document Object Model (DOM) Level 3 XPath Specification Version 1.0." W3C Working Draft 18-June-2001. Edited by Ray Whitmer (Netscape/AOL). Latest version URL: http://www.w3.org/TR/DOM-Level-3-XPath. "The W3C DOM Working Group has published a first public Working Draft of the Document Object Model (DOM) Level 3 XPath Specification. This is the result of discussions from the 'www-dom-xpath' mailing list, feedback from the 'xml-dev' mailing list, and work within the W3C DOM Working Group." The draft specification "defines the Document Object Model Level 3 XPath; it provides simple functionalities to access a DOM tree using XPath 1.0. This module builds on top of the Document Object Model Level 3 Core." Background: "XPath is becoming an important part of a variety of many specifications including XForms, XPointer, XSL, CSS, and so on. It is also a clear advantage for user applications which use DOM to be able to use XPath expressions to locate nodes automatically and declaratively. But liveness issues have plagued each attempt to get a list of DOM nodes matching specific criteria, as would be expected for an XPath API. There have also traditionally been object model mismatches between DOM and XPath. This proposal specifies new interfaces and approaches to resolving these issues..." Available as a single HTML file; also in PDF and PostScript formats. See: "W3C Document Object Model (DOM)." [cache]

  • [June 19, 2001] "Style sheets can write style sheets too. Making XSLT style sheets from XSLT components." By Alan Knox (Software Engineer, IBM, Hursley Park, Hampshire, England). From IBM developerWorks. June 2001. ['XSLT style sheets can be used to dynamically transform XML to complex presentation markup for browsers -- but if the presentation is complex, the style sheet will be too. What's needed is some tool that can build complex style sheets from simple components. Since XSLT is itself XML, XSLT can be manipulated with XSLT; style sheets can write style sheets. This article shows how an XSLT style sheet that performs some particular runtime transformation can be built from XSLT components.'] "Another developerWorks article, 'Spinning your XML for screens of all sizes,' discusses problems with writing and managing style sheets that present the same XML basketball statistics on many display devices. The solution involved writing a parameterized style sheet that produces HTML with varying degrees of data content, and then transcoding the output from that style sheet for a specific device using WebSphere Transcoding Publisher. This is an effective and easy solution for many scenarios, but you lose some control over what appears on a user's screen. If you want: (1) Absolute control over what users see, (2) To tune presentation of your application to give the best possible experience on each device, (3) To make use of particular features of a device... Then you have to solve the problems of generating numerous, complex style sheets. This article demonstrates a no-compromise solution that uses the same basketball XML data... XSLT is a declarative language, where the templates that make up a style sheet are independent of each other. XSLT style sheets can be composed from other style sheets using the import and include mechanisms. 
With suitable care, you can independently develop a number of separate component style sheets that can be put together to make the presentation style sheet that will be applied to the XML data at runtime. These components will be broadly of three types, that deal with: Presenting dynamic XML data [the basketball data in my example]; Reusable bits of presentation mark-up, such as button bars; The residue of the page..." Article also available in PDF format. For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

  • [June 19, 2001] "Freddie Mac, Fannie Mae Agree To XML Standard." By Robert Bryce. In Interactive Week (June 18, 2001). "Freddie Mac and its larger cousin, Fannie Mae, are fierce competitors in the secondary mortgage market. But the two companies could make paperless real estate purchases a reality, thanks to their support for Web-based mortgage transactions. Both organizations have agreed to support eXtensible Markup Language (XML) standards established by the Mortgage Industry Standards Maintenance Organization. That support will likely mean dramatic changes for hundreds of companies, from lenders to vendors, that participate in the domestic housing market. That market is expected to generate roughly $1.52 trillion worth of new mortgages this year. But the mortgage process is slowed and made more expensive by the reams of paper needed for loan applications, credit reports, surveys and other documents... Freddie Mac estimates that moving to the MISMO standard could reduce origination costs for each loan by up to $700 - savings that would be passed on to consumers. With 2000 revenue of $30 billion and $44 billion, respectively, Freddie Mac and Fannie Mae are the monsters of the mortgage business. Together, they buy about two-thirds of all conventional single-family home mortgages. The two companies are government-sponsored enterprises created by Congress - and given access to federally backed lines of credit - to help promote home ownership. They buy mortgages from lenders, package them as securities and resell them on the open market to investors... Like many other companies, Freddie Mac and Fannie Mae want to minimize the amount of paper they handle. The MISMO was created in January 2000 to create an XML format that could be used from the initial loan application through the 'securitization' of that loan on Wall Street and the servicing of the loan by vendors. 
'The obstacle is getting consensus across the industry for what we call "data",' said Gabe Minton, senior director, industry technology, at the Mortgage Bankers Association of America, which oversees the MISMO. The current version of the MISMO standard enables the translation of 2,000 terms, including the borrower's name and address. But as the standard evolves and more paper-based elements of the mortgage process are converted into XML, the number of terms that the standard will define will likely exceed 4,000, Minton said..." See "Mortgage Bankers Association of America MISMO Standard."

  • [June 19, 2001] "The Microsoft Shared Development Process." Microsoft white paper. June, 2001. "The Microsoft Shared Development Process (SDP) provides a mechanism for fast, focused and profitable collaboration on key technology initiatives between Microsoft and industry partners... As we enter a new paradigm in computing, we find an increased call for integration and interoperability, which requires an even closer working relationship across companies and across industries. The emergence of XML-based web services as the new computing model signals a shift away from standalone applications and networks - disconnected islands of information - to one where constellations of applications, devices and services work together. This shift in the computing model requires a change in the way we design and build technology. It's no longer enough to build standalone functionality; we also have to focus on how a particular technology works with others. While XML Web services provides the new integration methodology, we also need new ways for the industry to come together to tackle new challenges. The SDP is designed to provide an easy, flexible and reusable process for Microsoft and industry partners to collaborate. The SDP is structured on the assumption participants are motivated by business success and any cooperation has the objective of growing the industry and expanding profitable opportunities. Unlike projects developed under some open source licenses, the SDP is respectful of intellectual property and will balance goals of protecting intellectual property rights with other goals encouraging widespread adoption of the new technology developed under the SDP... The SDP is designed to let Microsoft and third parties determine how best to address a common computing problem or challenge. Broadly speaking, the SDP scope has three categories which address different models of cooperation dynamics and intellectual property models... 
Type 3 projects involve cooperation across the industry to enable a better technology solution for many companies and their customers. Examples of this project type might include the development of industry-wide XML schema that describe a common set of data to be shared across applications in a given industry segment. In these cases, the resulting intellectual property will be licensed broadly to the industry, and in some cases may end up getting turned over to existing standards bodies and other cross-industry organizations. Many Type 3 projects will not involve Microsoft directly, but rather a set of interested companies who will take advantage of the process tools and collaboration resources that the SDP will offer to drive a broad industry solution to a particular issue..." See also (1) the announcement: "Microsoft Announces Shared Development Process for Cooperation On Key Technology Initiatives. Company Issues Call to Industry to Join in Definition and Development of 'HailStorm' Services."; (2) "Microsoft Hailstorm."

  • [June 19, 2001] "Microsoft makes a push for .Net gains." By Joe Wilcox. In Yahoo News [CNET News.com] (June 19, 2001). "Microsoft on Tuesday solidified its software-as-a-service strategy, officially naming a forthcoming high-end version of Windows and releasing new tools for software developers. During a keynote speech at the TechEd 2001 conference in Atlanta, Microsoft Chairman Bill Gates also introduced the new Shared Development Process (SDP) program, supporting the company's .Net software services strategy. If successful, the program's working groups and other features could help Microsoft establish HailStorm, the first .Net offering, as the standard for delivering services over the Internet. . . In addition, Microsoft released the final beta, or test, version of software used by developers: Visual Studio.Net, which includes Visual Basic.Net and the .Net Framework. Visual Studio.Net is expected to provide important tools to support Microsoft's drive to move its Windows operating system and software to the Web. The development package includes updated versions of Visual Basic and C++ and adds the first version of C#, a software programming language designed to facilitate the building of Web-based software... In the longer term, SDP may be the more important announcement made by Microsoft on Tuesday, according to analysts. Through the program, Microsoft plans to establish working groups and industry dialog focused on .Net services, starting with HailStorm. Through HailStorm, which uses Microsoft's Passport authentication service, the company plans to provide secure, for-fee Internet services, such as e-mail, address lists and other personal data, to virtually any type of device. The technology uses XML (Extensible Markup Language), an HTML-like programming language for creating complex data delivered over the Web. But Microsoft has been advocating proprietary schemas, or XML vocabularies, that work better with its products. 
Microsoft's XML dialect would favor Windows and Office -- two products, according to Dataquest, that have a market share of better than 90 percent. Microsoft could use its dominance in the mature markets as a lever for entering other, emerging markets, Sutherland said." See: (1) "The Microsoft Shared Development Process.", and (2) the announcement: "Microsoft Announces Shared Development Process for Cooperation On Key Technology Initiatives. Company Issues Call to Industry to Join in Definition and Development of 'HailStorm' Services."

  • [June 19, 2001] "BizTalk Automates B-to-B. [Review.]" By P.J. Connolly. In IT World (June 18, 2001). "Today's conventional wisdom holds that XML is the key to helping businesses work together, at least from the standpoint of merging information from disparate systems. But by itself, XML can't do anything to help. Someone has to define the extensions to the XML schema, the structure that the two partners are going to use when exchanging data. . . BizTalk Server 2000 is in some ways Microsoft's most ambitious product yet in terms of its effect on back-end operations. Most businesses that have streamlined their processes over time have done so internally with great success, but things often break down at the front door. Even the most successful EAI (enterprise application integration) or EDI (electronic data interchange) projects will have some sort of disconnect. BizTalk Server 2000 is constructed to remedy that situation by using SOAP (Simple Object Access Protocol) and XML to glue systems together electronically. It is a unique product that any business using EAI/EDI should consider. BizTalk Server is aimed at processing business documents, such as bills of lading, invoices, and purchase orders, as secured e-mail-like messages. These functions require sophisticated features such as document tracking and once-only delivery to provide the reliability needed for business-to-business transactions... The BizTalk Framework, although agnostic regarding message transport protocols, allows BizTags to carry transport-specific information. After it receives application-generated business documents, the BizTalk Server creates BizTalk messages that contain one or more BizTalk documents, which are generated either by the BizTalk Server or by the application used to create the original business document. The BizTalk Message is then sent to the partner's BizTalk Framework-compliant server which unwraps the message and passes it on to the partner's application... 
Three client-side tools that analysts and developers use to configure the data flow are included: BizTalk Editor, BizTalk Mapper, and BizTalk Orchestration Manager. Editor is used to create and edit XML schemas, whereas Mapper handles the XSLT (Extensible Stylesheet Language Transformations) style sheets that convert data between XML schemas. Orchestration Manager, which uses Visio 2000, allows analysts to design a data flow and developers to translate that design into action. We found the BizTalk components easy to set up and use, and we were particularly impressed with BizTalk Orchestration Manager. We've used Visio before and have found it a great design tool, so we had little difficulty using it as the front end for Orchestration Manager. The GUI uses a Visio diagram split down the middle: Analysts create flowcharts on the left side, and developers, working on the right side, link the various functions from the flowchart to COM (Component Object Model) objects and message queues, also using the modified XML schemas as needed. BizTalk, with a little help from Visio's Visual Basic for Applications component, automatically applies the changes..." See "BizTalk Framework."

  • [June 18, 2001] "TechEd: Microsoft touts Web services support in .NET." By Tom Sullivan. In InfoWorld (June 18, 2001). "Microsoft demonstrated Web services support across several of its enterprise servers in the opening keynote of the TechEd developer's conference here Monday. Paul Flessner, vice president of .NET servers, tried to prove that Microsoft's server software products can compete against the traditional players in the market. Dan Kusnetsky, an analyst with Framingham, Mass.-based IDC, said Microsoft's capacity to compete in the enterprise has increased, particularly where enterprises prefer to string smaller servers together than to use a single machine. . . Also in the keynote, Flessner announced the availability of Mobile Information Server 2001 and showed how it can be used to deliver Exchange information to wireless handsets. Flessner brought product managers Don Kagan and Chris Ramsey on stage to demonstrate Content Management Server, which Microsoft acquired from NCompass Labs. Flessner also showed off the next generation of SQL Server, code-named Yukon, and explained its support for XML and the Common Language Runtime..." See the announcement: "Microsoft Drives XML Web Services Integration Through .NET Enterprise Servers. Company Announces Availability of Mobile Information Server, Demonstrates Content Management Server and Takes SQL Server Past 1 Billion Dollars."

  • [June 18, 2001] "Tech Giants Update E-Commerce Standard." By Stephen Shankland. From CNET News.com. June 18, 2001. "A gaggle of computing giants will release Monday a new version of a key Web standard that provides some common ground on how competitors such as Microsoft, IBM and Sun Microsystems view the future of the Internet. In September, Microsoft, IBM and Ariba proposed a standard called Universal Description, Discovery, and Integration (UDDI). The standard allows businesses to register with an Internet directory that will help companies advertise their services, so they can find one another and conduct transactions over the Web. The online yellow pages directory that UDDI provides is a key part of how 'Web services' plans such as Microsoft .Net and Sun One will work together despite corporate differences. Since last year, Sun, Hewlett-Packard, Oracle and others have joined the UDDI initiative, and the first working version of the UDDI directory was launched in May. But on Monday, the companies plan to announce the second version of the standard. The new version comes with several improvements. Among them is better support for different languages; more sophisticated searching features; the ability to describe company organizational structures such as divisions, groups and subsidiaries; and more specific business categories that companies can use to describe themselves... Registry services on the Internet are essential for Web services to succeed, and so far UDDI looks like the only option, said Gartner Group analyst Daryl Plummer. Plummer believes UDDI initially will be used in private arrangements among business partners -- for example, Home Depot could use a UDDI-based service that finds light-switch suppliers and ranks them according to pricing and availability of light switches. But UDDI faces a thorny issue: whether it will become an industry standard. 
Such a move would reduce the control the founding members have but could make UDDI more palatable to others by making it more neutral. UDDI organizers have said they plan to turn it over to a standards body, but that likely won't happen in the immediate future, Plummer said." See the 2001-06-18 announcement and references in "Universal Description, Discovery, and Integration (UDDI)"

  • [June 18, 2001] "A Topic Map Data Model. An infoset-based proposal." By Lars Marius Garshol (Ontopia A/S) and Hans Holger Rath. TMQL [Topic Maps Query Language] Project. Reference: ISO/IEC JTC 1/SC34 N0229. June 18, 2001. "This document defines an abstract model for topic maps which makes explicit the implicit data models of ISO 13250 and XTM 1.0. It also defines a processing model for XTM 1.0 based on the data model. The model is intended to present one possible approach to specifying a data and processing model for topic maps, believed by the author to be preferable to other proposed approaches. It is hoped that this model may represent a first step on the way to a complete model for topic maps. Such a model would serve many purposes: (1) Enable interoperability between topic map processors by defining precisely what topic map processors are required to do. (2) Enable ancillary standards to be built on the topic map standard in a precise and controlled manner. (3) Make it easier for newcomers to topic maps to understand what their abstract structure is and how they work... This document is not complete; it is an early draft intended to show a possible approach to defining the topic map model. In particular, this document has no official standing whatsoever. It is, as stated above, just a draft proposal... The abstract model for topic maps here presented is inspired by the XML Infoset, and uses a similar system of information items with named and typed properties..." See: "(XML) Topic Maps." [cache, and alternate source, from Ontopia]

  • [June 18, 2001] "The Agricultural Ontology Server: A Tool for Knowledge Organisation and Integration." Food and Agriculture Organization of the United Nations (GILW), Rome. June 2001. "At FAO, we are committed to helping combat and eradicate world hunger. Information dissemination is an important and necessary tool in furthering this cause -- we need to provide consistent, usable access to information for users in places doing this very work. And, the wide recognition of FAO as a neutral international centre of excellence for agriculture positions it perfectly to lead in the development of system specific agricultural ontologies. The Agricultural Ontology Server (AOS) will be instrumental in this effort by structuring agricultural terminology, thus making describing, defining and relating this information manageable for distributed facilities, and by standardising agricultural terminology, thus making resource access and discovery more efficient. The AOS will function as a central common reference tool for serving ontologies. Itself an ontology using the AGROVOC thesaurus as its core, it will contain and serve terms, definitions of those terms and the relationships among those terms. It is designed to serve as a focal point for the vocabulary of the agricultural domain, and to codify and standardise the knowledge within this domain. It will serve common core terms and relationships, as well as richer relationships that designate it as an ontology... The elements of the AOS will need to be encoded within the RDF framework. Common terms and definitions and their associated relationships from the core of the AOS will be identified by Universal Resource Identifiers (URIs) and stored in this common framework. (XTM is a parallel standard in development that may provide richer associations for better encoding.) To enable the second task, the AOS will use XML language to communicate among systems for the exchange of the URIs to build ontologies. 
The systems interested in utilising the AOS will need to use this language to be capable of interoperability. The conjunction of these standards will enable the communication of machine-readable commonly used URIs among a variety of different tools. In the case of the AOS, this type of communication will allow ontologies created by multiple tools -- their terms, definitions and relationships -- to be shared, evaluated and maintained using the central AOS.... The advent of XML (eXtensible Markup Language) provides the ability to share knowledge across different tools, using a standard schema. The RDF (Resource Description Framework) standard allows storage and sharing of metadata (data about resources) across systems. The topic mapping language, XTM (XML Topic Maps), currently in development, may provide even stronger functionality for the use of metadata. These new standards allow us to leverage controlled vocabularies in the development of common methods for describing, defining and relating resources. Briefly defined, the Agricultural Ontology Server (AOS) will function as a central common reference tool for serving ontologies. An ontology is a system that contains terms, the definitions of those terms, and the specification of relationships among those terms. It can be thought of as an enhanced thesaurus -- it provides all the basic relationships inherent in a thesaurus, plus it defines and enables the creation of more formal and more specific relationships. The AOS, using the AGROVOC thesaurus as its core, is designed to serve as a central focal point for the vocabulary of the agricultural domain, and to codify and standardise the knowledge within this domain. It enables better communication within and across systems, and structures the meaning contained within systems..." See also "Draft Specification for DC-based Application Profile for Agricultural Information."

  • [June 15, 2001] "Sun Fortifies Java Development. Forte for Java 3.0 lets developers create, publish, and subscribe to XML-based Web services" By Ron Copeland. In InformationWeek Issue 841 (June 11, 2001), page 85. "If your company is a Java shop looking for tools to more easily build, assemble, and deploy enterprise applications as Web services, the latest version of Sun Microsystems' Forte for Java could help. It's one of the first development environments to offer such capabilities. Forte for Java 3.0 lets developers use Enterprise JavaBeans components not only to build enterprise applications, but also to create, publish, and subscribe to XML-based Web services. The development toolkit is available on the Web (eap.netbeans.com) as part of the Forte for Java Early Access Program. Forte for Java is a cross-platform integrated development environment for Linux, Solaris, and Windows platforms, and it's based on the NetBeans open-source development environment... One way in which the new release differs from the Forte for Java 2.0 is that it's based on the latest version of the NetBeans open-source project, which added a dozen or so modules to its code base. These modules simplify Java development and address a broad range of issues, including integration with Apache's Ant XML script tool, improved application-server support, and, perhaps most significantly, Simple Object Access Protocol-based Web-services generation and deployment. Highlights of Forte for Java 3.0 include wizards and templates for creating and packaging Enterprise JavaBeans and associated Web components. Java developers will be able to build sophisticated Web services applications rapidly, without the need for significant coding. Using an XML services registry, Java components are packaged as Web services for run-time access and execution..." See also the Sun feature article: "Forte ESP Toolkit Integrates Web Design Tools and XML Technologies."

  • [June 15, 2001] "Topic Maps, NewsML and XML-Possible Integration and Implementations." By Soelwin Oo (Software Developer, Research and Development, empolis UK). 2001. See the larger collection of technical papers. "This paper will discuss how the integration of different Topic Map based technologies can lead to the development of powerful knowledge based resource retrieval systems. It will discuss in detail the possible implementation for integrating a data resource that supports Topic structures with the knowledge embodied within a Topic Map. It will discuss this using examples of technology currently being developed by empolis illustrating the possible architecture of such a system and its potential real world use. Finally, the paper will investigate the potential for further integration and scalability of the system with other Topic Map resources. More specifically, it will elaborate on the possible hurdles and pitfalls that may arise from the integration of data from multiple resources and the possible need for managing ontologies originating from different sources... NewsML is a structured flexible framework based on XML developed by the IPTC (International Press Telecommunications Council) for electronic news based publication. It supports the representation of news items and the relationships between these news items in an XML based structure. Because NewsML possesses associated metadata concerning its news content, it provides the ability for having multiple representations of the same information along with provision for handling arbitrary mixtures of media types, languages and formats. The prime interest towards NewsML within the scope of Topic Maps is that NewsML possesses metadata concerning Topics that provide the ontology of its news content. This 'news item ontology' puts forward an appropriate example for an opportunity to 'capture' concepts presented by an XML based format that supports Topic structures. 
Once the base ontologies used within NewsML are present within a Topic Map, an application can process NewsML documents and present to the user the instances of the base ontologies that are associated with a NewsML document. This will then present a content driven approach for navigation of a Topic Map because the user's starting point will be the base ontologies instantiated by the NewsML document..." See "NewsML and IPTC2000" and "(XML) Topic Maps."

  • [June 15, 2001] "XAS: A System for Accessing Componentized, Virtual XML Documents." By Ming-Ling Lo, Shyh-Kwei Chen, Sriram Padmanabhan, and Jen-Yao Chung (IBM T. J. Watson Research Center). Paper presented at the Twenty Third International Conference on Software Engineering (ICSE 2001). May 12-19, 2001. Published in the conference proceedings, pages 493-502 (with 26 references); available from the IEEE Computer Society. "XML is emerging as an important format for describing the schema of documents and data to facilitate integration of applications in a variety of industry domains. An important issue that naturally arises is the requirement to generate, store and access XML documents. It is important to reuse existing data management systems and repositories for this purpose. We describe the XML Access Server (XAS), a general purpose XML based storage and retrieval system which provides the appearance of a large set of XML documents while retaining the data in underlying federated data sources that could be relational, object-oriented, or semi-structured. XAS automatically maps the underlying data into virtual XML components when mappings between DTDs and underlying schemas are established. The components can be presented as XML documents or assembled into larger components. XAS manages the relationship between XML components and the mapping in the form of document composition logic. The versatility in its ways to generate XML documents enables XAS to serve a large number of XML components and documents efficiently and expediently."

  • [June 15, 2001] "Draft requirements, examples, and a 'low bar' proposal for Topic Map Constraint Language." By Steve Pepper (Project Editor). ISO/IEC JTC 1/SC34 N226. The User Requirements include: (1) TMCL shall permit the definition of classes of topic maps in order to: [a] enable the documentation of the structure and semantics of a class of topic maps; [b] provide a foundation for defining vertical or domain specific applications of topic maps; [c] provide means of validation to ensure consistency within a topic map or across a class of topic maps; [d] enable applications to provide easier and more intuitive user interfaces for creating and maintaining topic maps; [e] enable the separation of the tasks of modeling and populating topic maps. (2) TMCL shall be based on the Topic Map Data Model (and therefore support both XTM and ISO 13250 Topic Maps). (3) TMCL shall not attempt to cover every possible constraint. Instead it should provide a solution for the most commonly required kinds of constraints and, at the same time, an extension mechanism to allow the expression of less common constraints by other means. (4) TMCL shall provide for modularization, and the ability to extend individual sets of constraints through reference to others. (5) TMCL shall be expressible as XML, using the topic map interchange syntax where applicable. (6) TMCL shall build on pre-existing specifications and established best practice for knowledge representation and data modeling where possible. (Candidates for consideration include DAML/OIL, KIF, OKBC, OCL, PAL (Protégé Axiom Language), and XML Schema.) (7) TMCL shall be as concise and human-readable as possible within the terms of the preceding requirements." Cf. the NWI proposal cited below. From the Recommendations of May 2001 Meeting of ISO/IEC JTC1/SC34/WG3 in Berlin: "WG3 submits N221 as a New Project Proposal for a Topic Map Constraint Language to support ISO/IEC 13250. 
SC34 requests its secretariat to forward this document to JTC1 for ballot. WG3 submits N226 as draft requirements, proposes Steve Pepper (Norway) as acting editor and instructs the acting editor to prepare a final requirements document." See: "(XML) Topic Maps."

  • [June 15, 2001] "Topic Map Constraint Language [TMCL]. Proposal For a New Work Item." ISO/IEC JTC 1/SC34 N221. 23 May 2001. Motivated because "a constraint language is needed to build templates for topic maps conforming to ISO/IEC 13250." The new work would address "mechanisms for expressing constraints on classes of topic maps conforming to ISO/IEC 13250:2000." "Purpose and justification: (1) To enable the documentation of the structure and semantics of a class of topic maps. (2) To provide a foundation for defining vertical or domain specific applications of topic maps. (3) To provide means of validation to ensure consistency within a topic map or across a class of topic maps. (4) To enable applications to provide easier and more intuitive user interfaces for creating and maintaining topic maps. (5) To enable the separation of the tasks of modeling and populating topic maps... This project will be part of a series of Standards and Technical Reports that contribute to the implementation and understanding of ISO/IEC 13250, Topic Maps." Compare the SC34 N226 Draft Requirements for TMCL, cited above. See: "(XML) Topic Maps."

  • [June 15, 2001] "XML-Lit." A communique from Rafael R. Sevilla: "I've started a new XML literate programming project I call XML-Lit.... 'XML-Lit: A simple XML-based literate programming system' This project is somewhat inspired by a very simple program by Jonathan Bartlett called xmltangle. XML-Lit is a simple literate programming system that you can use with any XML-based markup language to make your literate program..." From the introduction to the documentation: "I recently found a simple program called xmltangle by Jonathan Bartlett that provides a simple literate programming system based on DocBook. I have been somewhat frustrated by that program though; for one thing, it did not allow program code snippets to be enclosed within CDATA sections, which would make including a program inline a lot easier to do, and easier to read on screen while you're editing it, especially with programming languages that are chock full of <'s such as the typical C program, or worse yet, an XSL stylesheet, which I planned to use Jonathan's program for. So I set off to create a complete rewrite of the program, which uses James Clark's expat XML parser. So now, I have come up with my own simple literate programming system, xml-lit which takes a similar approach, but instead of enclosing code snippets within DocBook <programlisting/> tags, I define a new namespace xml-lit which (for now) contains a single tag <xml-lit:code> which has a single attribute named file which gives the name of the file to which the code it encloses should be output. This eliminates the program's dependency on DocBook, so it can be used with any XML-based document markup language (such as XHTML). It's a very simplistic system, but it's able to do the task for which it was designed. The program is also backward-compatible with Jonathan's work given a command line switch..." See the source code download and online documentation. See "SGML/XML and Literate Programming."
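
The "tangle" step described in this entry (collect the contents of each code element and route them to the file named by its file attribute) can be illustrated with a short sketch. This is not Sevilla's program: it is written in Python rather than against expat directly, the namespace URI `urn:example:xml-lit` is a placeholder because the entry does not give the real one, and the sample document is invented. It returns the assembled text per file name instead of writing to disk.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Placeholder namespace URI (assumption); the entry does not state the real one.
XML_LIT_NS = "urn:example:xml-lit"


def tangle(document):
    """Collect the text of every xml-lit code element, grouped by its 'file' attribute."""
    root = ET.fromstring(document)
    chunks = defaultdict(list)
    # iter() visits elements in document order, so chunks keep their written order.
    for elem in root.iter("{%s}code" % XML_LIT_NS):
        target = elem.get("file")
        if target is None:
            continue  # a code element without a target file is skipped in this sketch
        chunks[target].append(elem.text or "")
    return {name: "".join(parts) for name, parts in chunks.items()}


# Invented sample: an XHTML-like document with two code chunks in CDATA sections.
SAMPLE = """<html xmlns:xml-lit="urn:example:xml-lit">
  <p>First we include the header.</p>
  <xml-lit:code file="hello.c"><![CDATA[#include <stdio.h>
]]></xml-lit:code>
  <p>Then the main function.</p>
  <xml-lit:code file="hello.c"><![CDATA[int main(void) { puts("hi"); return 0; }
]]></xml-lit:code>
</html>"""

if __name__ == "__main__":
    for name, source in tangle(SAMPLE).items():
        print("--- %s ---" % name)
        print(source, end="")
```

Because the parser hands CDATA content to the application as ordinary text, the C code needs no escaping of its < characters, which is the convenience the entry highlights.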

  • [June 15, 2001] "Three Myths of XML." By Kendall Grant Clark. From XML.com. June 13, 2001. ['XML has it all, not only an interoperable syntax but a solution to bring world peace, end poverty and deter evil dictators. Kendall Clark debunks these and other popular myths of XML.'] "... The possibilities of social change brought about by technology are limited as much by the social and historical contexts within which technology comes into existence as they are by intrinsic features of the technology itself. This general point is perhaps never so true as when it's applied to two specific areas of computer technology, both of which concern XML.com readers directly: the Semantic Web and, of course, XML. In what follows I debunk three myths of XML, each of which in some way bears on the question of the role of technology in social change: (1) The first myth rests on a confusion about the meanings of words like 'free' and 'open' when they are applied to XML-encoded information. (2) The second myth is that XML is magical, that it has some unique properties that makes impossible things possible. (3) The third is that technology, including XML, is more determinative of social relations and institutions than they are of it..."

  • [June 15, 2001] "Perl and XML: Perl XML Quickstart: Convenience Modules." By Kip Hampton. From XML.com. June 13, 2001. ['The third and final part of our guide to Perl XML modules covers some handy modules geared to specific tasks.'] "This is the third and final part of a series of articles meant to give quick introductions to some of the more popular Perl XML modules. In the last two months we have looked at the modules that implement the standard XML APIs and those that provide more Perlish XML interfaces. This month we will be looking at some of the modules that seek to simplify a specific XML-related task. Unless XML is a significant part of your daily life, chances are good that the more generic XML API modules will seem like overkill. Perhaps they are. If your needs are modest, a module probably exists that will reduce your task to a few method calls. These single purpose, convenience modules are a key entry point to the Perl/XML world, and I have chosen a few of the more popular ones for this month's code samples. In the interest of clarity, we will limit the scope of the examples to the common tasks of creating XML documents from other data sources, converting HTML to XHTML, and comparing the contents of two XML documents. While many of the XML API modules provide a way to create XML documents programmatically based on data from any source, several modules exist that simplify the task of creating XML documents from data stored in other common formats. We'll illustrate how to create XML documents based on data extracted from CSV (Comma Separated Value) files, Excel spreadsheets, and relational databases..." See: "XML and Perl."
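
The CSV-to-XML task mentioned in this entry can be sketched briefly. The example below is in Python using only the standard library, not the Perl convenience modules the article covers; the function name, tag names, and sample data are all illustrative assumptions.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="records", row_tag="record"):
    """Turn CSV text with a header row into an XML string, one element per row."""
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, row_tag)
        for column, value in row.items():
            # Assumes every header name is also a valid XML element name.
            ET.SubElement(record, column).text = value
    return ET.tostring(root, encoding="unicode")


# Invented sample data.
SAMPLE_CSV = "name,language\nXML::Parser,Perl\nexpat,C\n"

if __name__ == "__main__":
    print(csv_to_xml(SAMPLE_CSV))
```

Using the header row as element names keeps the sketch short; real data often needs a mapping step, since column headers may contain spaces or other characters that are not allowed in element names.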

  • [June 15, 2001] "X Marks (up) the Language." By Eric Bohlman. From XMLPerl.Com June 2001. ['Second in a series of articles written by Eric Bohlman. This article gives a good overview of how to parse XML with Perl, and almost as important, how NOT to parse XML.'] "When we talk about parsing a language, we mean the process of taking a piece of code or data written in that language and breaking it up into its constituent parts as defined by the rules of that language. Parsing is an essential task for any program that wants to use language-based data or code as input... There are basically two ways a parser can make the components of an XML document known to an application: it can read through the document and signal the application every time a new component appears, or it can read the entire document and then present the application with a tree structure corresponding to the element structure of the document. A parser that works the first way is called a stream-based or event-driven parser; one that works the second way is called a tree-based parser. Two common terms that you'll hear are SAX and DOM; SAX (Simple API for XML) is a specification (developed informally by members of the xml-dev mailing list) for how a stream-based parser should "talk" to an application; DOM (Document Object Model) is a specification (a formal Recommendation of the W3C) for how an application can access and manipulate the tree structure of a document. Whether to use a stream-based or tree-based parser depends on the nature of the processing being done to the XML documents and the size of the documents. A tree-based parser usually has to load the entire document into memory, which may be impractical when processing documents like dictionaries or large database dumps. With a stream-based processor, you can skip over elements that you aren't interested in (for example, when looking up a particular word in a dictionary). 
But if your application needs to process certain elements in relation to other elements (for example, reading a bibliography and extracting a list of all authors who have published at least three articles on the same topic), a tree-based parser is much easier to work with. It's worth noting that a tree-based parser can be built on top of a stream-based parser, and that the output of a tree-based parser can be "walked" to provide a stream-based interface to an application. As of this writing, all the Perl tree-based parsers are of the former type. Perl Modules for Parsing XML: As of this writing, there are four "mainstream families" of XML parsers available as Perl modules, all of which are available from CPAN: [XML::Parser, XML::DOM, XML::Parser::PerlSAX, XML::Grove] All of these modules provide object-oriented interfaces; if you're not comfortable with object-oriented programming in Perl, now is the time to review the perltoot and perltootc manpages that come with Perl. If you have ActiveState's ActivePerl for Win32, you already have XML::Parser installed, since ActivePerl's PPM utility uses XML documents to store the installation requirements for modules... Next month we'll continue talking about XML parsing and we'll look at tree-based parsing." See: "XML and Perl."
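
The contrast between stream-based and tree-based parsing drawn in this entry can be made concrete with the article's own two examples: a dictionary lookup and a bibliography query. The sketch below uses Python's standard library (`xml.sax` for events, `xml.etree.ElementTree` for the tree) rather than the Perl modules listed above; the document shapes, element names, and data are invented for illustration.

```python
import xml.sax
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample documents.
DICTIONARY = """<dictionary>
  <entry><word>parser</word><meaning>a program that breaks input into parts</meaning></entry>
  <entry><word>tree</word><meaning>a hierarchical structure</meaning></entry>
</dictionary>"""

BIBLIOGRAPHY = """<bibliography>
  <article topic="xml"><author>Ann</author></article>
  <article topic="xml"><author>Ann</author></article>
  <article topic="xml"><author>Ann</author></article>
  <article topic="xml"><author>Bob</author></article>
  <article topic="xml"><author>Bob</author></article>
  <article topic="perl"><author>Bob</author></article>
</bibliography>"""


class LookupHandler(xml.sax.ContentHandler):
    """Stream-based: react to events and remember only the meaning of one word."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.word = ""
        self.meaning = None
        self.buffer = []

    def startElement(self, name, attrs):
        self.buffer = []  # start collecting text for the new element

    def characters(self, content):
        self.buffer.append(content)

    def endElement(self, name):
        text = "".join(self.buffer).strip()
        if name == "word":
            self.word = text
        elif name == "meaning" and self.word == self.wanted:
            self.meaning = text
        self.buffer = []


def lookup(xml_text, wanted):
    """Find one word's meaning without building a tree of the whole document."""
    handler = LookupHandler(wanted)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.meaning


def prolific_authors(xml_text, minimum=3):
    """Tree-based: relate elements to each other across the whole document."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        (article.findtext("author"), article.get("topic"))
        for article in root.iter("article")
    )
    return sorted({author for (author, _topic), n in counts.items() if n >= minimum})


if __name__ == "__main__":
    print(lookup(DICTIONARY, "tree"))
    print(prolific_authors(BIBLIOGRAPHY))
```

The handler keeps only a few strings of state however large the dictionary is, while the bibliography query is simpler to express once every article is available in memory, which is the trade-off the excerpt describes.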

  • [June 15, 2001] "XML-Deviant: What You See Isn't What We Want." By Leigh Dodds. From XML.com. June 13, 2001. ['Getting back to basics, we take a look at the best way of getting your documents marked up in XML.'] "'How do I convert my Word documents to XML?' This one has cropped up in several forums and appears regularly on XML-DEV. And, like many seemingly simple questions, there are a variety of answers. If the intention is to simply convert a small number of documents to XML or HTML suitable for publishing on the Web, then using the built-in Save As XML/HTML option is a good starting point. But the results of this are messy to say the least. A great deal of Word specific cruft is left in the resulting document. This has led to the production of numerous tools capable of cleaning up the mess, as well as others, like Omnimark, that provide an alternative conversion facility. In some cases what is being asked is much more ambitious. Users may have a large number of documents that must be converted, and they may want to continue to use Word as an authoring tool for the generation of structured XML documents conforming to a particular schema, for use in a publishing system, or document repository. These are the users who are plainly keen to gain some of the widely advertised advantages of XML by moving their documentation out of a proprietary format. In these circumstances it seems that the received wisdom is to roll up your sleeves and begin coding. The key technique is to use Word Styles (user-defined formatting properties) as markers for particular document structures (paragraphs, lists, headings, etc.) and then use scripts or macros to generate markup based on this styling information. Further manipulation with XSLT, for example, can further refine the results to yield the desired format. Rather surprising for users who may be seeking an off-the-shelf solution..."
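
The style-driven technique this entry describes can be sketched in a few lines. The sketch assumes the paragraphs have already been pulled out of Word as (style name, text) pairs; that extraction step is not shown, and the style names, element names, and function name are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Word style names to target element names.
STYLE_TO_ELEMENT = {
    "Heading 1": "title",
    "Body Text": "para",
    "List Bullet": "item",
}


def styled_paragraphs_to_xml(paragraphs, root_tag="document"):
    """Generate markup from (style, text) pairs, using the style as a structural marker."""
    root = ET.Element(root_tag)
    for style, text in paragraphs:
        # Unmapped styles fall back to a generic paragraph so that no text is lost.
        tag = STYLE_TO_ELEMENT.get(style, "para")
        ET.SubElement(root, tag).text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = [("Heading 1", "Intro"), ("Body Text", "Hello"), ("List Bullet", "One")]
    print(styled_paragraphs_to_xml(sample))
```

A flat result like this is the kind of intermediate output that a later XSLT pass, as the excerpt suggests, could regroup into nested lists or sections.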

  • [June 15, 2001] "Interwoven aims to rally XML migration. Repository allows incremental conversion from HTML." By Cathleen Moore. In ITWorld (June 14, 2001). "Interwoven Inc. rolled out two content management infrastructure products on Tuesday aiming to boost enterprise control of content reuse and distribution. The company's TeamXML is an XML repository designed to help enterprises adopt XML. By allowing users to convert individual Web assets and content components to XML on an as-needed basis, the repository allows enterprises to implement a phased XML migration strategy, according to Interwoven officials. According to Kevin Cochrane, vice president of product management at Sunnyvale, California-based Interwoven, allowing corporate users to control the migration process will help speed the adoption of XML. TeamXML also offers the ability to store XML objects in native form, which boosts the performance and scalability of content, Interwoven officials said. According to Rob Perry, senior analyst at The Yankee Group in Boston, native storage for XML may help drive adoption in companies that are trying to make the move to XML. Interwoven also released OpenSyndicate, a content distribution product aimed at giving enterprise users the ability to control the assembly of content packages..." See details in "Talking XML with Mark Hale, Standards Architect, Interwoven." See also the recent announcements from Interwoven: (1) "Interwoven Announces TeamXML. Next-Generation Object Store to Accelerate Adoption of XML Across the Enterprise. Interwoven Extends XML Leadership with Architecture based on Native XML-Object Model" and (2) "Interwoven Announces OpenSyndicate. Business Managers To Take Direct Control of Content Distribution. Interwoven Pioneers Intelligent Content Distribution."

  • [June 15, 2001] "Inside UDDI." By Richard Karpinski. In InternetWeek (June 07, 2001). "Later this month, UDDI.org will unveil version 2.0 of its specification for helping companies find each other via the Internet. With the backing of more than 280 companies, the Universal Description, Discovery and Integration Registry looks to have staying power. Yet many enterprises haven't even started to tap its power yet. We spoke with Chris Kurt, the program manager for UDDI.org (and Microsoft's group program manager for Web services) to get a nuts-and-bolts look at how UDDI works -- and how IT can get started using it today. . . UDDI provides an XML-based method for businesses to describe themselves and the Web-based services they offer. The UDDI Business Registry is the public database where companies register themselves. Public UDDI registries are now fully operational. Beta testing wrapped up in early May. IBM and Microsoft are running the public databases. Ariba dropped out, but Hewlett-Packard will launch a third registry later this year. The power of UDDI is the power of ad-hoc discovery of new business partners and processes. If the emerging world of Web services is to flourish, companies need a seamless, automated way to find other businesses on the Internet and determine if their systems and applications are able to work together via the Web. In short, UDDI lets companies do three things: (1) Discover each other; (2) Define how they can interact via the Internet; and (3) Share all this information via an open, global registry... UDDI is a good example of what happens when developers begin thinking about delivering apps as services. The registry is lightweight (it doesn't hold information but links to it); message-based (connections are made by passing XML documents rather than hard-coded integration); and supports highly-distributed apps (even though the look-up database itself is centralized in several locations). Today, UDDI requires too much manual work. 
The true power of UDDI will come when development tools automatically create the WSDL files to describe newly-created apps and deliver them seamlessly to the public UDDI databases. Also important will be UDDI links within key enterprise apps, such as ERP, supply chain and procurement. Such apps should one day be able to expose the Web services they offer as part of their installation process. UDDI is all about ad-hoc business relationships -- 'discovery,' as its name implies. To that extent, long-time business partners may share their Web services more directly. But as e-business grows, says UDDI.org's Kurt, companies will regularly be evaluating new suppliers, as well as seeking an automated way to learn about the new Web services and interfaces exposed by existing trading partners. Public UDDI registries augmented with private supply community UDDI databases should be able to take care of this gamut of e-business relationships. Meanwhile, version 2.0 of UDDI -- slated to be unveiled this month -- will among other improvements support richer taxonomies to better reflect the complexity of enterprises and the different types of Web services they aim to describe. There's no doubt that UDDI -- and the Web services model it aims to support -- is in its infancy. But so far the UDDI.org group has moved quickly toward public implementations and kept the politics at a minimum." See: "Universal Description, Discovery, and Integration (UDDI)."

  • [June 15, 2001] "Microsoft Brings Keyword Search to UDDI." By Ashlee Vance. In InfoWorld (June 15, 2001). "Microsoft and RealNames teamed on a keyword-based searching service Thursday for the UDDI registry, adding one of the first new features to a directory that has been billed as a 'Yellow Pages' for the Internet. The UDDI (Universal Description, Discovery, and Integration) registry aims to make it easier for businesses to provide information about their products and services on the Web as well as locate partners and customers. A number of registries that use differing protocols already exist on the Web, but Microsoft, IBM, and Hewlett-Packard have joined the UDDI effort as a way to make business-to-business commerce on the Web work more smoothly. The vendors claim that thousands of businesses have signed up to use UDDI. Microsoft maintains one of the registry sites where companies can enter information about their business. The software maker is teaming with RealNames to make UDDI-related keywords accessible through the address bar in the Internet Explorer browser, said Christopher Kurt, group program manager for UDDI and Web Services at Microsoft. RealNames removes the need to type in sometimes hard-to-remember Web addresses by allowing companies to register simple keywords -- such as the name of a company or a product. When a user types in one of those keywords, they are taken to the Web sites of the company that registered the word, Kurt said. The system competes with a similar keyword service operated by America Online. In the context of UDDI, users will be able to type UDDI followed by a company name or portion of a company name into the address bar of Internet Explorer -- for example, 'UDDI flowers.' The results would show a list of the businesses registered in UDDI that have flowers in their name. The service could be used by anyone, from a home user shopping for a cricket bat to a large manufacturer in need of raw materials, officials said. 
The keyword search will also take into account a user's location, returning searches based on the language spoken in the user's locale. When businesses sign up to use the RealNames service, they will be pointed to Microsoft's UDDI registry site in an attempt to encourage growth of the registry. Eventually, users will be able to submit their information to the UDDI registry directly from the RealNames site, said Nico Popp, chief technology officer of RealNames. The UDDI system, which was launched last month, contains three types of information, divided into what the vendors refer to as White, Yellow, and Green pages... Microsoft, IBM, and HP will maintain the servers that collect the registry information for about the next year, at which time the project will be turned over to an as-yet unnamed standards body. Updates to the registry are scheduled to appear throughout 2001, with more complex features being added for varying types of business-to-business transactions. Companies can register their information in the UDDI registry at no charge." See (1) the main reference page "Universal Description, Discovery, and Integration (UDDI)", and (2) the announcement: "Microsoft and RealNames Announce Registration And Navigation Services for UDDI Initiative. Businesses Publish UDDI Records and Receive Worldwide Exposure Through Internet Explorer Browser When Registering Keywords."

  • [June 13, 2001] "XML Takes Root in Catalog, Database Publishing. XML becoming ubiquitous as SGML never could. [The Latest Word.]" By George Alexander. In Seybold Report: Analyzing Publishing Technology [ISSN: 1533-9211] Volume 1, Number 5 (June 04, 2001), pages 27-28. "Tools for publishing catalogs and database-derived publications (directories, reference books, etc.) depend on a combination of database and layout tools. Vendors of such publishing tools have latched onto XML enthusiastically and are finding all kinds of uses for it. XML-based approaches vary a lot in their complexity, ranging from support for simple file exchange to the use of XML internally. Recent announcements from some catalog and database-publishing vendors give an idea of some of the possibilities... It's striking that so many diverse uses for XML have surfaced in this single group of related applications. Clearly, there was an existing need for a file format that provides the combination of structure and flexibility found in XML. Within the next year or so, we expect that virtually every catalog and database will offer some support for XML import and export. It's a logical thing to do, and customers will be asking for it. But note that 'XML support' will not generally mean 'no custom conversion needed' when transferring data to another system. With the exception of a few selected vertical markets, there are no standards for exactly how the XML file coming out of or going into a database should be tagged. This means custom conversion routines (albeit relatively simple ones) will still be needed in most cases. There will be room for additional development in this area in the years to come. Other less obvious uses for XML, such as the API approach that Boheads is using, or the internal use of XML by Datazone, will no doubt continue to surface over the next few years as well..." 
[The article summarizes a variety of ways XML is used by publishers in document- and database-oriented applications.]

  • [June 13, 2001] "CDC Solutions integrates, extends with Xtensia. Experienced U.K. vendor pitches automated personalized document production as an enterprise capability." By Mark Walter. In Seybold Report: Analyzing Publishing Technology [ISSN: 1533-9211] Volume 1, Number 5 (June 04, 2001), pages 25-26. "Earlier this month, database-publishing veteran CDC Solutions announced plans to offer its multichannel publishing products in a configuration consistent with the enterprise application integration platforms that are emerging in the industry. Called Xtensia, the new platform is built upon the company's existing product modules and expertise in on-demand, customized PDF and XML-based publishing. What's new is the way it expects to see customers integrate the technology with other business systems in enterprise-level applications as well as in departmental point solutions. With Xtensia, CDC is applying its skill in aggregation, personalization and production to material that is structured but unformatted (or at least not frozen as PDF). It offers prebuilt product modules for specific tasks and is in the process of creating standard interfaces for those products to facilitate their integration with other systems. The standard product modules are in the areas of aggregation, personalization and production... CDC has written a new multipage composition engine based on Adobe's PDF libraries. Called Xssembler, it takes well-formed or valid XML as its input and creates composed PDF files as output. Typically Xssembler takes advantage of aggregation and personalization features in specific applications. For example, in the legal department at one of CDC's clients, Xssembler creates personalized contracts based on boilerplate material and variable-data fields. The resulting documents may then be sent to the next step in the cycle -- back to a Web page, on to an e-mail server or directly to a printing device. 
Where authenticity is a concern, CDC has watermarking capability that it can apply as part of the process... One downside of the XML-based Xssembler is that it is not yet as designer-friendly as the PDF-based product. Where PDFfusion enabled firms to create their PDF overlays using any layout tool they wanted, Xtensia is a command-line program that currently lacks a graphical tool for creating its layouts -- designers end up writing XSL style sheets using ASCII editors. Jim Cook, CDC's chief technology officer, admitted that 'There is a gap in the market right now for a good, graphical style sheet editor,' and CDC has not yet developed its own, expecting that someone will fill that gap in the very near future... CDC's concept -- a network agent that compiles and composes custom documents -- sounds just like dynamically built Web sites, but with a key difference: Xtensia can make good-looking printed documents, which precious few Web production systems do."

  • [June 13, 2001] "New Contenders in Cross-Media Publishing Systems." By Stephen Edwards, with Luke Cavanagh, and Mark Walter. In Seybold Report: Analyzing Publishing Technology [ISSN: 1533-9211] Volume 1, Number 4 (May 21, 2001), pages 4-13. "Two significant new players in the cross-media system business, Seinet [Xtent] and EidosMedia [Méthode], made their debut at Seybold Seminars Boston last month. Their key innovations include the use of XML to support media-neutral structure with media-specific decisions. They are developing the capability to write a story to fit a space, in print or on a Web page, without sacrificing its reusability. We view them as the first of a new breed of editorial content-management systems that have cross-media awareness built in. Newspapers are the target customers, but catalog and journal publishers will also want to take note... A significant difference between these [two] new systems and the old ones is their use of XML to have both well-formed, normalized structure and media-specific decisions and intent. They give editors and designers interactive feedback regarding how the content will appear when delivered, regardless of delivery mechanism. They bring XML up front to the authoring process, rather than downstream as a post-conversion process, such as is offered with Avenue.Quark. This has been done many times before in content-driven publications. Unfortunately, because most previous SGML and XML-based systems were designed for content-driven publications, they negated the importance of media-specific and product-specific decisions, and so skimped on visual feedback and tools for media-specific decisions. Even though the Seinet and EidosMedia systems are aimed initially at newspapers, they parallel the future direction for cross-media systems for catalogs, journals and a variety of other publishing genres. . . 
The XML implementations of these two companies provide an important benefit we want to elaborate on here: the ability to create only one version of each article and publish it in multiple editions for multiple media. This feature -- attempted also by Atex with Omnex -- breaks from the traditional approach of newspaper systems, which has required the user to create a new version of a story every time it is edited for use in a different edition. In contrast, both Seinet and EidosMedia can use a single file for all editions, with XML tags defining which portions of a story will be published in each edition. For example, if different headlines are required for the Web and print editions, both headlines are contained in the same file, and on output the XML tags determine how the headlines are used. Similarly, a summary of a story to be used in a digest on the Web can be included right in the story. And, where hard-core newspaper editors like to edit a story for each use, a single story can encompass all such changes. This approach offers several important conveniences, besides simplifying the process of searching for specific versions. It reduces the amount of editing, cutting and pasting among files as multiple versions are created. And, when last-minute changes are required, such as in correcting errors, it speeds the process and promotes accuracy by limiting the changes to a single occurrence... The innovation of Seinet, EidosMedia and other newspaper vendors is in applying XML to layout-intensive applications, where editors and designers make editorial decisions -- such as the wording of a headline -- in the context of specific layouts. To do this, they've written their own XML editing programs that allow media-specific markup to be inserted in a way that does not corrupt the cross-media structural markup of the article. The implementations represent a leap for XML into a whole new market for publishing systems."
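[Illustration: the single-file, multi-edition idea described in the excerpt can be sketched in a few lines of Python. This is only a rough sketch and not the markup used by Seinet or EidosMedia, which the article does not show. The element names, the `edition` attribute, and the `render` helper are all hypothetical. Content without an `edition` attribute is treated as shared; content tagged for another edition is dropped on output.]

```python
import xml.etree.ElementTree as ET

# Hypothetical story markup: tag and attribute names are illustrative only,
# not taken from any vendor's actual schema.
STORY = """
<story>
  <headline edition="print">Council Approves Budget</headline>
  <headline edition="web">City Council Approves 2001 Budget After Long Debate</headline>
  <summary edition="web">The council passed the budget on Tuesday.</summary>
  <body>
    <p>The city council voted on Tuesday to approve the budget.</p>
    <p edition="print">See page 4 for the full breakdown.</p>
  </body>
</story>
"""

def render(story_xml, edition):
    """Return 'tag: text' lines for one edition from the single shared file."""
    root = ET.fromstring(story_xml)
    lines = []
    for elem in root.iter():
        # Skip pure container elements; they carry no text of their own.
        if elem.tag in ("story", "body"):
            continue
        tagged = elem.get("edition")
        # Keep shared content (no attribute) and content tagged for this edition.
        if tagged is None or tagged == edition:
            lines.append(f"{elem.tag}: {elem.text.strip()}")
    return lines

print_lines = render(STORY, "print")
web_lines = render(STORY, "web")
```

[A correction made once in the shared paragraph appears in both `print_lines` and `web_lines`, which is the accuracy benefit the excerpt describes.]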

  • [June 13, 2001] "W3C Blesses XML Schema. Milestone passed for building XML-enabled Web applications, services and technologies. [Standards.]" By Mark Walter. In Seybold Report: Analyzing Publishing Technology [ISSN: 1533-9211] Volume 1, Number 4 (May 21, 2001), page 3. "After two years of development, refinement and testing, the W3C has approved XML Schema 1.0 as a recommendation, marking a major milestone in the evolution of XML. The new standard creates an underlying common syntax for data interchange and technology integration on the Web, complementing the document-oriented features of XML itself... XML Schema makes two substantial improvements to XML: datatypes and integration with namespaces. Bringing datatypes to XML makes it more suitable for use with databases and the fielded information -- identifiers, numbers, dates, etc. -- often stored in databases. The integration with XML Namespaces makes it easier to validate and resolve documents that make use of multiple tag vocabularies. The W3C has issued a tool as well, an XML Schema Validator called XSV that it co-developed with the University of Edinburgh in Scotland. XSV has been revised at each stage of XML Schema development and now validates against the final spec. In addition, the W3C is inviting developers to send in sample schemas for a test-suite library, to be reviewed and managed by the W3C XML Schema Working Group. . . The passing of XML Schema by the W3C may have been a foregone conclusion, but it nevertheless represents an important milestone. Users can now rightfully demand that vendors comply, and developers have a basis for creating Web services and technologies that exchange data in an open fashion. As with most other HTML and XML standards, we like what's emerged from the W3C committee process, and we expect most vendors to nod their heads in agreement. True compliance, however, will be up to end users to demand and enforce."

  • [June 13, 2001] "NetLibrary Adopts OEB. Drops proprietary format and conversion services, cuts 90 jobs." By Mike Letts. In Seybold Report: Analyzing Publishing Technology [ISSN: 1533-9211] Volume 1, Number 4 (May 21, 2001), pages 42-43. "Last January, in a move designed to cut costs, online library service provider NetLibrary began charging publishers for the free conversion services it originally provided. With sales sputtering, the company found that its inhouse conversion services were a financial liability. Now, less than three months later, NetLibrary has scrapped all on-site conversion services for its NetLibrary service and cut nearly 90 jobs in the process. The conversion policy for MetaText, an interactive digital textbook developer acquired by NetLibrary last year, will remain unchanged. In addition, the company also announced that it has dropped its use of a proprietary format based on Folio Views in favor of the XML/HTML-based publication structure developed by the Open e-Book Forum (OeBF, www.openebook.org) consortium. Under the company's new service policy, publishers can submit any electronic file meeting OeBF standards, or NetLibrary will outsource any necessary conversion labor to another facility and act as middleman between the publisher and the conversion house..." See "Open Ebook Initiative."

  • [June 13, 2001] "Webjaz in at the Examiner. Two modules used in daily production." By Stephen Edwards. In Seybold Report: Analyzing Publishing Technology [ISSN: 1533-9211] Volume 1, Number 4 (May 21, 2001), page 36. "Harris' Jazbox publishing platform has been installed at the San Francisco Examiner where the paper is using two of the system's key modules: Newsjaz, for print publishing, and Webjaz, for Web publishing. The Examiner Web site provides evidence that Webjaz works. Harris reported that the site is being handled by one technical person, compared with 11 that it needed prior to being split off. Harris has continued to add to the Webjaz functionality, resulting in a solid product. It handles all file types, stores all elements in XML (the text-editing window doesn't show XML tags, but a separate preview window displays them), and supports XSL style sheets. It provides a scheduling facility for automatically publishing a story at a specified time and automatically archiving it later. It offers automatic e-mail messaging and supports version control, audit trails and database replication. If different versions of a story are required for different media, Harris creates multiple versions rather than using the same file... For print publishing, Harris has been focusing on the integration of InDesign with Newsjaz, which was started at the request of the Examiner."

  • [June 12, 2001] "Archaeological Data Models and Web Publication Using XML." By J. David Schloen (The Oriental Institute of the University of Chicago). In Computers and the Humanities (CHUM) Volume 35, Number 2 (May, 2001), pages 123-152. "An appropriate standardized data model is necessary to facilitate electronic publication and analysis of archaeological data on the World Wide Web. A hierarchical 'item-based' model is proposed which can be readily implemented as an Extensible Markup Language (XML) tagging scheme that can represent any kind of archaeological data and deliver it in a cross-platform, standardized fashion to any Web browser. This tagging scheme and the data model it implements permit seamless integration and joint querying of archaeological datasets derived from many different sources... see http://www-oi.uchicago.edu/ for the latest version of the XML Document Type Definition in which ArchaeoML is defined..." See also from "Electronic Publication of Ancient Near Eastern Texts": "David Schloen, an archaeologist in the University of Chicago's Oriental Institute, gave the final formal presentation on Saturday afternoon, entitled 'Texts and Context: Using XML to Integrate and Retrieve Archaeological Data on the Web.' Schloen noted that XML is as suitable for representing archaeological databases as it is for representing ancient texts. But whether the information is expressed in XML or in some other data format (e.g., a relational database), archaeologists need an appropriate data model that captures in a rigorous and consistent fashion the idiosyncrasies of units of archaeological observation, as well as the spatial and temporal interrelationships among them. Schloen proposes a hierarchical, 'item-based' data model, rather than the 'class-based' (tabular) data model which currently prevails. The item-based data model has the advantage of being straightforwardly represented in XML as a nested hierarchy of tagged elements with their attributes. 
Moreover, texts can be treated like any other type of artifact, as items in a spatial hierarchy with their own properties. Schloen concluded by presenting an XML tagging scheme dubbed ArchaeoML ('Archaeological Markup Language') which can represent any kind of archaeological data on any spatial scale, including the vector map shapes and raster images which belong to individual archaeological items..."

  • [June 12, 2001] "Forte ESP Toolkit Integrates Web Design Tools and XML Technologies." From Sun Microsystems. Feature story. June, 2001. "A new feature in the Forte for Java, release 3.0 Early Access software is the Forte for Java Enterprise Service Presentation Toolkit (ESP). This toolkit contains a set of enabling tools that simplify the development of Java 2 Platform, Enterprise Edition (J2EE) Web applications by integrating popular Web design tools and XML technologies. The toolkit is primarily directed at the application presentation layer, for example clients such as browsers, cell phones, and PDAs. Data sources can be any servlets, pages derived from JavaServer Pages (JSP) technology, JavaBeans, or Enterprise JavaBeans (EJB) components that deliver data as XML documents. The Forte ESP Toolkit offers the following benefits: (1) The roles of Web designer and programmer are clearly separated. (2) Web designers can author JSP pages that access dynamic XML data. (3) A single JSP page can access multiple data sources. (4) A single XML data source can be reused for multiple device types... The Forte ESP registry serves as an interface between the Web design and the programmer creating the back-end. The registry contains information about the location of XML data sources and the structure of the data, which can be entered in the registry by a programmer. The Forte ESP Toolkit also offers Web design tool extensions and a servlet that together read the registry, analyze the XML data, and give the Web designer a way to graphically map data to a page layout. The extensions insert custom tags in a JSP page to access the data dynamically at runtime. The design-time data analysis can use a sample, static XML document. This sample enables Web designers and back-end programmers to work in parallel based on their agreed data structure. Thus, the Web designer can do all the page layout, while the XML data source is still in development. 
This makes the whole team more productive, and the project is completed faster. The Forte ESP Toolkit also enables Web designers working with a single XML data source (created by a programmer) to generate JSP pages that incorporate XML data into HTML documents for browsers, cell phones, and PDAs. The Forte ESP Toolkit provides a JSP custom tag library that supports the embedding of Extensible Style Language (XSL) transformations (XSLT) in a JSP page. When one of these tags is called at runtime, the XSL processor transforms the XSL source document into the HTML needed by the presentation device. The programmers on a project can produce a single set of XSL sources, and the Web designers can use the Forte ESP Toolkit to automatically map those sources to the desired presentation device types and display formats..." See (1) the User's Guide, and (2) the overview in "Creating Web Services with Java Technology and XML."

  • [June 12, 2001] "TMQL Requirements (0.8.2)." Edited by Hans Holger Rath (empolis GmbH) and Lars Marius Garshol (Ontopia). "This document sets down the requirements that will guide the work with the Topic Map Query Language (TMQL), a query language for topic maps. The requirements herein presented document the intentions of the standards editors, as informed by the user community. Its purpose is to make it clear what can be expected to come out of the TMQL process, and to encourage the user community to make their needs known to the editors. This document has requirements for the TMQL standard as a whole, and for the query part of TMQL in particular. Additional requirements for the update part of TMQL will have to be defined at a later stage..." [Referenced in posting from Lars Marius Garshol. "WG3 proposes Hans Holger Rath (Germany) and Lars Marius Garshol (Norway) as editors of Topic Map Query Language (TMQL) and instructs them to produce a final requirements document for TMQL, and to prepare a response to comments from the National Bodies of the UK, US, and Japan... At the ISO SC34 meeting in Berlin in May I officially replaced Ann Wrightson as co-editor of the TMQL standard with Hans Holger Rath. Based on the previous requirements document put together by Ann and Holger, as well as the discussions in Berlin, Holger and I produced a new TMQL requirements document. This document presents the editors' views on what the TMQL standard should be like and how it should relate to other standards. The editors would very much like to see feedback from the topic map community on these requirements, in order to ensure that the editors and the community are in agreement on the requirements to be fulfilled by the standard before work begins in earnest..."] See: "(XML) Topic Maps."

  • [June 12, 2001] "State Courts Look to Pass Judgment on XML. Document-encoding technology seen by some in legal community as key to electronic filing services." By Ellen Messmer. In Network World Volume 18, Number 23 (June 04, 2001), page 10. "Lawyers, courts and legal cases generate mountains of paperwork, but a few states have taken the ground-breaking step to allow electronic filing of documents directly to court Web sites for processing over their intranets. While e-filing is catching on in states such as Georgia, New Mexico, California and Washington, the process of managing legal documents online raises thorny questions about the need for signatures, common security practices and technical standards for interoperability in document exchange. Counties today take varying approaches to e-filing, but there is a growing consensus that the document-encoding technology called XML can be the basis for statewide - and perhaps even nationwide - electronic filing. Georgia has led the charge, as its judiciary and universities have devised an XML tagging specification for the courts dubbed Legal XML. The specification will go on trial next week as four Georgia courts and four e-filing services show how it can be used to transmit XML-based documents to court servers and to competing e-filing services. These courts and document clearinghouses today can't easily share electronic documents. But the use of format-neutral XML tags encoded around content is expected to make it easier to process information received over the Internet as long as the application server receiving it supports XML, too... Georgia hopes to complete the testing of Legal XML by August, and if it works out, it's likely to be required for use in courts statewide. In addition, backers of Legal XML formed a nonprofit organization last winter (see www.LegalXML.org) to promote it as a national standard... 
'The XML language is the most powerful I've seen to help us accelerate use of e-filing,' says Bob March, clerk of court at the U.S. District Court in New Mexico, which has used e-filing for about three years. The New Mexico court is redesigning its court management system to support XML. The court in Albuquerque has a T-1 line for receiving legal documents processed through the @court hosted service for receipt by 14 judges..." See: "Legal XML Working Group."

  • [June 12, 2001] "Cool Graphics in XML." By Mark Gibbs. In Network World Volume 18, Number 23 (June 04, 2001), page 56. "The entire universe is going XML. Everywhere you turn, somebody is turning something into XML. This is, as Ms. Stewart is wont to say, 'A good thing.' The beauty of XML is it provides a whole new way of structuring 'stuff' that goes beyond just organizing data. With XML, data is imbued with meaning and purpose . . . wait a minute, that implicitly makes it information. Cool. Today, we'll look at one of the latest and most promising applications of XML - Scalable Vector Graphics (SVG)... SVG is terribly exciting if you're inclined toward trying whiz-bang graphics on the Web. If you're not, SVG is cool anyway. The reason for such coolness is what SVG can do. Quoting the W3C specification: 'SVG allows for three types of graphic objects: vector graphic shapes (e.g., paths consisting of straight lines and curves), images and text. Graphical objects can be grouped, styled, transformed and composited into previously rendered objects. The feature set includes nested transformations, clipping paths, alpha masks, filter effects and template objects. SVG drawings can be interactive and dynamic. Animations can be defined and triggered declaratively (that is, by embedding SVG animation elements in SVG content) or via scripting.' SVG has a document object model that provides access to all elements, attributes and properties in an SVG graphic document. There are also event handlers such as the ever-popular 'onmouseover' and 'onclick.' SVG's MIME type will be image/svg+xml when the W3C registers it as such -- apparently around the time when SVG is approved as a W3C recommendation (no date set). The specification also recommends that SVG files should have the extension .svg (all lower case) on all platforms..." See: "W3C Scalable Vector Graphics (SVG)."

  • [June 12, 2001] "XML worth a thousand pics." By Mark Gibbs. In Network World Volume 18, Number 24 (June 11, 2001), page 46. "Last week we were cruel and unusual - we gave you a chunk of Scalable Vector Graphics code but put off explaining it until this week. Now where were we. . . . The code [...] These are standard declarations that declare this is XML and that specify the Document Type Definition. DTD is a set of rules that define elements and attributes of an XML document and spell out how valid documents are structured. In effect, a DTD provides an integrity check on a specific type of XML content... This demo shows the basics of text and transformations under SVG. With SVG, there are three basic drawing elements: text, shapes and paths. Shapes include circles, squares and so on, while paths are chains of line segments that can optionally be specified as closed. You may have already surmised that SVG files, while relatively small, get complex very quickly. Hand coding SVG graphics is not for the faint of heart... To this end, a number of graphics tools have become available that support SVG images - for example, editors such as Adobe's Illustrator 9.0 and Jasc Software's WebDraw ... SVG is a standard to watch. Next week, we'll look at dynamic SVG. . ." See: "W3C Scalable Vector Graphics (SVG)."
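[Illustration: the three basic drawing elements named in the excerpt (text, shapes, paths) can be shown with a small Python sketch that emits one of each. The coordinates, colors, and the `build_demo_svg` helper are made up for this example and do not reproduce the column's demo code, which the excerpt elides. The namespace URI is the one used by SVG 1.0.]

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def build_demo_svg():
    """Build a tiny SVG document with one shape, one closed path, and one text element."""
    ET.register_namespace("", SVG_NS)  # serialize with a default namespace, no prefix
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": "200", "height": "120"})

    # Shape: a filled circle.
    ET.SubElement(svg, f"{{{SVG_NS}}}circle",
                  {"cx": "50", "cy": "60", "r": "30", "fill": "steelblue"})

    # Path: a chain of line segments; the trailing Z closes it into a triangle.
    ET.SubElement(svg, f"{{{SVG_NS}}}path",
                  {"d": "M 100 90 L 140 30 L 180 90 Z",
                   "fill": "none", "stroke": "black"})

    # Text, with a small rotation to show a transformation.
    text = ET.SubElement(svg, f"{{{SVG_NS}}}text",
                         {"x": "10", "y": "20", "transform": "rotate(5 10 20)"})
    text.text = "Hello, SVG"

    return ET.tostring(svg, encoding="unicode")

svg_markup = build_demo_svg()
```

[Writing `svg_markup` to a file with the `.svg` extension should give something an SVG-capable viewer can open. Even this toy drawing hints at why the column suggests using an editor rather than hand coding.]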

  • [June 12, 2001] "SVG Reference in SVG." [Notice posted by] Jiri Jirat. See the SVG. From the post: "Hello XML and SVG developers, we have tried to display our site navigation using SVG: Click on any keyword in the section keywords on http://www.zvon.org/index.php?nav_id=zvonindex... Notice also the difference in sizes - SVG wins with a huge margin (and it is not gzipped!)..."

  • [June 12, 2001] DOM Level 3 Abstract Schemas and Load and Save Specification Version 1.0. W3C Working Draft 07-June-2001. Edited by Ben Chang, Oracle; Andy Heninger, IBM; Joe Kesselman, IBM; Rezaur Rahman, Intel Corporation. Formerly known as 'DOM Level 3 Content Model and Load and Save'. "This specification defines the Document Object Model Abstract Schemas and Load and Save Level 3, a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of documents. The Document Object Model Abstract Schemas and Load and Save Level 3 builds on the Document Object Model Core Level 3... This chapter describes the optional DOM Level 3 Abstract Schema (AS) feature. This module provides a representation for XML abstract schemas, e.g., DTDs and XML Schemas, together with operations on the abstract schemas, and how such information within the abstract schemas could be applied to XML documents used in both the document-editing and AS-editing worlds. It also provides additional tests for well-formedness of XML documents, including Namespace well-formedness. A DOM application can use the hasFeature method of the DOMImplementation interface to determine whether a given DOM supports these capabilities or not. One feature string for the AS-editing interfaces listed in this section is 'AS-EDIT' and another feature string for document-editing interfaces is 'AS-DOC'. This chapter interacts strongly with the Load and Save chapter, which is also under development in DOM Level 3. Not only will that code serialize/deserialize abstract schemas, but it may also wind up defining its well-formedness and validity checks in terms of what is defined in this chapter. In addition, the AS and Load/Save functional areas will share a common error-reporting mechanism allowing user-registered error callbacks..." See: "W3C Document Object Model (DOM)." [cache]
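[Illustration: the feature test described in the excerpt can be tried with the DOM implementation in Python's standard library. This is a sketch only. The version string "3.0" passed with the 'AS-EDIT' and 'AS-DOC' feature names is an assumption, and `supported_features` is a hypothetical helper. The standard-library `xml.dom.minidom` is a small DOM and is not expected to offer the Abstract Schemas module, so the probe mainly shows how an application would ask.]

```python
from xml.dom.minidom import getDOMImplementation

def supported_features(candidates):
    """Map each (feature, version) pair to the implementation's hasFeature answer."""
    impl = getDOMImplementation()
    return {(name, version): impl.hasFeature(name, version)
            for name, version in candidates}

report = supported_features([
    ("Core", "2.0"),     # a feature minidom does advertise
    ("AS-EDIT", "3.0"),  # AS-editing interfaces (version string assumed)
    ("AS-DOC", "3.0"),   # document-editing interfaces (version string assumed)
])

for (name, version), ok in report.items():
    print(f"{name} {version}: {'supported' if ok else 'not supported'}")
```

[An application would typically branch on these answers and fall back to plain Core operations when the optional feature is absent.]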

  • [June 12, 2001] Document Object Model (DOM) Level 3 Core Specification Version 1.0. W3C Working Draft 05-June-2001. Edited by Arnaud Le Hors, IBM; Gavin Nicol, Inso EPS (for DOM Level 1); Lauren Wood, SoftQuad, Inc. (for DOM Level 1); Mike Champion, ArborText (for DOM Level 1 from November 20, 1997); Steve Byrne, JavaSoft (for DOM Level 1 until November 19, 1997). Latest version URL: http://www.w3.org/TR/DOM-Level-3-Core. "This specification defines the Document Object Model Core Level 3, a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of documents. The Document Object Model Core Level 3 builds on the Document Object Model Core Level 2." See: "W3C Document Object Model (DOM)." [cache]

  • [June 11, 2001] "Microsoft's Proposal for Directory Services Markup Language v2.0." Posted by Peter J. Houston (Microsoft Corporation, Group Program Manager, Active Directory). "The Directory Services Markup Language v1.0 (DSMLv1) provides a means for representing directory structural information as an XML document. DSMLv2 goes further, providing a method for expressing directory queries and updates (and the results of these operations) as XML documents. DSMLv2 documents can be used in a variety of ways. For instance, they can be written to files in order to be consumed and produced by programs, or they can be transported over HTTP to and from a server that interprets and generates them. DSMLv2 functionality is motivated by scenarios including: (1) A smart cell phone or PDA needs to access directory information but does not contain an LDAP client. (2) A program needs to access a directory through a firewall, but the firewall is not allowed to pass LDAP protocol traffic because it isn't capable of auditing such traffic. (3) A programmer is writing an application using XML programming tools and techniques, and the application needs to access a directory. In short, DSMLv2 is needed to extend the reach of directories. DSMLv2 is not required to be a strict superset of DSMLv1, which was not designed for upward-compatible extension to meet new requirements. However it is desirable for DSMLv2 to follow the design of DSMLv1 where possible... DSMLv2 focuses on extending the reach of LDAP directories. Therefore, as in DSMLv1, the design approach is not to abstract the capabilities of LDAP directories as they exist today, but instead to faithfully represent LDAP directories in XML. The difference is that DSMLv1 represented the state of a directory while DSMLv2 represents the operations that an LDAP directory can perform and the results of such operations. Therefore the design approach for DSMLv2 is to express LDAP requests and responses as XML documents. 
For the most part DSMLv2 is a systematic translation of LDAP's ASN.1 grammar (defined by RFC 2251) into XML-Schema. Thus, when a DSMLv2 element name matches an identifier in LDAP's ASN.1 grammar, the named element means the same thing in DSMLv2 and in LDAP..." See also comments in the posting. References: "Directory Services Markup Language (DSML)." Also the announcement: "Microsoft Furthers Adoption of Directory Standards."
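
To make the idea of expressing an LDAP request as an XML document more concrete, the following is a minimal Python sketch that serializes a directory search as XML. The element and attribute names used here (searchRequest, filter, equalityMatch, attributes) are illustrative assumptions loosely modeled on identifiers in LDAP's ASN.1 grammar; they are not quoted from the DSMLv2 proposal, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_search_request(base_dn, attr, value, wanted):
    """Serialize an LDAP-style equality search as an XML document.

    All element and attribute names are hypothetical illustrations,
    not taken from the DSMLv2 proposal.
    """
    req = ET.Element("searchRequest", {"dn": base_dn, "scope": "wholeSubtree"})
    # The filter mirrors LDAP's "attribute equals value" match.
    flt = ET.SubElement(req, "filter")
    match = ET.SubElement(flt, "equalityMatch", {"name": attr})
    ET.SubElement(match, "value").text = value
    # The list of attributes the client wants returned for each entry.
    attrs = ET.SubElement(req, "attributes")
    for name in wanted:
        ET.SubElement(attrs, "attribute", {"name": name})
    return ET.tostring(req, encoding="unicode")


if __name__ == "__main__":
    print(build_search_request("ou=People,dc=example,dc=com", "sn", "Smith", ["cn", "mail"]))
```

A document like this could be written to a file or sent in an HTTP request body, which is how a client without an LDAP stack (scenario 1 above) might reach a directory.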

  • [June 11, 2001] "Special Report: The Language Of XML Security." By Pete Lindstrom. In Network Magazine (June 2001), pages 56-60. ['While XML documents must be protected from prying eyes and the corrupting influences of the Internet, XML can also be a security tool for many applications.'] "XML provides a structured way to add context to data so that it can be shared among different applications. Where old systems used ASCII text in batch file transfers, XML systems support 'transitional datasets' with predefined data records that are processed in real time through message queues and application servers. But accompanying XML's many advantages are critical security issues. XML is primarily for Internet-based communications; thus, it provides the opportunity for others to sniff or spoof information. For example, while XML allows medical records to be more efficiently shared between multiple parties such as doctors, insurers, and pharmacists, security breaches with such data can have very adverse consequences... The primary security issues surrounding XML fall into two basic categories: the security of XML instances themselves, and the use of XML technology to enhance security for a wider range of applications. This article discusses the importance of this distinction, processes that make XML data more secure, and how to apply XML on a broader scale to fortify the security of data exchange... When ad hoc working groups began discussing XML security issues, they basically split into two different directions. One category of groups is focused on the security of XML instances and documents, regardless of their use. The primary topics in this area are the use of digital signatures applied to XML instances (XML Signature) and the encryption of all or partial XML instances (XML Encryption). Another category of groups seeks to leverage XML's capabilities to further broaden security functions. 
Their primary focus is on the activity of the Organization for the Advancement of Structured Information Standards' (OASIS) Security Services Technical Committee, currently at work on the Security Assertion Markup Language (SAML). Other initiatives include Verisign's XML work, in particular the XML Key Management Specification (XKMS) and XML Trust Assertion Services Specification (XTASS)..." See: (1) "XML Digital Signature (Signed XML - IETF/W3C)"; (2) "XML and Encryption"; (3) OASIS Technical Committee work on Security Assertion Markup Language (SAML) and Access Control.

  • [June 11, 2001] "Architectures in an XML World." By Joshua Lubell (National Institute of Standards and Technology). Paper prepared for Extreme Markup Languages 2001, Montréal, August 17, 2001. "An often overlooked method for schema and DTD reuse is the specification of architectures. An architecture is a collection of rules (for creating and processing a class of documents) that application designers can apply in defining their XML vocabularies. An XML document using an architecture contains special architecture-support attributes that describe how its elements, attributes, and data correspond to their architectural counterparts. Software tools for processing architectures are called architecture engines. APEX is a non-validating generic architecture engine written in XSLT. Input to APEX consists of an XML document plus stylesheet parameters identifying an architecture used by the document. APEX produces as output an architectural document conforming to the architecture specified by the stylesheet parameters and the input document's architecture support attributes. Experience with APEX demonstrates that architectures and XSLT are complementary and that architectures can fulfill a role not well served by alternative approaches to reuse." APEX description from the web site: "APEX (Architectural Processor Employing XSLT) implements a simple subset of the Architectural Form Definition Requirements (AFDR) specified in Annex A.3 of ISO/IEC 10744:1997. APEX behaves similarly to David Megginson's XAF package for Java and differs from the AFDR in the same ways as XAF. Input to APEX consists of an XML document plus stylesheet parameters identifying an architecture used by the document. APEX produces as output an architectural document, i.e., an XML document containing only the markup and data defined by the architecture specified..." For background, see the introduction to Architectures, Architectural Forms, Architecture Support Attributes, and Architectural Processing. 
Source: see the program listing. References: "Architectural Forms and SGML/XML Architectures." **2001-08-02 Note: see "Architectures in an XML World" online.
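
As a rough illustration of what an architecture engine does, here is a minimal Python sketch of architectural processing. It is not APEX (which is written in XSLT) and does not follow the AFDR in detail; it assumes a single architecture-support attribute whose value names the architectural counterpart of each element, and it simply drops unmapped elements while keeping their mapped descendants. The function name, the attribute name "doc", and the sample document are all hypothetical.

```python
import xml.etree.ElementTree as ET


def to_architectural(elem, form_attr):
    """Return the list of architectural elements derived from elem.

    An element carrying form_attr is renamed to that attribute's value.
    An element without it is dropped, but its mapped descendants are kept.
    Simplifying assumptions: other attributes, tail text, and the character
    data of dropped elements are all discarded.
    """
    children = []
    for child in elem:
        children.extend(to_architectural(child, form_attr))
    form = elem.get(form_attr)
    if form is None:
        return children
    out = ET.Element(form)
    out.text = elem.text
    out.extend(children)
    return [out]


if __name__ == "__main__":
    source = (
        '<memo doc="note"><to doc="recipient">Ann</to>'
        '<internal>draft 3</internal><body doc="text">Ship it.</body></memo>'
    )
    result = to_architectural(ET.fromstring(source), "doc")[0]
    print(ET.tostring(result, encoding="unicode"))
```

Under these assumptions the memo above is reduced to a note containing only the recipient and text elements; the internal element, which has no architectural counterpart, does not appear in the output.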

  • [June 11, 2001] "W3C XML Schema Made Simple." By Kohsuke Kawaguchi. From XML.com. June 6, 2001. ['The W3C XML Schema Definition Language can be easy to learn and use, claims Kohsuke Kawaguchi -- you just need to know what to avoid.'] "It's easy to learn and use W3C XML Schema once you know how to avoid the pitfalls. You should at least learn the following things. (1) Do use element declarations, attribute groups, model groups, and simple types. (2) Do use XML namespaces as much as possible. Learn the correct way to use them. (3) Do not try to be a master of XML Schema. It would take months. (4) Do not use complex types (why?), attribute declarations (why?), or notations (why?). (5) Do not use local declarations (why?). (6) Do not use substitution groups (why?). (7) Do not use a schema without the targetNamespace attribute (aka chameleon schema.) (why?) You won't lose anything by following these guidelines, as the rest of this article demonstrates. Too long to remember? Then try the one-line version: 'Consider W3C XML Schema as DTD + datatype + namespace'. The rest of this article justifies these recommendations... There are many pitfalls in XML Schema that should be avoided, which will make your life easier because you'll have less to learn. And you won't lose the expressiveness of W3C XML Schema..." Note: The "advice" in this article is obviously a collection of personal opinion; compare the work of Roger L. Costello in "XML Schemas: Best Practices Homepage."

  • [June 11, 2001] "XML Q&A: Big Documents, Little Attributes." By John E. Simpson. From XML.com. June 6, 2001. ['This month our Q&A column tackles storing large numbers of records in XML ("Q: How do I process a big XML document?"), and explains the use of attribute definitions in DTDs ("Q: I'm confused about specifying attribute values in a DTD").']

  • [June 11, 2001] "Transforming XML: Using the W3C XSLT Specification." By Bob DuCharme. From XML.com. June 6, 2001. ['For advanced XSLT use, the W3C's XSLT specification can be a handy tool. This guide helps you read the specification and clears up confusing terms.'] "The W3C's XSLT Recommendation (available at http://www.w3.org/TR/xslt) is a specification describing the XSLT language and the responsibilities of XSLT processors. If you're new to XSLT, the Recommendation can be difficult to read, especially if you're not familiar with W3C specifications in general and the XML, XPath, and Namespaces specs in particular. This month, I'd like to summarize some of the concepts and terms that are most likely to confuse an XSLT novice reading the XSLT Recommendation..." For related resources, see "Extensible Stylesheet Language (XSL/XSLT)."

  • [June 11, 2001] "XML-Deviant: Time for Consolidation." By Leigh Dodds. From XML.com. June 6, 2001. ['Is XML changing the way applications are being designed? If so, what tools should you use to model these applications?'] "As Edd Dumbill noted in last week's XML-Deviant, the XML Schema specification may have been delivered, but the discussions are far from over. The Schema Working Group have delivered a meaty specification, and it will take some time for developers to digest. Expectations have already been raised about the features that may be delivered in Schema 1.1, and the prospect of Schema 2.0 has already been considered. It's premature to begin thinking too much about what these specifications might encompass until there's sufficient Schemas experience to allow it to be assessed. We must not forget the continuing work on other schema languages, most notably RELAX NG, the unification of RELAX and TREX being carried out at OASIS. Michael Fitzgerald observed that it's 'some of the more important work happening in XML now.' Rick Jelliffe, a strong advocate of a plurality of schema languages, characterized the current situation as an interim period, and XML Schema 1.0 as a 'provisional' specification..." See "RELAX NG."

  • [June 09, 2001] "[W3C] XML Schema Tutorial." By Roger L. Costello (of xFront.com XML Technologies). The main tutorial is a PPT slide set with some 276 slides. The slides reference 36 worked examples and 14 lab exercises. From the June 9, 2001 update note: "The tutorial is now updated to the Recommendation specification (i.e., the latest W3C specification). It includes a complete set of labs with answers. All examples and lab answers are complete and have been validated using Henry Thompson's schema validator, xsv [self-installing Win32 .exe], which is bundled in with the tutorial (thanks Henry!). It also includes a Javascript program, written by Martin Gudgin, that enables you to use MSXML4.0 (thanks Martin!)... I have provided a number of DOS batch files (i.e., validate.bat, run-examples.bat, run-lab-answers.bat) to make it easy for you to schema validate your XML files. I am continually adding new material to this tutorial. Please check back periodically for updates..." Related references in "XML Schemas."

  • [June 08, 2001] "Understanding ebXML. Untangling the business Web of the future." By David Mertz, Ph.D. (Phenomenological unifier, Gnosis Software, Inc.). From IBM developerWorks. June 2001. ['ebXML is a big project with a lot of pieces. In this article David Mertz outlines how the pieces all fit together. This overview provides an introduction to the ebXML concept and then looks a bit more specifically at the representation of business processes, an important starting point for ebXML implementations. Two short bits of sample code demonstrate the ProcessSpecification DTD and a package of collaborations.'] "When you read about ebXML, it's difficult to get a handle on exactly what it is -- and on what it isn't. The 'eb' in ebXML stands for 'electronic business,' and you can pronounce the phrase as 'electronic business XML,' 'e-biz XML,' 'e-business XML,' or simply 'ee-bee-ex-em-el.' On one hand, ebXML seems to promise a grand unification of everything businesses do to communicate with each other. On the other hand, one could be forgiven for thinking that ebXML amounts to little more than a pious, but vacuous, declaration that existing standards are worth following. As with every 'next big thing,' the truth lies somewhere in the middle... Sorting out ebXML involves a few steps. Perhaps the first thing necessary for understanding the details of ebXML is to digest an alphabet soup of new acronyms and other special terms. There are a number of these terms in the sidebar (ebXML terminology) to consider before looking at the whole 'vision' of ebXML interactions. Additional terms fit into the entire system, but these particular terms make a good starting point. With this new vocabulary in mind, and a bit of the following background on where ebXML comes from, you can begin to make sense of how all of the differing processes in ebXML hold together. 
After describing what ebXML does (at least in outline) at the beginning of this article, a final section looks in more detail at the Business Process Specification Schema, which makes up one of the most important elements of ebXML's underlying infrastructure... The UN/CEFACT Modeling Methodology (UMM), which utilizes UML, may be instrumental in modeling the ebXML Business Processes. However, such modeling is simply a recommendation, not a requirement. In any case, since this article targets XML developers and does not address OOD (object-oriented design), it is more interesting herein to look at the representation of the models in XML documents conformant to the Business Process Specification DTD and XML Schema. The DTD (named 'ebXMLProcessSpecification-v1.00.dtd') appears, at this time, to be the primary rule representation. Both this DTD and a W3C XML Schema, which is (presumably) semantically and syntactically compatible, may be found in the EbXML_BPschema_1.0 recommendation... The approval of ebXML specifications is moving along at a fairly rapid pace (certainly for a standards organization). My own estimation is that it will take another year or two to shake out all of the issues and details for such an ambitious vision. It appears, however, that ebXML is on the way to widespread use a few years down the road. Now is the time, therefore, for businesses to begin a serious consideration of their own ebXML implementation plans." Note especially the sidebar, "ebXML Terminology." See: "Electronic Business XML Initiative (ebXML)."

  • [June 06, 2001] "W3C Works on Standards Development." By Stephen Lawson. In InfoWorld (June 1, 2001). "Conscious that the future success of e-commerce and Web services hinges on interoperability between different vendors' products, the World Wide Web Consortium (W3C) attempted to fill in some of the gaps in the current array of standards at its 10th annual World Wide Web Conference last month in Hong Kong. The W3C's painstaking standards work is critical for enabling companies to use the Web for commerce, according to Roger Cutler, a senior staff research scientist at the Chevron Information Technology division of Chevron U.S.A. and a member of the W3C for the past year. With this in mind, IBM revealed it is preparing to propose to the W3C a new standard dubbed the WSFL (Web Services Flow Language). WSFL is designed to describe how a series of functions would work in providing Web services, according to Robert Sutor, director of e-business standards strategy at IBM in Somers, N.Y. It would help developers and corporate users specify the many pieces they need to plug into workflow applications or business processes and the sequence in which they should operate, he says. For example, WSFL might include a way to describe how well a service, such as a transaction engine, should perform. That would enable a Web services provider to guarantee QoS (quality of service), according to Sutor. Getting the industry to agree on such workflow standards has been a common problem over the years, Sutor says, acknowledging that other initiatives compete with the WSFL, including ones from the Workflow Management Coalition and the Business Process Management Initiative. But Sutor senses a growing desire among many to consolidate the various proposals and technologies into one, and adds that he doesn't see why intelligent compromises can't be made. '[WSFL] is not something we are trying to force down people's throats as the de facto standard. 
We think it has a lot of good ideas in it that are very consistent with some of the other Web services standards people are working on,' Sutor says. Sutor feels confident that the W3C will get a Web Services workflow group chartered by year's end, which can bring a number of proposals together into a single, cohesive standard..." See: "Web Services Flow Language (WSFL)."

  • [June 06, 2001] "Microsoft Continues Web Service Leadership With New XML Specs." By David Smith [Gartner Internet Strategies]. 25-May-2001. ['Microsoft's posting of specifications for three XML technologies again shows its leadership in developing Web service standards and may herald another cooperative effort with IBM to get a new standard approved by the World Wide Web Consortium (W3C)'] "...The announcement of these new specifications indicates Microsoft's continued leadership in XML standards development. Microsoft has previously demonstrated with SOAP, WSDL and, to some extent, UDDI that its first step toward standardization of these technologies is to post the specifications publicly. Six to 12 months later Microsoft submits the specifications to a standards organization, typically the W3C. That body's working group for XML Protocols (XMLP), which focuses on standardizing specifications for Web service technologies, is scheduled to produce its final recommendations by August 2001 and to disband by April 2002. Gartner believes Microsoft's introduction of these three technologies will convince W3C to extend the XMLP Working Group to focus on additional technologies and will extend the group by at least one year (0.8 probability). The addition of SOAP-RP allows SOAP to be routed through intermediate transports. Although SOAP 1.1 was already independent of HTTP transport, a single transport was required for an entire SOAP interaction. DIME allows for richer binary content such as images and audio to be more efficiently handled in an infrastructure optimized for XML text-based payloads. XLANG, the language implemented in BizTalk, which allows orchestration of Web services into business processes and composite Web services, is perhaps the most important of the three new specifications. Microsoft previously achieved recognition for WSDL by working with IBM. 
History may repeat itself here since IBM now has a similar technology to XLANG: In April, IBM published WSFL (i.e., Web Services Flow Language). Gartner expects IBM and Microsoft to jointly agree to submit a proposal to W3C that combines XLANG and WSFL by year-end 2001 (0.7 probability)..." See: (1) "XLANG" and (2) "Microsoft Publishes XML Web Services Specifications."

  • [June 06, 2001] "Web Services and XML Technologies CD." From IBM developerWorks. Announced in a posting from Jeffrey I Condon. "The Web services CD contains a selection of tools, examples, and articles for designing and developing Web services applications. It contains all the Web services applications that IBM has released to the general public, in addition to other useful tools that you may need. In addition to the software, the CD contains a set of all recent articles that have been published on the developerWorks Web services zone. These articles provide background information, tutorials, and news covering the protocols, techniques, and code used for creating Web services... IBM developerWorks is offering a Web Services and XML technology CD containing the following resources: All IBM developerWorks Web services articles; IBM Web services whitepapers; IBM developerWorks Web services newsletter; Web Services ToolKit; WSDL ToolKit; Web services Development Environment; Web Services Process Management ToolKit; Gourmet2Go Web services application; AggregationDemo Web services application; The IBM MQSeries transport for SOAP; Web Services Browser plug-in; WebSphere Preview Technologies for Developers; Tivoli Management Extensions for Java (TMX4J)." Contact: Jeffrey I Condon.

  • [June 06, 2001] "XML for Visio Scenarios." From Microsoft Corporation. June 2001. ['This article illustrates how XML for Visio can be used to extract Visio data for use in solution development, data analysis, text localization, Web publication, and database interoperability.'] "This article describes a new file format, XML for Visio, for native data in Microsoft Visio 2002. Extensible Markup Language (XML) is a tagged data format that is platform independent, vendor neutral, standardized by the World Wide Web Consortium (W3C), and widely available. XML is actually a metalanguage that forms the basis for other languages or vocabularies. Combined with W3C open standards and the ability to provide its own data definitions, XML is an enabling technology that provides the syntax for the expression of rich open data formats. The standard provides language and character set neutrality, unambiguous rules for white space, escape characters, extensibility, mixing of data models, and other syntactic details. Visio 2002 has defined an XML vocabulary (schema) that expresses all the Visio drawing, template and stencil data in its internal model. The XML extension rules allow users to attach and maintain custom data to a drawing. The familiarity of XML and the availability of standard tools give applications access to the Visio model without requiring the full Visio application. Open standards such as XML expand the opportunities and means of sharing and exchanging data. The XML for Visio scenarios in this article highlight some of the potential uses for this new format. If you are familiar with Visio but new to XML, the following summary will help you understand how to use the new XML for Visio format to its best advantage... XML for Visio Format: All types of Visio documents (drawings, stencils, and templates) can be saved in the XML for Visio format. 
Visio provides tag definitions for its document data in the XML for Visio schema, a separate document that lists the tags and their containment relationships. The schema generally follows the Visio object model and has predefined places for customized tags, which solution providers can use for preserving custom data. Solution providers can extract any XML data from the Visio documents for external processing by using existing XML tools, and then modify that data or create new drawings to display the results. Solution providers can extract customized shape definitions from the Masters section of the XML for Visio tag hierarchy. These shapes can then be shared, modified, or included in their custom solutions. Solution providers may be able to convert drawings to and from other drawing file formats using the XML for Visio schema, and by using XML as an import/export file format." See also (1) "Visio 2002 Incorporates XML Support with XML for Visio Format", (2) the Visio Developer Center, and (3) Visio Schema XDR, available for download.

  • [June 06, 2001] "Securing XML Documents with Author-X." By Elisa Bertino, Silvana Castano, and Elena Ferrari. In IEEE Internet Computing Volume 5, Number 3 (May/June, 2001). ['This Java-based access-control system supports secure administration of XML documents at varying levels of granularity.'] "The widespread adoption of XML for Web-based information exchange is laying a foundation for flexible granularity in information retrieval. XML can 'tag' semantic elements, which can then be directly and independently retrieved through XML query languages. Further, XML can define application-specific document types through the use of document type definitions (DTDs). Such granularity requires mechanisms to control access at varying levels within documents. In some cases, a single-access control policy may apply to a set of documents; in other cases, different policies may apply to fine-grained portions of the same document. Many other intermediate situations also arise... The typical three-tier architecture for accessing an XML document set over the Web consists of a Web client, network servers, and the back-end information system with a suite of data sources. In this framework, public-key infrastructures (PKIs) represent an important development for addressing security concerns such as user authentication. But such facilities do not provide mechanisms for access control to document contents nor for their release and distribution. Author-X is a Java-based system, developed at the University of Milan's Department of Information Science, to address the security issues of access control and policy design for XML documents. Author-X supports the specification of policies at varying granularity levels and the specification of user credentials as a way to enforce access control. 
Access control is available according to both push and pull document distribution policies, and document updates are distributed through a combination of hash functions and digital signature techniques. The Author-X approach to distributed updates allows a user to verify a document's integrity without contacting the document server. In this article, we will first illustrate the distinguishing features of credential-based security policies in Author-X, then examine the system's architecture, and conclude with details about its access-control and administration engines... In general, security policies state who can access enterprise data and under which modalities. Once policies are stated, they are implemented by an access-control mechanism. In Author-X, security policies for XML documents have the following distinguishing features: They can be set-oriented or instance-oriented, reflecting support for both DTD- and document-level protection. They can be positive or negative at different granularity levels, enforcing differentiated protection of XML documents and DTDs. They include options for controlled propagation of access rights, whereby a policy defined for a document or DTD can be applied to other semantically related documents and DTDs (or portions of them). They reflect user profiles through credential-based qualifications. Author-X security policies are implemented through six basic components: User Credentials, Protection Objects, Access Modes, Signs, Propagation Options, and Policy Base... The Web community generally regards XML as the most important standardization tool for information exchange and interoperability, and we believe that XML access control will constitute the core security mechanism of Web-based enterprise architectures. The current Author-X prototype is built on top of the eXcelon XML server and supports browsing and updating of DTD-based XML sources. 
We plan to extend protection toward secure access to Web pages and compliance with XML schemas. Additionally, we will experiment with incorporating Author-X within Web-based enterprise information system architectures by focusing on performance issues. In particular, we will study XML-based solutions for certifying user credentials as well as access-control schemes and architectures for securely disseminating information. We have proposed a preliminary set of XML-based access-control schemes for distributed architectures, and we are working on a prototype extending the Author-X functionalities accordingly."
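
The policy features described above (credential-based qualification, positive and negative signs, and optional propagation to nested portions of a document) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Author-X's actual engine: the Policy fields, the path syntax, and the conflict rules (the most specific policy wins, a denial beats a grant at equal specificity, and access is denied by default) are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    credential: str  # credential type the subject must hold
    path: str        # protected object, as a slash-separated element path
    mode: str        # access mode, e.g. "read"
    sign: str        # "+" grants access, "-" denies it
    propagate: bool  # whether the policy also covers descendants of path


def applies(policy, path):
    """A policy covers its own path and, if it propagates, anything below it."""
    if path == policy.path:
        return True
    return policy.propagate and path.startswith(policy.path + "/")


def is_allowed(policies, credentials, path, mode):
    """Decide access under the assumed rules: most specific policy wins,
    a denial beats a grant at equal specificity, and the default is deny."""
    matching = [
        p for p in policies
        if p.credential in credentials and p.mode == mode and applies(p, path)
    ]
    if not matching:
        return False
    # Every matching policy path is a prefix of the target path,
    # so the longest one is the most specific.
    deepest = max(len(p.path) for p in matching)
    return all(p.sign == "+" for p in matching if len(p.path) == deepest)


if __name__ == "__main__":
    rules = [
        Policy("doctor", "/record", "read", "+", True),
        Policy("doctor", "/record/billing", "read", "-", True),
    ]
    print(is_allowed(rules, {"doctor"}, "/record/diagnosis", "read"))
    print(is_allowed(rules, {"doctor"}, "/record/billing/total", "read"))
```

In this sketch a broad positive policy on a whole record is overridden by a narrower negative policy on one portion of it, which is one simple way to obtain differentiated protection at different granularity levels.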

  • [June 06, 2001] "JXTA: A Network Programming Environment. [Industry Report.]" By Li Gong (Sun Microsystems). In IEEE Internet Computing Volume 5, Number 3 (May/June, 2001). "JXTA technology is a network programming and computing platform that is designed to solve a number of problems in modern distributed computing, especially in the area broadly referred to as peer-to-peer computing, or peer-to-peer networking, or simply P2P... JXTA technology is designed to provide a layer on top of which services and applications are built. We designed this layer to be thin and small, while still offering powerful primitives for use by the services and applications. We envision this layer to stay thin and small as this is the best approach both to maintaining interoperability among competitive offerings from various P2P contributors... In theory, JXTA can be independent of any format used to encode advertisement documents and messages. In practice, it uses XML as the encoding format, mainly for its convenience in parsing and for its extensibility. Three points worth noting about the use of XML: If the world decides to abandon XML tomorrow and uses YML instead, JXTA can be simply redefined and recoded to use the YML format. The use of XML does not imply that all peer nodes must be able to parse and create XML documents. For example, a cell phone with limited resources can be programmed to recognize and create certain canned XML messages, and still participate in a network of peers. To keep version 1.0 small, we used a light-weight XML parser that supports a subset of XML. We are working toward normalizing this subset according to an existing effort called MicroXML. JXTA provides a network-programming platform specifically designed to be the foundation for peer-to-peer systems. As a set of protocols, the technology stays away from APIs and remains independent of programming languages. 
This means that heterogeneous devices with completely different software stacks can interoperate through JXTA protocols. JXTA technology is also independent of transport protocols. It can be implemented on top of TCP/IP, HTTP, Bluetooth, Home-PNA, and many other protocols. We have developed a JXTA Shell, similar to the Unix shell, for writing scripts. Like the Unix shell, the JXTA Shell helps users learn a lot about the inner workings of JXTA during the process of writing scripts..."

  • [June 06, 2001] "Java Vendors Need to Broaden Standards Support." By Mitch Wagner. In InternetWeek (June 4, 2001). "While vendors have been doing a good job using standard interfaces to build new, Java-based e-business applications, they need to go further to be sure that the applications interact with legacy enterprise software, said analysts and users. Leading Java software vendors such as Sun Microsystems, IBM, Hewlett-Packard, Oracle and BEA Systems are using Java 2 Enterprise Edition (J2EE) and associated standards to build new applications with browser-based front ends, and connect those applications to back-end legacy systems, said Nick Gall, an analyst with Meta Group. But the result is often a 'stovepipe' application that can't interoperate with other applications, he said. While vendors are supporting standards in building application servers, they need to commit to taking the next step and supporting standards in their Enterprise Application Integration (EAI) platforms. Right now, companies like IBM with WebSphere and BEA with WebLogic support many EAI standards, but also compete with each other by offering incompatible, proprietary technologies in areas such as workflow and messaging, Gall said... BEA, Hewlett-Packard, Oracle and Sun responded to the call for standards by extending their Java middleware to incorporate new standardized APIs. The new versions of the applications were introduced at the Sun JavaOne conference, the annual gathering of Java developers, in San Francisco this week. [week of 2001-06-04] BEA introduced WebLogic 6.1, which automatically binds Java 2 Enterprise Edition (J2EE) applications to Web services standards. Developers write applications as Enterprise Java Beans (EJBs), and WebLogic Server 6.1 automatically adds the appropriate Simple Object Access Protocol (SOAP) and Web Services Description Language (WSDL) interfaces to allow those applications to be invoked as services over the Web. 
BEA is also introducing several components in the WebLogic Integration 2.0 server. The server supports Java Connector Architecture (JCA), a standard interface for invoking enterprise applications over the Internet. WebLogic Integration also includes translators for EDI to XML, to allow developers to use EDI software as a service. And the WebLogic Integration server includes an extensible framework for XML protocol support, to allow developers to plug in modules for various XML protocols, including BizTalk, RosettaNet, ebXML. The server also includes a business-process integration module, to allow developers to create business processes using a Visio-style interface, and then expose those business processes to operate as services over the Internet. Hewlett-Packard shipped Total-e-Server Version 7.3, the HP application server. The new version adds JCA support to connect the Web server with enterprise software. The company also plans to introduce the HP Internet Server, which HP said it considers to be either a high-end Web server or a low-end application server, depending on how you look at it. The server runs HTTP, JSP and Java Servlets, which are server-side Java programs..."

  • [June 06, 2001] "Securing Web Services using the Java Platform and XML." By Andrew Brown, Loren Hart, and Monica Pawlan. From Java Developer Connection. June 15, 2001. "In today's fast-moving world of e-commerce and information technology, savvy companies realize that to stay competitive they have to make their products and services available over the Internet. Application-to-application cooperation and communication where one company needs the products or services of another to conduct business is at the core of Web-based business-to-business communications. To enable smooth, reliable, secure, and standardized cooperation and communication, more and more companies are taking advantage of Web services. Initiatives like the Universal Description, Discovery, and Integration (UDDI) specification define ways to discover and integrate Web-based services from all over the world. Sun Microsystems with its new Web services strategy is no exception especially given that its platform-independent and versatile Java technology is ideal for developing Web services... a Web service might be made up of companies (providers) in the same business sector who create software standards for setting up services to buy and sell parts. A Web service architecture is made up of providers who publish the availability of their services; brokers who register and categorize provider services and make search engines available; and requesters who use brokers to find a provider service. Web service providers need a communication standard and a way to verify the identity of companies and individuals with whom they are doing business. Extensible Markup Language (XML) has become the communication standard and Public Key Infrastructure (PKI) the verification standard. This article describes an example scenario where companies cooperate and communicate over the Internet to buy and sell parts. 
It also presents an example program written in the Java programming language that uses VeriSign's Trust Web service, an implementation of the XML Key Management Specification, to do cryptographic key management over the Internet using XML messaging... The XKMS specification is open, which means any company can implement an XKMS service and count on full interoperability. To encourage developers to begin using these new Web services, VeriSign has sponsored a site devoted to XML Trust Services, called the XML Trust Center, where developers can find the Java implementation of the XKMS client API. The XKMS client API includes an implementation of the XML Digital Signature specification, which provides API packages for digitally signing XML documents. An application can use Java APIs to generate cryptographic key pairs and use XKMS APIs to register those keys with an XKMS service. Public-private key pairs are registered with an XKMS service by sending the proper information about the keys in an XKMS XML message. This combination of APIs and services lets applications offload all key management operations, including key revocation in the event a key is compromised, and key recovery in the event a key is lost..." See: "XML Key Management Specification (XKMS)."

  • [June 06, 2001] "Souping Up Wireless." By Anne Chen. In ZDNet Ecommerce (June 3, 2001). "Sometimes Cindy Groner must feel like she's swimming in circles in a sea of acronyms. As director of mobile traveler services at Sabre Inc., Groner oversees a team of wireless developers who, in the process of coming up with new wireless services for Sabre's customers, must create multiple versions of each page. One in the Wireless Markup Language format for Wireless Application Protocol devices. Another in Handheld Device Markup Language for devices using the Openwave Systems Inc. browser. And another in Compact HTML for Nippon Telegraph and Telephone Corp.'s i-mode devices. Every time content or wireless services change, developers must test the application on multiple devices to make sure the experience is the same on all platforms, a strenuous process. . . Groner said she believes help is on the way. And it doesn't even matter that it's coming in the form of yet another acronym: XHTML. It stands for Extensible HTML, and it's a rapidly emerging standard that could soon allow e-businesses such as Sabre to write online applications just once and deliver them across multiple platforms -- whether wireless or PC-based. Having spent millions of dollars developing and deploying wireless applications for multiple devices -- cell phones, PDAs (personal digital assistants) and televisions -- companies such as Sabre, IBM and Edmunds.com Inc. now are already planning to make XHTML a critical component of their wireless e-business strategies by pilot testing the new standard even before it's supported on mobile devices or embedded into networks... XHTML solves the problem by being more modular and structured than its predecessor. Essentially a marriage between HTML and XML (Extensible Markup Language), XHTML is able to ensure that only code suitable for smaller browsers is transmitted to wireless devices, something HTML is unable to do. 
Because its mobile subset, XHTML Basic, is essentially the same language designed to deliver Web content to devices ranging from mobile phones and PDAs to pagers and television-based browsers, it lets programmers write content for PCs and mobile devices at the same time without conversion. XHTML Basic, which was recommended as a standard by the W3C last December, includes features from many existing wireless protocols, enabling developers to take advantage of the larger color screens and greater graphics capabilities of new devices designed for networks that run at higher speeds. The wireless industry's big guns, including the WAP Forum and Nippon Telegraph and Telephone, of Tokyo, have already announced that they will support XHTML as the standard for next-generation browsers and other mobile devices (see story below). Handset manufacturers Nokia Corp., Motorola Inc. and Ericsson SpA, and mobile operators Vodafone Group plc., Orange SA and Telecom Italia SpA are also backing the standard and plan to develop products, content and services based on XHTML..." See: "XHTML and 'XML-Based' HTML Modules."

  • [June 06, 2001] "Services Battle to Heat Up." By Roberta Holland. In eWEEK (June 4, 2001). "The legal wranglings between Microsoft Corp. and Sun Microsystems Inc. may be over, but the competitive ones are not. This week, as Sun executives preach the Java gospel at the JavaOne conference, they will wage a battle on a new front: Web services. While the Sun ONE (Open Net Environment) strategy is seen as lagging behind Microsoft's .Net, Sun will prop up Java 2 Enterprise Edition as the best platform for Web services, trying to leverage J2EE's success in the enterprise. Sun also will outline its road map for incorporating additional Web services standards into J2EE. 'Certainly the competition of the next year or two is going to be in Web services, and, right now, it seems Microsoft is out in front,' said Peter Horan, CEO of DevX.com Inc., an online resource for developers. But Horan, in Palo Alto, Calif., said there is still tremendous momentum around Java the language, adding that better tools are necessary to keep Java adoption growing. 'Corporate developers and IT managers still believe the tool sets for C++ and Visual Basic are more fully developed,' he said. While Java tools have improved greatly since the language's early days, both Sun and Microsoft need to deliver tools to build Web services, developers said. Both companies have sought to enlist partners in the battle, including Bowstreet Inc., Genuity Inc. and i2 Technologies Inc. on the Sun side and eBay Inc., Fujitsu Software Corp. and ActiveState Tool Corp. for Microsoft. Chief among Sun's goals this week is to show how it is easier for developers to build Web services using Java. Among the announcements at the conference in San Francisco will be a bundle for developers of Java APIs for XML (Extensible Markup Language) parsing, packaging and routing. 
Sun will also unveil a new version of its Forte for Java tool, with native support for XML; SOAP (Simple Object Access Protocol); Universal Description, Discovery and Integration; and Web Services Description Language... Oracle Corp., of Redwood Shores, Calif., will unveil its Oracle9i application server, its first version to be certified J2EE-compliant, with performance upgrades, SOAP support and new caching technology. The middleware division of Hewlett-Packard Co., in Palo Alto, Calif., will release its implementation of Java Services Framework, a specification that describes how to assemble components into Java server applications, along with a new Internet server. WebGain Inc. plans to release upgrades to several of its Java tools, including the WebGain Studio suite..."

  • [June 05, 2001] "XSD for Visual Basic Developers." By Yasser Shohoud. From the DevXpert Web Services Depot [for VB Developers]. May 2001. "The W3C's XML Schema is sometimes referred to as XML Schema Definition language or XSD for short. XSD is an XML-based grammar for describing the structure of XML documents. A schema-aware validating parser, like MSXML 4.0, can validate an XML document against an XSD schema and report any discrepancies. To solve the [invalid invoice document] problem outlined above, you'd create an XSD schema that describes the invoice document. You'd then make this schema available to the UI tier developers. The schema is now part of the 'interface contract' between the middle tier and the UI. While the application is in development, the UI tier can validate the invoice documents that they send against that schema to ensure they are valid. Similarly, the SaveInvoice function can validate the input invoice document against the schema before attempting to process it. Now if you change the invoice document to support a new feature, you must change the schema accordingly. Now the UI team tries to validate the invoice documents they're sending and this validation fails so they immediately realize that the schema has changed and that they must change the invoice documents they are sending. This can also help catch version mismatch problems where you have an older client trying to talk to a newer middle tier or vice versa.... In this brief introduction to XSD, you've seen how you can map a Visual Basic class to an XSD schema and how to use that schema with MSXML 4.0 to validate documents. You also learned the relation between XSD and XML namespaces and how namespaces can be used to combine elements from different schemas in one XML document. This tutorial barely scratches the surface of what you can do with XSD schemas. There are many more features and details you might be interested in (or might not care about). 
Once you are comfortable with the concepts explained in this tutorial, check out the XML Schema Primer (part of the XSD specification) which goes into a lot more details about XSD with many examples..." See "XML Schemas."

  • [June 05, 2001] "Introduction to UDDI." From the DevXpert Web Services Depot [for VB Developers]. June 02, 2001. ['This article walks you through the basics of the Universal Description, Discovery, and Integration. It describes scenarios where UDDI would be useful and shows you how you can implement UDDI in those scenarios. Learn what UDDI is all about, how it works, and how you can program it.'] "One of the primary potential uses of Web services is for business-to-business integration. For example, company X might expose an invoicing Web service that the company's suppliers use to send electronic invoices. Similarly, a vendor V might expose a Web service for placing orders electronically. If company X wanted to purchase computer equipment electronically, it would need to search for all vendors who sell computer equipment electronically. To do this, company X needs a yellow pages-type directory of all businesses that expose Web services. This directory is called Universal Description, Discovery, and Integration or UDDI. UDDI is an industry effort started in September of 2000 by Ariba, IBM, Microsoft, and 33 other companies. Today, UDDI has over 200 community members. Like a typical yellow pages directory, UDDI provides a database of businesses searchable by the type of business. You typically search using business taxonomy such as the North American Industry Classification System (NAICS) or the Standard Industrial Classification (SIC). You could also search by business name or geographical location... If you write commercial business software, you should start thinking about leveraging UDDI to make it easy for your software users to publish their Web services and to find other Web services that they need. If you work inside a large organization with several divisions each busy building Web services, you should consider using an internal, UDDI-like, registry of Web services that are available within your organization. 
Whether for commercial or internal uses, you can program the UDDI APIs directly by sending and receiving SOAP messages. If you program in a COM-aware language you can use the Microsoft UDDI SDK, which handles all the SOAP and XML work and lets you program against a COM-based object model." See: "Universal Description, Discovery, and Integration (UDDI)."
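
The remark above about programming the UDDI APIs directly by sending and receiving SOAP messages can be illustrated with a minimal sketch that only builds such a request and does not send it. The function name `build_find_business` is hypothetical, and the `find_business` element, the `urn:uddi-org:api` namespace, and the `generic="1.0"` attribute are assumptions based on the UDDI version 1 API; check them against the specification for the registry version you target.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Assumed UDDI version 1 API namespace; verify against the UDDI specification.
UDDI_NS = "urn:uddi-org:api"


def build_find_business(name):
    """Build a SOAP request asking a UDDI registry for businesses by name.

    Hypothetical helper: it returns the XML text only and performs no
    network call.
    """
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The "generic" attribute (assumed value "1.0") names the API version.
    find = ET.SubElement(body, f"{{{UDDI_NS}}}find_business", {"generic": "1.0"})
    ET.SubElement(find, f"{{{UDDI_NS}}}name").text = name
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_find_business("Example Vendor")
```

The resulting string would be sent as the body of an HTTP POST to a registry's inquiry endpoint; the endpoint address and required HTTP headers are left out because they depend on the registry operator.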

  • [June 05, 2001] "Sun Redraws Java Blueprint Around Web Services." By Mark Leon, Ed Scannell, and Eugene Grygo. In InfoWorld (June 4, 2001). "In the latest move in its competition with IBM and Microsoft, Sun this week at its JavaOne developer conference will leave no room for doubt: The Web services race is on and Sun is in the running. As next-generation software development converges around XML-based standards, Sun this week will recast the Java 2 Enterprise Edition (J2EE) and its own products to coexist with Web services standards. Sun's competitors, notably BEA Systems and IBM, will also detail their plans to tie Java and Web services at the conference. Sun officials argue that the Web services concept -- Web-centric applications loosely coupled with XML -- breathes new life into Java development and its Sun ONE framework... In a bid to get more developers to build those applications with Java, Sun will make several announcements detailing new Web services support in its iPlanet Application Server products and Forte development tools. Developers will get access to a J2EE Service Pack designed to simplify the creation of XML-based services. 'You will be able to visually develop Enterprise JavaBean [EJB] components, assemble them into J2EE applications, and then automatically deploy them to the iPlanet Application Server,' said Sanjay Sarathy, director of product marketing for the Application Server Group at Sun. The Service Pack will be available soon for developers to download and will become part of J2EE with the Version 1.4 release sometime next year... Sun also will seek to make its Forte for Java development environment more attractive to less technically savvy developers. One feature, called Java Web Services Designer, will allow Web developers who work with Macromedia's Dreamweaver or Adobe's GoLive products to access an XML services-based registry... 
Also included in Forte for Java release 3.0, due this summer, is a set of wizards that will automatically bind Java and XML so that developers can more easily create Web services and publish them in a registry." See the announcements: (1) "Industry Effort to Define Native Web Services Support in J2EE. Industry Leaders Band Together Using Java Community Process"; (2) "Web Services Pack to Simplify Building Java-Based Web Services. Major Vendors to Integrate Open Technologies in Java Web Services Tools."

  • [June 05, 2001] "Gates launches Office XP." By Jennifer DiSabatino. In ComputerWorld (May 31, 2001). "With much fanfare, including rock music and flashing lights, Microsoft Corp. Chairman and Chief Software Architect Bill Gates today officially launched the latest version of his company's ubiquitous Office software known as Office XP. Gates was in full marketing mode as he led an hour and a half of Office XP feature demos and testimonials... Gates also touted XML as an integral part of Office XP. 'We're designing all our software products from the ground up around XML,' he said. 'Office XP is the first version of Office that supports XML.... It's our view that XML is going to unlock a lot of business processes that have been paperbound and bring them onto the network, and we need to use the standard Office interface as the way that people can navigate that information'... There's enough benefit to XP to skip Office 2000, Silver said, adding that XP is more like an upgrade of 2000 anyway. Users should still take precautions, he said, by testing the software to make sure it's stable in a given environment and then deploying from there. Currently, about 245 million people worldwide use Office products, according to David Bennie, Microsoft group manager for Office/Exchange and product marketing. In addition to the features outlined last week for the new version of Outlook, the e-mail software that comes bundled in the Office software, Microsoft has also added smart tags, which link content in Word documents to Web sites. There is also better version control on Word documents, with revisions color-coded and placed in the margins automatically when the author merges different versions in the main document. Office XP is also tightly integrated with the Share Point Portal system, a knowledge management tool and collaboration application..." 
See the announcement: "Gates Demonstrates at Office XP Launch How Office XP Unlocks Hidden Knowledge And Unleashes Next Wave of Productivity Gains. Ford, Amazon.com, UPS and LexisNexis Show How Office XP Dramatically Improves Personal and Business Productivity."

  • [June 05, 2001] "DIDL: Packaging Digital Content." By Vaughn Iverson, Todd Schwartz, and Mark Walker. From XML.com. May 30, 2001. ['Internet applications generally fall short in their ability to transfer multimedia content. This article describes an XML vocabulary for packaging digital content, breaking the one-to-one mapping between the notion of a content item and an individual file.'] "In this article we detail the reasons for undertaking the development of a digital packaging standard and describe in depth a package manifest scheme that potentially addresses the enumerated needs. In doing so, we show how such a scheme effectively disassociates the notion of content item from individual files. We conclude by describing an XML vocabulary, the Digital Item Declaration Language (DIDL), a recently released first working draft from ISO/MPEG that will, when completed, provide standard means for packaging digital content... Today's popular Internet applications generally fall short in their ability to transfer raw resource content. The content of a web page for example may be defined as the collection of discrete resources -- bitmaps, JPEG images, text blocks, and so on -- that are aggregated within some predetermined format. The components of the web page may possess attributes and relationships that, while not explicitly part of the final, viewable form, may be critical in generating the displayed result. Information accompanying a JPEG image, for example, could be utilized in creating a photo caption. Information about the relationships among a group of images could be utilized in locating the images on the page. If the web page is generated from a script, information on the sizes of the various images could be utilized to decide which images to begin downloading first... Internet-transacted digital content is a reality, but the lack of standards makes it very difficult for non-technical users to obtain and transmit content. 
Content that is transacted generally is not interoperable across platforms and is still tightly bound to the directory/file paradigm which greatly limits its flexibility. The MPEG-21 Digital Item Declaration Language addresses these and related problems by providing a relatively simple, standard method for describing complex, multicomponent content source collections." [Note: see the MPEG-21 Overview and related XML design work in the MPEG-7 (Multimedia Content Description Interface) activity of the Moving Picture Experts Group under ISO/IEC JTC1/SC29/WG11. The MPEG-7 is a 'content representation standard for multimedia information search, filtering, management and processing'; the WG has produced the "Description Definition Language (DDL)" as an XML-based specification for multimedia metadata.] See (1) "Moving Picture Experts Group: MPEG-7 Standard," and (2) "MPEG-21 Part 2: Digital Item Declaration Language (DIDL)."

  • [June 05, 2001] "The State of XML: Why Individuals Matter." By Edd Dumbill. From XML.com. May 30, 2001. ['A survey of the progress of XML over the last year, emphasizing that in an industry increasingly dominated by large vendors, individual contributors are still key.'] "This article is adapted from the closing keynote speech I [Edd Dumbill] delivered at XML Europe 2001 in Berlin, May 2001. I describe the progress of XML over the last year, emphasizing that in an industry increasingly dominated by large vendors, individual contributors are still key. XML has a tendency to spark new beginnings. Many existing technologies are being re-engineered to take advantage of XML, gaining interoperability benefits previously too costly to realize; industries are finding that XML vocabularies can form a basis for collaboration and cost-cutting, where such cooperation was previously thought counterproductive. XML's influence is proving disruptive to the technological status quo. For better or for worse, many parts of today's computing infrastructure are being re-examined in the light of XML. For better, in that the benefit to be gained from interoperability at the syntax level is large. For worse, in that lessons from the past are being overlooked; however, not learning from history is too broad a charge to lay on the shoulders of overzealous XML developers alone... The progress of adoption and change wrought by XML has accelerated over the last year, but with it comes certain dangers. XML must not be allowed to become so complex that it defeats the point of its original creation and unacceptably raises the level of financial and technological resource needed to use it. A growing reliance on vendor products also runs the risk of creating an identifiable market growth area, which, when it inevitably hits a decline, could take a chunk of XML as a technology down with it. 
Because of these dangers, the role of individual contributors in the XML community (whether affiliated with a company or not) is more important than ever. They remain among the most creative and influential participants in the development of XML."

  • [June 05, 2001] "XML-Deviant: Schema Scuffles and Namespace Pains." By Edd Dumbill. From XML.com. May 30, 2001. ['W3C XML Schema is complete. End of story? No way! Debates over Schema best practice have dominated XML-DEV over recent weeks.'] "...Kohsuke Kawaguchi posted a reference to an article, XML Schema Dos and Don'ts, which gives his best practice for keeping XML Schemas simple... In response to a message that implied "co-constraints" will be introduced as a feature in XML Schema 1.1 or 2.0, Rick Jelliffe seemed doubtful such functionality would be in XML Schema 1.1. Co-constraints are constraints on instances where the permissible values of one element depend on the value of a different element..."
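
The definition of a co-constraint quoted above can be made concrete with a small check written outside any schema language. This is only a sketch under stated assumptions: the `address` document shape, the `country` and `postalCode` element names, the `POSTAL_PATTERNS` table, and the function name are all hypothetical and are not taken from the article or from XML Schema.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule table: the permitted postalCode pattern depends on
# the value of the sibling country element.
POSTAL_PATTERNS = {
    "US": re.compile(r"\d{5}"),
    "CA": re.compile(r"[A-Z]\d[A-Z] \d[A-Z]\d"),
}


def satisfies_co_constraint(xml_text):
    """Return True if postalCode is valid for the given country value."""
    root = ET.fromstring(xml_text)
    country = root.findtext("country", default="")
    postal = root.findtext("postalCode", default="")
    pattern = POSTAL_PATTERNS.get(country)
    # Countries without a rule are rejected in this sketch.
    return pattern is not None and pattern.fullmatch(postal) is not None


us_ok = satisfies_co_constraint(
    "<address><country>US</country><postalCode>12345</postalCode></address>"
)
```

Neither element can be validated in isolation here: "12345" is acceptable only because `country` is "US", which is the kind of dependency the XML-DEV discussion was concerned with.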

  • [June 05, 2001] "Xalan. Sun gives translets technology to Apache XML Project. Size and speed seen as major benefits." By Natalie Walker Whitlock (Casaflora Communications). From IBM developerWorks. May 31, 2001. "Sun Microsystems announced that it has donated its proprietary XSLT compiler technology to the open-source Apache XML Project. Part of the Sun XSLT Compiler -- commonly referred to as 'translets' -- will be made available to the nonprofit Apache organization to be incorporated into the Xalan XSLT engine. This technology attracted the interest of many Java/XML developers who learned about it through technical conference discussions and mailing-list exchanges... Typically, the XSLT process involves three parts: an XML file, an XSLT style sheet that describes and directs the transformation, and an XSLT engine that takes both files as inputs and produces the desired transformed output. These traditional XML transformation engines tend to be large and complex programs. The Sun XSLT Compiler takes a novel approach to XSLT processing. With the XSLT Compiler, the transformation is simplified into two pieces. According to David Hofert, Sun XML Technology Development Group leader, the primary step takes an XSLT style sheet as input and produces a Java class as the output. This compiling step takes the style sheet and creates a Java binary class file as an output -- known as a translet. The second step is to apply the translet to any XML files relevant to the style sheet. In addition to boasting small size, sample translets have performed three to ten times better than James Clark's XT transformation engine, according to Sun's testing. Sun attributes the compiler's increased speed to the unique internal representation of the Document Object Model (DOM) used by the translet, and to the fact that the translet is created by writing directly to Java assembler code, which is converted directly into Java byte code..."

  • [June 05, 2001] "What's the 'diff'? Some suggestions for comparing semantic equivalency of XML documents." By Brett McLaughlin (Enhydra Strategist, Lutris Technologies). From IBM developerWorks. May 2001. ['How can you tell whether two XML documents are equivalent? Brett McLaughlin explains why answering this common question is more than a trivial task. The explanation shows how to go about comparing XML documents, including how to deal with significant and ignorable whitespace and external entity references. Code samples include DTDs and SAX EntityResolver examples. This article assumes a basic knowledge of XML and a conceptual understanding of SAX.'] "Recently I went about trying to answer a simple question about how to compare XML documents to find out whether they're the same. The answer is not so simple, because it enters the shadowy realm of semantic equivalence... when comparing XML, you're going to want to formulate DTDs that constrain the documents you're comparing as closely as possible. In particular, if an element can contain only other elements, be sure to indicate that in the DTD. That precision will assure that any whitespace in your documents is ignored when working with APIs like SAX, DOM, and JDOM... [Summary:] Now you ought to have a solid understanding of what it means to say that two XML documents are 'the same.' You know why simple programs like diff simply are not enough for comparing XML documents. I hope that you can use some of the code shown here to begin to isolate comparison points in XML documents so that you can more easily perform XML comparisons..."
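
The point that a textual diff is not enough for comparing XML can be illustrated with a minimal sketch. It is not the article's SAX and DTD approach: it uses Python's standard `xml.etree.ElementTree`, and it simply assumes that all leading and trailing whitespace in text is ignorable, where the article relies on a DTD to decide that. The function names `normalize` and `xml_equivalent` are hypothetical.

```python
import xml.etree.ElementTree as ET


def normalize(elem):
    """Reduce an element to a nested tuple that can be compared with ==."""
    # Assumption: leading and trailing whitespace in text is not significant.
    text = (elem.text or "").strip()
    tail = (elem.tail or "").strip()
    # Attribute order carries no meaning in XML, so sort the attributes.
    attrs = tuple(sorted(elem.attrib.items()))
    # Child order is kept, because element order is significant.
    children = tuple(normalize(child) for child in elem)
    return (elem.tag, attrs, text, tail, children)


def xml_equivalent(doc_a, doc_b):
    """Compare two XML strings while ignoring layout differences."""
    return normalize(ET.fromstring(doc_a)) == normalize(ET.fromstring(doc_b))


compact = "<invoice id='1' cur='USD'><item>pen</item></invoice>"
pretty = '<invoice cur="USD" id="1">\n  <item>pen</item>\n</invoice>'
same = xml_equivalent(compact, pretty)
```

A line-based diff would report `compact` and `pretty` as different because of quoting, attribute order, and indentation, while the structural comparison treats them as the same document. External entity references, which the article also discusses, are not handled in this sketch.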

  • [June 05, 2001] "Translating XML Schema." By Timothy Dyck. In eWEEK (May 28, 2001). "Earlier this month at the Tenth International World Wide Web Conference in Hong Kong, XML took its biggest step forward since the document format was first standardized in February 1998. At the conference, the World Wide Web Consortium released XML Schema as a W3C Recommendation, finalizing efforts that started in 1998 to define a standard way of describing Extensible Markup Language document structures and adding data types to XML data fields. Now that it is finally out, the long-delayed XML Schema standard will catalyze the next big step in XML -- allowing cross-organizational XML document exchange and verification. Just as discovery of the Rosetta stone in 1799 provided a way to fix the meaning of Egyptian hieroglyphs so they could be understood across the gulf of two millennia, XML Schema provides a way for organizations to fix the meaning of XML documents so they can be understood across the gulf of organizational boundaries and otherwise incompatible IT architectures. As a result, XML Schema will be a cornerstone in the new e-commerce architecture that we are collectively building and will be a vital component for making business exchanges and other loose associations of trading partners possible. The arrival of XML Schema, more than three years after XML itself, has left many chafing at the bit (and others, such as Microsoft Corp., running off in their own direction implementing and shipping products based on prestandard efforts), and the market is now more than ready for this standard to take hold. However, XML Schema's long development cycle gave vendors time to understand the specification and start writing compliant software, and we are now seeing the rapid release of XML Schema-compliant (or soon-to-be-compliant) authoring tools and servers... 
That long, committee-driven development cycle also resulted in a specification that has a bit of everything in it, and fully compliant XML Schema parsers will have to be complex pieces of software to support all the options the specification allows. Fortunately, XML Schema documents have to reference only the functionality they need, and the more complex options in XML Schema, such as null elements and explicit types, may just fade away through disuse. The W3C recently published a recommendation on how to group Extensible HTML, the consortium's replacement for HTML, into well-defined subgroups so XHTML browsers (such as those in cellular phones) can clearly define which parts of the language they support and which they don't. Something similar is a possibility for XML Schema if the full specification proves too difficult to implement for some vendors (although large players such as IBM, Microsoft and Oracle Corp. are moving ahead full speed with plans to support the full specification as published). Over the next few years, eWeek Labs predicts XML Schema will become integral to the way that many companies exchange information..." For schema description and references, see "XML Schemas."

  • [June 05, 2001] "[W3C XML Schema] Speedy Adoption Expected." By Jim Rapoza. In eWEEK (May 28, 2001). "When XML was introduced, although there were early adopters, it still took about a year before Extensible Markup Language began to be regularly used in enterprise-level applications and deployments. Now that XML Schema is a standard, the waiting period for its adoption should be much shorter. Part of this can be attributed to how long businesses have been waiting for this schema. Many have been working on tools and compatibility issues while the standard was under development. However, it is also due in part to the complexity of the schema. Whereas the initial XML standard could be easily built and managed by anyone with an editor, many vendors plan to provide new tools to help shield users from the size and complexity of XSD (XML Schema Definition). Given the importance of XML Schema for handling data-driven communications among businesses, eWeek Labs recommends that developers begin evaluating tools that will help them move to XSD. In addition, companies should find out what their enterprise software vendors' plans are for supporting and integrating with XML Schema. As is true of most standards, many of the initial sets of XML Schema tools are essentially validators that help developers stay within the standard. Several are from individual World Wide Web Consortium members and universities, but some are also available from vendors such as IBM, and Java-based validators are available from Sun Microsystems Inc... Another important set of tools for businesses moving to XML Schema are conversion tools, which will help developers convert content to the new standard. Probably the most important will be tools for converting standard XML DTDs (Document Type Definitions) to XSD, although some of those currently available have not been updated to the final standard. 
There are also tools for converting files from other schema languages, including a tool from Microsoft Corp. for converting files from XML Data Reduced to XSD... Microsoft recently released betas of MSXML and SQLXML that support the schema and has said that most of its products will support XSD in their next versions. Sun has released a new XML data types library that supports the final XML Schema standard, and Tibco Software Inc. includes tools for validating documents using XSD..."

  • [June 05, 2001] "Other XML Standards Get Ready to Roll." By Jim Rapoza. In eWEEK (May 28, 2001). "As XML has progressed down the technological road since its introduction in 1996, it has steadily gained momentum, to the point where most other World Wide Web Consortium standards are now based on Extensible Markup Language. But for the last two years, the giant, wide-body truck that has slowed its progress has been the development of XML Schema as a standard. Now that the W3C has finally gotten XML Schema into gear, what's next for XML? eWeek Labs believes several core XML technologies will probably become standards (or Recommendations, as the W3C calls them) this year and, for the most part, all will help improve the interoperability of XML-based data and applications. Also, not surprisingly, most of these related technologies were initially proposed around the same time as XML Schema. The XML Information Set, which is expected to reach recommendation status next month, will provide a common reference set for defining abstract objects such as elements within a document. The main goal here isn't to provide a definitive set of definitions but to provide a base that will improve interoperability among XML tools and applications. Later this year, several technologies pertaining to XML linking -- XLink, XBase and XPointer -- should become standards or reach candidate status. All these technologies deal with hyperlinking within XML documents, in a manner similar to the way Uniform Resource Identifiers work. All three will enable a much more complex and multilayered linking than what is currently possible in HTML and XML. Whereas the other technologies listed here have been around for almost two years, XML Query was introduced just this year and is probably at least a year away from becoming a standard..."

  • [June 05, 2001] "Using Schema and Serialization to Leverage Business Logic." By Eric Schmidt. From Microsoft MSDN Online. 'Extreme XML' Column. May 17, 2001. ['New columnist Eric Schmidt addresses how you can use schemas and serialization technology to leverage XML in your applications and services.'] "In this issue of Extreme XML, we are going to examine the importance of schema usage and the use of serialization technology to leverage XML in your applications and services. The majority of development tasks today revolve around developers taking existing infrastructure (business components, databases, queues, and so on) and morphing them into the next version of their product... The surge of XML usage over the past several years has not led to a complementary increase in defined data models for XML documents. For this section, I am referring to a data model for XML to be the structure, content, and semantics for XML documents. The one main reason for this slow growth in XML data models is the lack of, until now, a robust XML schema standard. Document Type Definitions (DTDs) have outgrown their usefulness in the enterprise space because of their focus on XML from a document perspective and not viewing XML document instances from a data and type perspective. Typed data items like addresses, line items, employees, orders, and so on have complex models and are the basis for most applications. Applications look at data from a strongly typed perspective. For example, a Line Item is an inherited member of an order and contains typed information like product price, which is of type currency. The majority of this type of modeling cannot be accomplished with DTDs. Due to the simple structuring and typing mechanisms in DTDs, numerous XML validation, structuring, and typing systems have been created, including Document Content Description (DCD), SOX, Schematron, RELAX and XML-Data Reduced (XDR). 
The latter, XDR, has gained much momentum in the Windows and B2B based communities due to its usage in products like SQL Server, BizTalk Server, and MSXML. In addition, most independent software vendors (ISVs) and B2B integrators support XDR because of its data typing support, namespace support, and its XML-based language. However, XDR's usefulness still falls short of providing a truly extensible modeling and typing system for complex data structures. This was a known issue at the time of XDR's creation. Building on the lessons learned from previous schema implementations, the W3C XML Schema working group set out to create a specification (XML Schema) for defining the structure, content, and semantics of XML documents. Ultimately, this specification should provide an extensible environment so that it could be applied to any type of business or processing logic. During the development of this article, I was pleased to see that the W3C released XML Schema as a recommendation. This is a tremendous step in solidifying and stabilizing XML-based implementations that need to employ schema services. Next, we're going to look at the importance and power behind XML Schema... I have distilled five core items you need to know about XML Schema so you can get up and running: (1) XML Schema is represented in XML 1.0 syntax; this makes parsing XML Schema available to any XML 1.0-compliant parser, and thus can be used within a higher-level API like the DOM. (2) Data typing of simple content: XML Schema provides a specification for primitive data types (string, float, double, and so on) found in most common programming languages. (3) Typing of complex content: XML Schema provides the ability to define content models as types. 
(4) Distinction between the type definition and instance of that type: unlike XDR, XML Schema type definitions are independent of instance declarations; this makes it possible to reuse type definitions in different contexts to describe distinct nodes within the instance document. (5) W3C support and industry implementation... creating specific and lucid schema should be your first task when creating XML- and Web Service-enabled applications. If your partners need other schema definitions than XML Schema, for example DTD, start with an XML Schema approach and then port the implementation. You'll come out ahead in the long run." See also the sample code for the article. On XML schemas: "XML Schemas."
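
The fourth point in the entry above (type definitions that are independent of the element declarations using them) can be illustrated with a small sketch. The schema below is hypothetical and not taken from the article: `AddressType`, `order`, `shipTo` and `billTo` are invented names. Because a schema is itself XML 1.0 (the first point), an ordinary XML parser from the Python standard library is enough to read it and show that one named type is reused by two distinct elements. This sketch only inspects the schema; it does not validate instance documents.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: one named complex type reused by two element declarations.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="shipTo" type="AddressType"/>
        <xs:element name="billTo" type="AddressType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def elements_by_type(schema_text):
    """Map each referenced type name to the names of elements declared with it."""
    root = ET.fromstring(schema_text)
    usage = {}
    # iter() walks the schema in document order, nested declarations included.
    for elem in root.iter(f"{{{XSD_NS}}}element"):
        type_name = elem.get("type")
        if type_name is not None:  # skip elements with an anonymous inline type
            usage.setdefault(type_name, []).append(elem.get("name"))
    return usage


usage = elements_by_type(SCHEMA)
print(usage["AddressType"])
```

Here the single `AddressType` definition describes two different nodes of the instance document, which is the reuse the entry refers to.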

  • [June 05, 2001] "Web Team Talking: Out of Cache but Still Stylin'." By Mark Davis, Heidi Housten, Dan Mohr and Kusuma Vellanki. From Microsoft MSDN Online. June 4, 2001. ['This month the team serves up a new twist on the ever popular question of how to avoid caching, as well as some advice on using XSL to display XML data with different fields every time.'] "... We have a new twist on the ever popular question about how to avoid caching and an answer on using XSL to display XML data with different fields every time..." Covers: (1) a way in which to prevent Internet Explorer from putting a dynamically changed XML document in the cache; (2) displaying unknown XML data in a table; (3) pop-up window notifications; (4) XML object model center spread. See also "The Revised XML Object Model for Internet Explorer 5.0" ('an updated version for MSXML 3.0 is on its way to MSDN as we speak...')

  • [June 04, 2001] "A Triumph of Simplicity: James Clark on Markup Languages and XML. Markup Languages, the Standardization Process, and the Importance of Simplicity. [DDJ Interviews James Clark. Feature.]" By Eugene Eric Kim and James Clark. In Dr. Dobb's Journal Issue 326 (July 2001), pages 56-60. ['Whether you know it or not, James Clark has made your life easier by creating a number of open-source tools such as expat (an XML parser), groff (a GNU version of troff), TREX (an XML schema language), and more. Eugene Eric Kim talks to James about these tools, plus the state of XML.'] "If you peek under the hood of high-profile open-source projects such as Mozilla, Apache, Perl, and Python, you'll find a little program called 'expat' handling the XML parsing. If you've ever used the man command on your GNU/Linux distribution, then you've also used groff, the GNU version of the UNIX text formatting application, troff. If you've ever done any work with SGML, from generating documentation from DocBook to building your own SGML applications, you've undoubtedly come across sgmls, SP, and Jade. Whether you've heard of him or not (and most likely, you haven't), James Clark [pictured] has made your life easier. In addition to authoring these and other widely used open-source tools, Clark served as the technical lead of the original W3C XML Working Group and as the editor of the XSLT and XPath recommendations. He recently founded Thai Open Source Software Center. His latest project is TREX, an XML schema language. Clark sat down with Eugene Eric Kim to discuss markup languages, the standardization process, and the importance of simplicity... [The next step for XML?] JC: I think XML has become so widespread, it's like asking me, 'What's the next application for ASCII text? What's the next application for line-delimited files?' XML is becoming so common, it's not interesting anymore. 
One of the things that I was very inspired by in working with TREX was a project from the University of Pennsylvania called XDuce, which is an XML processing language. One thing that is interesting about XDuce is that it uses the type information from DTDs to actually type-check your program. Statically typed languages, like Java and C++, help you catch a lot of errors. But with XML processing at the moment, you use the DTD just to validate the file. You don't really use the type information after that. The fact that a document conforms to a DTD is not used by the typing system of the programming languages. I think one interesting direction is to try doing the kind of things that XDuce is doing, which is integrate the type system of your data, DTDs or schemas, into the type system of the programming language. You want them to all work together in a seamless way so that your compiler can catch a lot more errors when you write programs to process XML, so you can get more reliable programs..." Note: With the decision to merge RELAX Core and TREX under the name 'RELAX NG', we may assume that much of what Clark writes about TREX applies largely to RELAX NG as well. E.g., "...You can think of it as DTDs in XML syntax minus some things and plus some others. TREX just does validation. DTDs mush together both validation and interpretation of the documents, providing various things like entities and notations. Mushing them together is problematic because often you want one thing but not the other. My work with XML and SGML has convinced me that what you need is good separation between these different things. I wanted to remove from DTDs the things that augment the information in the XML document. And I wanted to add in some of the things that I think XML DTDs have always been missing. One of the things XML DTDs removed from SGML DTDs was AND groups, which allow you to have unordered content. The SGML AND groups had a bad reputation, and don't have quite the right semantics. 
TREX adds them back and tries to do them right. XML also radically simplified the kinds of mixed content that you're allowed because there's a problem with the way SGML does it. Instead of restricting it, TREX solves the problem..." See: "RELAX NG."

  • [June 02, 2001] "Bringing the Wireless Internet to Mobile Devices." By Subhasis Saha, Mark Jamtgaard, and John Villasenor. In IEEE Computer Volume 34, Number 6 (June 2001), pages 54-58. "Transcoding and Relational Markup Language are promising middleware solutions to the problem of bringing Internet content to the extremely diverse and dynamic mobile wireless devices universe... Mapping Internet content to mobile wireless devices requires new technologies, standards, and innovative solutions that minimize cost and maximize efficiency. The wireless Internet must deliver information in a suitable format to handheld device users -- regardless of location and connectivity. Although the exact form in which high-speed wireless data services will develop is uncertain, the authors predict an improvement over today's data rates. Current mobile devices suffer from small displays, limited memory, limited processing power, low battery power, and vulnerability to inherent wireless network transmission problems. To address these issues, a group of leading wireless and mobile communications companies have developed the wireless application protocol for transmitting wireless information and telephony services on mobile handheld devices. Whereas HTTP sends its data in text format, WAP uses Wireless Markup Language to create and deliver content in a compressed binary format that provides efficiency and security. Middleware, an alternative to manually replicating content, seamlessly translates a Web site's existing content to mobile devices that support operating systems, markup languages, microbrowsers, and protocols. The authors predict that middleware such as Relational Markup Language will be critical to bringing Internet content to wireless devices, and they anticipate that open standards based on this or similar techniques will gain acceptance..." See also "Relational Markup Language (RML)."

  • [June 02, 2001] "XML's Impact on Databases and Data Sharing." By Len Seligman and Arnon Rosenthal (of MITRE Corporation). In IEEE Computer Volume 34, Number 6 (June 2001), pages 59-67. [Research Feature.] "The Extensible Markup Language, HTML's likely successor for capturing Web content, has generated a lot of interest. Created by the World Wide Web Consortium to address HTML's limitations, XML resembles HTML's format but offers users a more extensible language. It lets information publishers invent their own tags for applications. Alternatively, they can work with organizations to define shared tag sets that promote interoperability and help separate content from presentation. While XML addresses content, Cascading Style Sheets, the Extensible Stylesheet Language, and Extensible HTML handle presentation separately. XML also supports data validation. XML's advantages over HTML include support for multiple views of the same content for different user groups and media; selective, field-sensitive queries over the Internet and intranets; a visible semantic structure for Web information; and a standard data and document interchange infrastructure. Using XML and related tools often eliminates problems associated with heterogeneous data structures. Like any new technology, XML has generated exaggerated claims. It does not come close to eliminating the need for database management systems or solving large organizations' data-sharing problems. Although XML hype has raised unrealistic expectations, the language does reduce the data-sharing obstacles among diverse applications and databases by providing a common format for expressing data structure and content... Some industry observers have heralded XML as the solution to data-sharing problems -- for example, one observer asserted that XML together with XSL will bring 'complete interoperability of both content and style across applications and platforms.' 
In reality, XML technologies will contribute only indirectly to meeting many of the toughest data-sharing challenges. Architectures: Users want seamless access to all relevant information about their domain's real-world objects. Several general architectures and hybrids are available for this purpose... Regardless of the distributed architecture chosen, someone -- a standard setter, application programmer, or warehouse builder -- must reconcile the differences between data sources and the consumer's view of that data so users can share it. This reconciliation must insulate applications from several forms of diversity. The insulation mechanisms also provide an interface for programmers to look beneath and see the diversity. XML's contributions to data sharing [include]: (1) Level 1: Geographic distribution. Data can be widely distributed geographically. Off-the-shelf middleware products handle most of the challenges at this level, often supporting standard protocols such as HTTP, the simple object access protocol (SOAP), or the common object request broker architecture. XML assists with remote function invocation. (2) Level 2: Heterogeneous data structures and languages. Diversity here includes different data-structuring primitives -- such as tables versus objects -- and data manipulation languages -- such as SQL versus a proprietary language versus file systems with no query language. XML provides a neutral syntax for describing graph-structured data as nested, tagged elements with links. Because developers can transform diverse data structures into such graphs, XML -- along with DOM and XQuery -- provides the operations users need to access these heterogeneous data structures. (3) Level 3: Heterogeneous attribute representations and semantics. This level deals with atomic concepts. Transmitting a fact between systems requires relating each system's semantics as well as their representations. 
The computer does not need to 'understand' either the source or target concept; rather, it only needs to know whether they are identical or how to convert them. XML provides a convenient mechanism for attaching descriptive metadata to both source and target schemas' attributes. (4) Level 4: Heterogeneous schemas. Developers are increasingly aware that schema diversity will be a serious problem even if XML schemas achieve wide usage. To support interoperability at this level, a way to describe and share community schemas and to express mappings across schemas is necessary. Communities developing standard schemas include e-commerce, healthcare, and data-warehousing vendors. Such schemas will reduce diversity among interfaces and ease data sharing. Oasis and BizTalk are examples of XML repository environments that map among XML elements and models. XML does not provide intrinsically simpler model standardization than object systems, but its ubiquity and cheap tools have sparked enthusiasm, motivating some communities to agree on standards when previously they could not. (5) Level 5: Object identification. Improvements in describing attribute representation and semantics can remove one source of object misidentification -- for example, is the date in a payment in US or European format? Also, XML makes it easy to attach uncertainty estimates as subsidiary elements to any output -- although to be useful, the recipient must be prepared to interpret them. (6) Level 6: Data value reconciliation. Many strategies for data value reconciliation depend on having metadata such as time stamp and source quality attached to the data. In addition to attaching such annotations, XML makes it easy to return a set of alternative elements for an uncertain value if the recipient can use such output..." See the related paper from Mitre online; [cache]
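
The Level 6 point in the entry above (reconciling values using attached metadata such as time stamp and source quality) can be sketched as follows. Everything here is an assumption made for illustration: the `price-candidates` document, its `source`, `quality` and `timestamp` attributes, and the rule "prefer higher quality, then the more recent time stamp" are invented and do not come from the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical reply: a set of alternative values for one uncertain fact,
# each annotated with a source, a quality score and a time stamp.
DOC = """\
<price-candidates>
  <price source="warehouse" quality="0.6" timestamp="2001-05-01T09:00:00">19.95</price>
  <price source="catalog" quality="0.9" timestamp="2001-04-15T12:00:00">21.50</price>
  <price source="partner-feed" quality="0.9" timestamp="2001-05-20T08:30:00">20.75</price>
</price-candidates>
"""


def reconcile(xml_text):
    """Pick one value: highest quality first, most recent time stamp as tie-breaker."""
    root = ET.fromstring(xml_text)
    best = max(
        root.findall("price"),
        # Time stamps in one fixed ISO 8601 layout sort correctly as plain strings.
        key=lambda p: (float(p.get("quality")), p.get("timestamp")),
    )
    return best.get("source"), float(best.text)


chosen = reconcile(DOC)
print(chosen)
```

As the entry notes, this only helps when the recipient is prepared to interpret the annotations; a consumer that ignores them would simply see three conflicting values.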

  • [June 02, 2001] "Middleware Challenges Ahead." By Kurt Geihs (Goethe University). In IEEE Computer Volume 34, Number 6 (June 2001), pages 24-31. "New application requirements -- including the need to support enterprise application integration, Internet applications, quality of service, nomadic mobility, and ubiquitous computing -- challenge established middleware design principles. Meeting these challenges will lead to a major middleware design and development phase that requires new insights into distributed system technology. A middleware layer seeks to hide the underlying networked environment's complexity by, for example, insulating applications from explicit protocol handling, disjoint memories, data replication, network faults, and parallelism. Middleware masks the heterogeneity of computer architectures, operating systems, programming languages, and networking technologies to facilitate application programming and management... Asynchronous interaction: Independent from any particular communication style, distributed programming models such as RPC and the later remote object invocation (ROI) are natural companions for client-server applications. These programming models introduce a synchronous, blocking interaction style in which a server object remains passive until it receives a request, and the system blocks the client's execution until the server response arrives. Distributed programming models hide distribution because the transaction looks like a local procedure call, and they elegantly handle the implicit synchronization. RPC and ROI remain middleware's most popular communication models. Obvious drawbacks occur if the client uses the network environment's inherent parallelism, for example, to send a search request in parallel to several directory services. RPC-style communications offer two choices: Either use multithreading and spawn a separate thread per request or use a modified non-blocking RPC facility. 
The RPC system's inherently sequential interaction style has received some criticism... For Internet applications, the simple object access protocol defines a mechanism for transporting invocations between peers using HTTP or other protocols and XML as the interface description and encoding language. SOAP does not prescribe any particular programming model. SOAP implements patterns such as request-response pairs as one-way transmissions from a sender to a receiver. Developers designed SOAP to correspond with the Internet's need for a lightweight, open, and flexible mechanism for linking arbitrary applications and services. Event-based middleware architectures address the requirement for decoupled, asynchronous interaction in large-scale, widely distributed systems. Using events as the primary means of interaction allows asynchronous, peer-to-peer notifications between objects and provides flexible pattern-based event filtering and forwarding options. Message passing accommodates peer-to-peer interaction because it has weaker coupling and better scalability. However, in terms of programming abstractions, this low-level paradigm makes programming potentially more error-prone and more difficult to test and debug for elaborate communication patterns. Thus, we can view message passing as a backward step in middleware evolution that illustrates the design trade-off between degree of abstraction and practical requirements."
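
The first of the two choices the entry above describes for blocking RPC (spawn a separate thread per request so that several directory services can be queried in parallel) can be sketched like this. The directory services are simulated in-process: `DIRECTORIES`, `blocking_lookup` and `parallel_search` are hypothetical names, and `time.sleep` merely stands in for network latency. No real RPC library is involved.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical directory services, simulated as in-memory tables.
DIRECTORIES = {
    "dir-a": {"alice": "alice@a.example"},
    "dir-b": {},
    "dir-c": {"alice": "alice@c.example"},
}


def blocking_lookup(service, name):
    """Stand-in for a synchronous RPC: the caller blocks until the reply arrives."""
    time.sleep(0.1)  # simulated round-trip time
    return DIRECTORIES[service].get(name)


def parallel_search(name):
    """Issue one blocking call per service, each on its own worker thread."""
    services = list(DIRECTORIES)
    with ThreadPoolExecutor(max_workers=len(services)) as pool:
        # map() yields replies in the order the services were submitted.
        replies = pool.map(lambda s: blocking_lookup(s, name), services)
        return dict(zip(services, replies))


results = parallel_search("alice")
print(results)
```

Each individual call keeps the simple procedure-call shape, while the threads let the waits overlap; the cost is that the client now has to manage concurrency itself, which is the trade-off the entry points to.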

  • [June 01, 2001] "InfoWorld Readers' Choice Awards." By [Staff.] In InfoWorld (June 01, 2001). [Announcement: "InfoWorld Announces 2001 Readers' Choice Awards. XML Selected As Standard of The Year."] "...Finalists for the awards were nominated by InfoWorld editors, writers, and analysts, and then readers were asked to vote online for their favorites. To make sure that the results were unbiased and unsullied by vote tampering, we asked voters to use their subscription numbers to identify themselves, and each subscription number could vote only once. InfoWorld readers are known for their technological acumen, and subsequently the results of the voting are very revealing. Some choices you made were resounding and clear, but others in more detailed technical categories were close, with winners decided by only a fraction...XML won the standards battle with ease, gaining recognition for Most Important Standard of the Year with 59 percent of the vote, beating out Java 2 Enterprise Edition (J2EE) with 22 percent in second place, and Application Development Technology of the Year with 39 percent over J2EE again, which garnered a much closer 31 percent. Not surprisingly, J2EE won an award itself for Infrastructure Product of the Year, beating out Cisco's Long-Reach Ethernet technology. Other clear winners you voted for included Verio as ASP (application service provider) of the Year, which received 48 percent of the vote, 30 percent clear of the following pack. However, ISP of the Year was a closer call: AT&T WorldNet garnered 35 percent of your votes to 31 percent for UUNet and a surprising 23 percent for America Online. Your pick for Hosting Center of the Year was also a resounding choice: Qwest with 37 percent of the vote, trailed by Exodus at 21 percent..."

  • [June 01, 2001] "Web services unite tech giants ... somewhat." By Matt Berger. In InfoWorld (June 01, 2001). "Companies that for the most part have agreed to disagree appear to be making an exception when it comes to Web services, an emerging computing model that seems to be changing its definition as fast as it gathers new support. While they engaged in some of the usual corporate head-butting, representatives from Hewlett-Packard, Microsoft, Sun Microsystems and IBM found time for moments of accord during a panel discussion at Partech International's Web Services Conference here Thursday. At the heart of their agreement was a set of technology standards that the rivals agree will be central to the next stage of Internet computing. Still largely a concept, Web services describes a computing model in which information can be pulled together over the Internet from a variety of sources and assembled, on the fly, into services that are useful to businesses and consumers. In some cases the information being accessed is itself a kind of service, becoming a building-block component such as a shared online calendar that can be integrated into a larger service offering.... While each one pitched its platform as the best foundation for Internet-based applications and services, the four vendors made it clear that the Web services idea won't work without the broad adoption of technologies including XML (extensible markup language), UDDI (universal description, discovery and integration) and SOAP (simple object access protocol). So far, there has been little resistance. 'This is all just beginning to take shape,' said Ben Brauer, product marketing manager for the Web services division at Hewlett-Packard, who has worked on the development of UDDI. 'We all believe that standards are evolving more quickly than standards in the past because there is so much industry backing.' But while the vendors appear to be in agreement on basic standards, there's room for trouble yet. 
For example, XML comes in a variety of different formats, or 'schema,' depending on what it's being used for, and there's room for divergence from many of the agreed-upon standards at a deeper technical level, analysts said. 'There are standards, but they are the generic standards,' said Tim Clark, an analyst with Jupiter Media Metrix. The building-block standards used to create Web services can actually be very proprietary, he said, and it's also not clear yet how coding languages such as Microsoft's C# and Sun's Java will exist side by side..."

May 2001

  • [May 31, 2001] "An XML Encoding of Simple Dublin Core Metadata." Edited by Dave Beckett, Eric Miller, and Dan Brickley. Dublin Core Metadata Initiative Proposed Recommendation. 2001-04-11 or later. This version supersedes http://dublincore.org/documents/2000/07/14/dcmes-xml/. "The Dublin Core Metadata Element Set V1.1 (DCMES) can be represented in many syntax formats. This document explains how to encode the DCMES in XML, provides a DTD to validate the documents and describes a method to link them from web pages... This document describes an encoding for the DCMES in XML subject to these restrictions: (1) The Dublin Core elements described in the DCMES V1.1 reference can be used; (2) No other elements can be used; (3) No element qualifiers can be used; (4) The resulting XML cannot be embedded in web pages. The primary goal for this document is to provide a simple encoding, where there are no extra elements, qualifiers, optional or varying parts allowed. This allows the resulting data to be validated against a DTD and guaranteed usable by XML parsers. A secondary goal was to make the encoding also be valid RDF which allows the document to be manipulated using the RDF model. We have tried to limit the RDF constructs to the minimum, and the result is a mostly standard header and footer for every document. We acknowledge that there will be further documents describing other encodings for DC without these restrictions however this one is for the simplest possible form. One result of the restrictions is that the encoding does not create documents that can be embedded in HTML pages..." See: "Dublin Core Metadata Initiative (DCMI)."
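
A minimal sketch of the kind of record the entry above describes, assuming the "mostly standard header and footer" is an `rdf:RDF` wrapper around a single `rdf:Description`. The function name `encode_record` and the sample values are hypothetical, and the DOCTYPE declaration needed for DTD validation is omitted. The check against the fifteen DCMES V1.1 element names mirrors restrictions (1) and (2).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements of the Dublin Core Metadata Element Set V1.1.
DCMES_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def encode_record(resource_uri, fields):
    """Serialize (element, value) pairs as simple, unqualified Dublin Core."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_uri}
    )
    for name, value in fields:
        if name not in DCMES_ELEMENTS:  # restriction (2): no other elements
            raise ValueError(f"not a DCMES V1.1 element: {name}")
        ET.SubElement(desc, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


record = encode_record(
    "http://example.org/report",  # hypothetical resource
    [("title", "Sample Report"), ("creator", "A. Author"), ("date", "2001-05-31")],
)
print(record)
```

Keeping the encoder to a fixed element list with no qualifiers is what makes the output checkable against a single DTD, which the entry gives as the document's primary goal.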

  • [May 31, 2001] "BEA Next Up to Outline Web Services Strategy." By Kathleen Ohlson. In InfoWorld (May 29, 2001). "Joining the likes of IBM, Microsoft, and Sun, BEA Systems next week is expected to map out a Web services strategy that would provide enhanced access to and interaction with business functions over the Internet. According to sources, BEA is going to announce that portal functionality and support for Java Messaging Service (JMS), a specification that details how applications communicate in an asynchronous environment, will be added to BEA's flagship Java 2 Enterprise Edition (J2EE)-compliant application platform called WebLogic Server. JMS lets messages be sent between applications across a network, and this version of JMS would let messages pass from one ERP (enterprise resource planning) application to another. The Web services blueprint is expected to coincide with BEA's WebLogic Server upgrade announcement. WebLogic Server 6.1 will also include support for UDDI and WSDL (Web Services Description Language), according to BEA. UDDI is a universal registry of resources, and WSDL standardizes the way services and their providers are described. The application server would act as the backbone for BEA Web services... BEA previously has said its strategic products will consist of WebLogic Collaborate, its collaboration platform that integrates trading partners and e-business processes over the Web, and WebLogic Process Integrator, the workflow engine for Collaborate that controls the sequences of Web services. The products in BEA's Web services lineup will support UDDI, WSDL, SOAP (Simple Object Access Protocol), and ebXML (electronic business XML). SOAP exchanges XML-based messages from one business application to another over the Web, and ebXML creates a standard XML dialect for businesses to find each other on the Web, form trading partner deals, and exchange business documents electronically. 
The company also supports BTP (Business Transaction Protocol) in WebLogic Collaborate, which defines how to do transactions, security, and multiparty dialog in Web services. For example, Web transactions could be canceled without any changes to corporate systems if the receiving application did not get all the XML data..." See the announcement: "Market Leader BEA Systems To Showcase the BEA WebLogic E-Business Platform Advancements, New Partners and Customers at JavaOne Developer Conference. BEA CEO Bill Coleman to Deliver JavaOne Keynote Address on June 7, 2001."

  • [May 31, 2001] "IBM Retools For Web Services." By Wylie Wong. In CNET News.com (May 28, 2001). "IBM is set to launch Tuesday its latest offensive in the market for e-business software with more versatile development tools. The company will announce further details of the next releases of its application-server software and new development tools for building Web-based software and services. IBM competes against BEA Systems, Microsoft, Oracle, Sun Microsystems and others in the market for e-business software that enables companies to share data and conduct trades online. Analysts say IBM's latest WebSphere application server, which will ship late next month, will help the company compete against market leader BEA, which holds the top spot in the market for application servers. In the $1.6 billion market in 2000, BEA captured 35 percent of the market share, followed by IBM with 30 percent, according to analyst firm Giga Information Group. The most important change, [Evan] Quinn [Hurwitz Group] said, is that every version of IBM's WebSphere application server is now built on the same software code. Previously, each version of WebSphere, from low end to high end, was built using slightly different code, making it harder for businesses to move to higher-end versions of the server as their needs grew... The new WebSphere application server version 4.0 will also support additional Web standards that allow people to build Web-based software and services. IBM, along with its rivals Microsoft and others, has been racing to build and sell software for building and delivering Web services by which people access software through the Web instead of on their local PCs. IBM had previously announced plans to support Web services throughout its e-business product family, including its DB2 database-management software. IBM's new database with support for Web services is expected to be released early next month... 
IBM on Tuesday also announced a new version of its Visual Age for Java and WebSphere Studio software development tools, which IBM executives said will offer better support for Web services and the Java 2 Enterprise Edition. The company will also release in July a free tool, called the WebSphere Studio WorkBench, which allows software developers to integrate IBM's development tools with other companies' development tools and have one user interface on their computers for writing applications..."

  • [May 25, 2001] "Indexing XML Documents. [XML Matters, Part #10.]" By David Mertz, Ph.D. (He-Of-Innumerable-Epithets e.g., 'Objectifier,' Gnosis Software, Inc.) From IBM developerWorks. May 2001. ['As XML document storage formats become popular, especially for prose-oriented documents, the task of locating contents within XML document collections becomes more difficult. This column extends the generic full text indexer presented in David's Charming Python #15 column to include XML-specific search and indexing features. This column discusses how the tool design addresses indexing to take advantage of the hierarchical node structure of XML.'] "Large multi-megabyte documents consisting of thousands of pages are not uncommon in corporate and government circles. Writers and technicians routinely produce voluminous product specifications, regulatory requirements, and computer system documentation in SGML (Standard Generalized Markup Language) format. In a technical sense, XML is a simplification and specialization of SGML. At a first approximation then, XML documents should also be valid SGML documents. Culturally, however, XML has evolved from a different direction. In one respect, XML is a successor for EDI. In another respect, it is a successor for HTML. Having a different cultural history from SGML, XML is undergoing its own process of tool development. It is becoming more popular, so expect to see more and more of both (usually) informal HTML documents and (usually) formal SGML documents migrating in the direction of XML formats -- particularly using XML dialects like DocBook. However, XML has not yet grown, within its own culture, a tool that effectively and efficiently locates content within large XML documents. 
General file-search tools like grep on Unix, and similar tools on other platforms, are perfectly able to read the plain text of XML documents (except for possible Unicode issues), but a simple grep search (or even a complicated one) misses the structure of an XML document. When searching for content in a file containing thousands of pages of documentation, you are likely to know much more than you can specify in just a word, phrase, or regular expression. Just which of those agricultural reports, for example, did Ms. June Apple write? A coarse tool like grep will generally find a lot of things that are not of interest. Moreover, ad hoc tools like grep, while very efficient at what they do, need to check the entire contents of large files each time a search is performed. For frequent searches, repeated full-file searching is inefficient... In response to the need outlined above, I have created the public-domain utility xml_indexer. This Python module can be used as a runtime utility and can also be easily extended by custom applications that use its services. The module xml_indexer, in turn, relies on the services of two public-domain utilities I have described in earlier IBM developerWorks articles: indexer and xml_objectify... It turned out that the design of xml_indexer was aided enormously by the object-oriented principles that went into designing indexer. Overriding just a few methods in the GenericIndexer class (actually, in its descendent SlicedZPickleIndexer -- but one could just as easily mix in any concrete Indexer class), made possible the use of an entirely new set of identifiers and data source. Readers who wish to use xml_indexer as part of their own larger Python projects should find its further specialization equally simple." Article also available in PDF format. See: "XML and Python."
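[Illustration: the contrast drawn above between grep and a structure-aware index can be sketched in a few lines of Python. This is not the code of xml_indexer; the function names index_xml and search, the node-path notation, and the sample report are assumptions made only for this sketch. The idea is that each word is recorded once, together with the path of the element that contains it, so a later search can be limited to one part of the document tree without rereading the file.]

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def index_xml(doc_id, xml_text, index=None):
    """Map each lower-cased word to the set of (doc_id, node_path) pairs where it occurs."""
    index = defaultdict(set) if index is None else index

    def add_words(text, path):
        for raw in (text or "").split():
            word = raw.lower().strip(".,;:!?")
            if word:
                index[word].add((doc_id, path))

    def walk(elem, path):
        add_words(elem.text, path)              # text directly inside this element
        seen = defaultdict(int)
        for child in elem:
            seen[child.tag] += 1                # position among same-named siblings
            walk(child, f"{path}/{child.tag}[{seen[child.tag]}]")
            add_words(child.tail, path)         # text after a child belongs to the parent

    root = ET.fromstring(xml_text)
    walk(root, f"/{root.tag}")
    return index


def search(index, word, under=""):
    """Return hits for a word, optionally limited to node paths starting with `under`."""
    return sorted(hit for hit in index.get(word.lower(), ()) if hit[1].startswith(under))


# Hypothetical sample document, echoing the "June Apple" example in the article.
REPORT = ("<report><author>June Apple</author>"
          "<body><para>Apple harvest rose.</para></body></report>")
idx = index_xml("r1", REPORT)
author_hits = search(idx, "apple", under="/report/author")
```

[A plain text search for "apple" would match both the author and the paragraph; restricting the search to the /report/author subtree returns only the report that Ms. Apple wrote.]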

  • [May 24, 2001] "RELAX NG Tutorial." Edited by James Clark [for the TREX TC]. Draft/Version: 2001-05-25. [Attached is a RELAX NG tutorial based on my TREX tutorial.] "RELAX NG is a simple schema language for XML, based on RELAX and TREX. A RELAX NG schema specifies a pattern for the structure and content of an XML document. A RELAX NG schema thus identifies a class of XML documents consisting of those documents that match the pattern. A RELAX NG schema is itself an XML document... RELAX NG Non-features: The role of RELAX NG is simply to specify a class of documents, not to assist in interpretation of the documents belonging to the class. It does not change the infoset of the document. In particular, RELAX NG does not allow defaults for attributes to be specified, does not allow entities to be specified, does not allow notations to be specified, [and] does not specify whether white-space is significant. Also, RELAX NG does not define a way for an XML document to associate itself with a RELAX NG pattern." Note section 17, 'Differences from TREX': "(1) the concur pattern has been removed; (2) the string pattern has been replaced by the value pattern; (3) the anyString pattern has been renamed to text; (4) the namespace URI is different; (5) pattern elements must be namespace qualified; (6) anonymous datatypes have been removed; (7) the data pattern can have parameters specified by param child elements; (8) oneOrMoreTokens and zeroOrMoreTokens patterns have been added for matching whitespace-separated sequences of tokens; (9) the data pattern can have a key or keyRef attribute; (10) the replace and group values for the combine attribute have been removed; (11) an include element in a grammar may contain define elements that replace included definitions." Note: RELAX Core and TREX (Tree Regular Expressions for XML) are to be unified, since the two are very similar as structure-validation languages.
The unified TREX/RELAX language will be called RELAX NG [for "Relax Next Generation," pronounced "relaxing"]. This design work is now being conducted within the OASIS TREX Technical Committee, where a (first) specification is expected by July 1, 2001. The OASIS TC has also been renamed 'RELAX NG' [mailing list: 'relax-ng@lists.oasis-open.org'] to reflect the new name of the unified TREX/RELAX language. The RELAX NG development team plans to submit the OASIS specification to ISO, given the importance of ISO standards in Europe. See the RELAX NG Issues List of 2001-05-24 for updates on the design of RELAX NG. References: see (1) Tree Regular Expressions for XML (TREX), and (2) REgular LAnguage description for XML (RELAX).

  • [May 24, 2001] "Components in tag land. Where components fit into the picture at XML DevCon." By Uche Ogbuji (Fourthought, Inc). From IBM developerWorks. April 2001. ['XML DevCon is, of course, all about XML. But since it's geared towards developer education, component technologies from COM to CORBA and beyond are inevitable parts of the picture. At XML DevCon Spring 2001 it seemed everyone wanted a piece of the emerging field of Web services. Uche Ogbuji reports from the front lines, sorting out the fresh meat from the vapor.'] "One thing about the grand term component is that its most clear usage is as a generic qualifier for one vendor to use in proclaiming its product as superior to the competition. Beyond that, it has never been clear what the term means. XML DevCon saw an emergence of a successor term in this regard: Web services. At the conference, everything was a Web service, and everyone was a Web services specialist. A Web service is basically a component that is designed to be accessed using Web technology. In most cases this involves XML messaging to HTTP servers. SOAP is the usual transport protocol, somewhat equivalent to CORBA's IIOP, say, or EJB's RMI. Web services has sprouted a bunch of other analogs to traditional component tools. WSDL is similar to IDL, and UDDI is similar to the CORBA naming and trader service, EJB's JNDI, or COM interface registries. Although I make fun of the woolliness of the term Web services, there is no doubt that Web services are serious business. There was clearly a great deal of money, promotion and developer effort going into the many Web services systems on display. And everyone gave the impression that the stakes in the poker game are quite high. One central conflict was between ebXML, which approaches Web services as an enhancement to traditional EDI, and another camp -- headed by IBM and Microsoft -- that revolves around UDDI and other newly minted technologies. 
There was an ebXML day and a UDDI day, and the partisans of each faction could be heard disparaging the other based on its standards credentials, openness, or lack of practical implementation. This central debate is very likely to move to center stage in the world of business components because XML has pretty much been accepted as the central glue that will tie macro business components (such as EDI tools) to micro business components (such as your favorite online shopping cart widget). At their core, both UDDI and ebXML have the idea (from component technologies) of a repository of interfaces to components offered on a system. This is similar to a CORBA interface repository or a COM interface registry, except that Web services registries have the potential to store millions of entries representing the global facilities that will be offered as Web services. Clearly such a wide-ranging directory of Web services would require an accessible way to manage metadata -- such as the network location and cost of the service -- and the composition of requests to, and responses from, the service. UDDI and ebXML take advantage of XML's general usefulness in representing metadata (although they don't take advantage of RDF: XML's most powerful tool for this), and they use UML for formal expression of the metamodel. Strangely enough, if they insist on using UML, you'd think they'd also use XMI for the XML representation. But as we all know, component technology vendors like to tout the concept of reuse even though they themselves are guilty of forgetting to practice this gospel..." Article also in PDF format.

  • [May 24, 2001] "IBM Customers Generally Bullish About Web Services." By Kathleen Ohlson. In Network World Volume 18, Number 21 (May 21, 2001), page 12. "At a press conference in New York last week IBM outlined Dynamic E-Business, a grand plan to help companies build applications that pull together information from multiple sources either internally, or externally over the Internet. For its Web services plan, IBM intends to make the most of its WebSphere, DB2, Tivoli and Lotus products, and improve them with support for emerging standards such as Simple Object Access Protocol (SOAP); Universal Description, Discovery and Integration (UDDI); and Web Services Description Language (WSDL). SOAP exchanges XML-based messages from one business application to another over the Internet. UDDI is a universal registry of resources, and WSDL standardizes how a service and its provider are described. IBM joins industry heavyweights such as Microsoft, Sun and Oracle, in the Web services arena. Users have high hopes for the initiative... Dave Kulakowski says Honeywell would use IBM Web services within the company until the technology standards become stabilized and more companies implement some form of Web services. Besides Honeywell, companies including Galileo International, Hewitt Associates, Duck Head Apparel, CareTouch and Transacttools are looking into IBM's Web services and plan to implement them within a year. Sam Johnson, CEO of Transacttools, says trading partners trying to connect to each other's legacy systems could go through an XML interface, rather than connecting to individual systems such as equity, settlement and fixed income in the financial community. Transacttools, a financial services application service provider in New York, will roll out IBM Web services to customers including J.P. Morgan, Capital International and Instanet.
Tim Hilgenberg, CTO at human resources consulting firm Hewitt Associates, says Web services, in general, have the potential to prevent customers from being locked into one vendor. 'Web services are like Switzerland,' because they're nonproprietary, so 'customers won't have these pocketed islands and have to create the connectivity to reach these islands, which costs a lot,' he says..." See the "IBM Global Services and IBM WebSphere Platform to Support IBM's Web Services Infrastructure."

  • [May 24, 2001] "Using the Jena API to Process RDF." [Tutorial.] By Joe Verzulli. From XML.com. May 23, 2001. ['Jena is a freely-available Java API for processing RDF. This article provides an introduction to the API and its implementation.'] "There has been growing interest in the Resource Description Framework (RDF) and a number of tools and libraries have been developed for processing it. This article describes one such library, Jena, a Java API for processing RDF. It is also the name of an open source implementation of the API. XML is very flexible and allows information to be encoded in many different ways. If meaningful tag names are used it is relatively easy for a person to determine the intended interpretation of an XML string. However, it is difficult for programs to determine the intended interpretation since programs don't understand English tag names. DTDs and XML Schemas don't really help in this regard. They just allow a program to verify that XML strings conform to some set of rules. RDF is a model and XML syntax for representing information in a way that allows programs to understand the intended meaning. It's built on the concept of a statement, a triple of the form {predicate, subject, object}. The interpretation of a triple is that <subject> has a property <predicate> whose value is <object>... RDF requires that different kinds of semantic information (e.g., subjects, properties, and values) be placed in prescribed locations in XML. Programs that read an XML encoding of RDF can then tell whether a particular element or attribute refers to a subject, a property, or the value of a property. Jena was developed by Brian McBride of Hewlett-Packard and is derived from earlier work on the SiRPAC API. Jena allows one to parse, create, and search RDF models. Jena defines a number of interfaces for accessing and manipulating RDF statements..." [Website description: "Jena is an experimental java API for manipulating RDF models.
Its features include: (1) statement centric methods for manipulating an RDF model as a set of RDF triples (2) resource centric methods for manipulating an RDF model as a set of resources with properties (3) cascading method calls for more convenient programming (4) built in support for RDF containers - bag, alt and seq (5) enhanced resources - the application can extend the behaviour of resources (6) multiple implementations (7) integrated parser (David Megginson's RDFFilter). An alpha quality implementation is available for download."] See "Resource Description Framework (RDF)."
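[Illustration: the statement model described above, and the difference between "statement centric" and "resource centric" access, can be sketched in Python. This is not the Jena API, which is a Java library; the class Model, its method names, and the example URIs are hypothetical and chosen only for this sketch.]

```python
from collections import namedtuple

# One RDF statement: <subject> has a property <predicate> whose value is <object>.
Statement = namedtuple("Statement", "subject predicate object")


class Model:
    """A toy in-memory RDF model held as a set of triples (hypothetical API)."""

    def __init__(self):
        self._statements = set()

    def add(self, subject, predicate, obj):
        self._statements.add(Statement(subject, predicate, obj))
        return self  # returning self allows cascading calls

    def statements(self, subject=None, predicate=None, obj=None):
        """Statement-centric view: every triple matching the given parts."""
        return sorted(
            s for s in self._statements
            if subject in (None, s.subject)
            and predicate in (None, s.predicate)
            and obj in (None, s.object)
        )

    def properties(self, subject):
        """Resource-centric view: one resource with its properties.

        Simplification: a property with several values keeps only one of them.
        """
        return {s.predicate: s.object for s in self.statements(subject=subject)}


DOC = "http://example.org/doc"  # placeholder URI
model = Model().add(DOC, "dc:creator", "Joe").add(DOC, "dc:title", "Jena")
titles = model.statements(predicate="dc:title")
```

[The same two triples can be read either as a flat set of statements or as a single resource carrying a creator and a title, which is the distinction the feature list draws.]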

  • [May 24, 2001] "A Web Less Boring. [Talks.]" By Edd Dumbill. From XML.com. May 23, 2001. ['Tim Bray condemned the state of web browser technology, saying it was responsible for making the Web dull, in his opening keynote at XML Europe 2001 in Berlin.'] "In his opening keynote at XML Europe 2001 in Berlin, Tim Bray explained how XML could make the Web more interesting -- specifically, the Web's user interface. Bray recounted that many members of the original team that created XML envisaged its application in web-enabled client document rendering systems, providing flexible user interfaces for exploring content. Instead XML seems to have found its immediate application in the backroom, connecting databases and disparate server systems. One of the most well-known uses of XML in this scenario is the SOAP protocol, which allows message passing between applications using XML and HTTP. Bray extolled SOAP, explaining that its many implementations and widespread deployment were key to its importance. He emphasized the significant role SOAP will play in the future of web applications. Bray also questioned the value of the W3C's XML Protocol Activity, saying that they should have rubber-stamped SOAP and got on with things. To the amusement of the audience, Bray mused that with the enormous size of the XML Protocol working group, it might just take them 18 months to make that decision alone..."

  • [May 24, 2001] "Interviews: Not Stuck in the Woods. [CTO Insider.]" By Michael Vizard and Bob Renner. In InfoWorld Issue 21 (May 18, 2001), page 54. ['Forest Express has built a public exchange for the timber industry.'] "As the CTO of Forest Express, Bob Renner is charged with creating a public marketplace that is jointly funded by investors such as International Paper, Weyerhaeuser, Georgia-Pacific, Boise Cascade, Mead, Willamette Industries, and Morgan Stanley. Forest Express recently concluded its 100th transaction after launching last winter. In an interview with InfoWorld Editor in Chief Michael Vizard, Renner talks about what it takes to build a public exchange... Renner: Forest Express is a marketplace that is used to buy and sell products [related to] the forest products industry. Currently we have four vertical markets that we're focused on ... paper, building materials, timber, and recycling... So far we have all four of our verticals up in transacting business. The platform is comprised of a couple of base technology providers. For the middleware layer, webMethods is our partner. We also use Commerce One and SAP for the marketplace. The other products that we're using are best-of-breed type products, include Moai Technologies for auctions, and Corio is our ASP [application service provider] outsource provider. [InfoWorld: How important is XML to your efforts, especially as it relates to integration?] Renner: XML is a journey and not an event. The standards that are starting to evolve are clearly going to help us get to something that makes supporting cross industry interoperability easier. But it's going to take a fair amount of work to consolidate around the standards to support this particular industry. And that'll take time. There's been some very good leadership played in Europe around XML standards for the paper industry. We've latched onto those and tried to add additional value as those standards move forward..."

  • [April 24, 2000] "Managing Documentation with XML." By Christopher R. Maden. 10-April-2000. "Some slides for a brief introduction I did [on XML] for the Documentation & Training Conference [Tyngsboro, MA]... XML is just syntax; the information analysis part of the problem is more complex. If you understand your information, you can make good use of it in Word, Frame, XML, SGML, or HTML; if you don't understand it, you're going to have problems in any syntax. There are good reasons for going with XML, but there are drawbacks in that using it (generally speaking) forces you to confront how you're thinking about your information..."

  • [May 23, 2001] "RDF and TopicMaps: An Exercise in Convergence." By Graham Moore (Vice President Research & Development, Empolis GmbH). Paper for XML Europe 2001 Berlin. 2001-05-24. ['This paper presents: (1) a way in which RDF can be used to model topicmaps and vice versa; (2) the issues that arise when performing a model to model mapping; (3) some proposals for changes to XTM to enable semantic interchange of the two standards. I am presenting this paper on Thursday at XML Europe if anyone is around and interested. I don't think this is the complete solution to the integration issue. However, I think that this paper could help focus some of the discussions.'] "There has long been a sense in the semantic web community that there is a synergy between the work of ISO and TopicMaps.org on TopicMaps and that of the W3C on RDF. This paper looks at why and how we can bring these models together to provide a harmonised platform on which to build the semantic web. The reasoning behind bringing together these two standards is in the fact that both models are intent on describing relationships between entities with identity. The question we look to answer in this paper is 'Is the nature of the relationships and the identified entities the same'. If we can show this to be true then we will be able to have a common model that can be accessed as a TopicMap or as a RDF Model. To make this clearer, if we have a knowledge tool X we would expect to be able to import some XTM syntax, some RDF syntax and then run either a RDF or TMQL query in the space of tool X and expect sensible results back across the harmonised model. In order to achieve this aim we need to show a model to model mapping between the two standards. We present the TopicMap model, the RDF model, a discussion on modelling versus mappings and then a proposed mapping between the two. 
As part of the mapping we make suggestions as to the changes that could be made to better support interoperation and finally we conclude and provide an insight into future work... we define a clear goal that we should be able to run a TMQL query against an RDF model and get 'expected results' i.e., those that would be gained from running a query against the equivalent TopicMap. To make this possible we need to make the models map rather than using the models to describe each other. The key difference in these approaches is that one provides a mapping that is semantic, the other uses each standard as a tool for describing other models. It is interesting that both models are flexible enough and general enough to allow each to be modelled using the other... While we found there was a useful mapping that could be performed it was felt that some additions to the TopicMap model -- Templates and Arcs would enable two way transition from RDF to TopicMaps and vice versa. We conclude that making some non-regressive enhancements to TopicMaps would enable a useful degree of convergence between TopicMaps and RDF, creating a single common semantic space in which to define the semantic web." See (1) "Resource Description Framework (RDF)", and (2) "(XML) Topic Maps." [cache]

  • [May 23, 2001] "XML Catalogs." Edited by Norman Walsh for the OASIS Entity Resolution Committee. Working Draft 24-May-2001. "In order to make optimal use of the information about an XML external resource, there needs to be some interoperable way to map the information in an XML external identifier into a URI for the desired resource. This Standard defines an entity catalog that handles two simple cases: (1) Mapping an external entity's public identifier and/or system identifier to an alternate URI. (2) Mapping the URI of a resource (a namespace name, stylesheet, image, etc.) to an alternate URI. Though it does not handle all issues that a combination of a complete entity manager and storage manager addresses, it simplifies both the use of multiple products in a great majority of cases and the task of processing documents on different systems...This Standard defines a format for an application-independent entity catalog that maps external identifiers and URIs to (other) URIs. This catalog is expressed in terms of XML 1.0 (Second Edition) and XML Namespaces. This catalog is used by an application's entity manager. This Standard does not dictate when an entity manager should access this catalog; for example, an application may attempt other mapping algorithms before or (if the catalog fails to produce a successful mapping) after accessing this catalog. For the purposes of this Standard, the term catalog refers to the logical 'mapping' information that may be physically contained in one or more catalog entry files. The catalog, therefore, is effectively an ordered list of (one or more) catalog entry files. It is up to the application to determine the ordered list of catalog entry files to be used as the logical catalog. 
[This Standard uses the term 'catalog entry file' to refer to one component of a logical catalog even though a catalog entry file can be any kind of storage object or entity including -- but not limited to -- a table in a database, some object referenced by a URI, or some dynamically generated set of catalog entries.] Each entry in the catalog associates a URI with information about an external reference that appears in an XML document." Document appendices include: A. A W3C XML Schema for the XML Catalog (Non-Normative); B. A TREX Grammar for the XML Catalog (Non-Normative); C. A RELAX Grammar for the XML Catalog (Non-Normative); D. A DTD for the XML Catalog (Non-Normative); E. Support for TR9401 Catalog Semantics (Non-Normative). See also the new issues list, the diff from previous version, and the OASIS TC on Entity Resolution.
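[Illustration: the two mappings defined by the draft, and the notion of a logical catalog as an ordered list of catalog entry files, can be sketched in Python. The dictionary layout, the function names, the sample identifiers, and the choice to test the system identifier before the public identifier are assumptions of this sketch; the draft itself defines an XML syntax and fuller matching rules that are not reproduced here.]

```python
def resolve_external(catalog_files, public_id=None, system_id=None):
    """Map an external identifier to an alternate URI.

    `catalog_files` is the ordered list of catalog entry files, each modelled
    here as a dict with optional "system", "public" and "uri" tables.
    The first file that yields a match wins.
    """
    for entries in catalog_files:
        if system_id is not None and system_id in entries.get("system", {}):
            return entries["system"][system_id]
        if public_id is not None and public_id in entries.get("public", {}):
            return entries["public"][public_id]
    return None  # no mapping: the application may fall back to other algorithms


def resolve_uri(catalog_files, uri):
    """Map the URI of a resource (namespace name, stylesheet, image) to an alternate URI."""
    for entries in catalog_files:
        if uri in entries.get("uri", {}):
            return entries["uri"][uri]
    return None


# Hypothetical catalog entry files: a local one consulted before a site-wide one.
local = {"public": {"-//EXAMPLE//DTD Report 1.0//EN": "file:///dtds/report-local.dtd"}}
site = {
    "public": {"-//EXAMPLE//DTD Report 1.0//EN": "file:///site/report.dtd"},
    "uri": {"http://example.org/style.xsl": "file:///site/style.xsl"},
}
catalog = [local, site]
dtd = resolve_external(catalog, public_id="-//EXAMPLE//DTD Report 1.0//EN")
```

[Because the list is ordered, the local entry shadows the site-wide one; a reference that no file maps returns nothing, which leaves the application free to try another mapping method.]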

  • [May 23, 2001] "The Power Of Voice." By Ana Orubeondo (Test Center Senior Analyst, Wireless and Mobile Technologies). In InfoWorld Issue 21 (May 18, 2001), pages 73-74. ['VoiceXML should connect your existing Web infrastructure, the Internet, and the standard telephone by providing a standard language for building voice applications. E-business managers who plan voice portal strategies will need to decide whether to build the portals themselves or turn to a growing number of voice ASPs. Be careful when selecting rapidly evolving voice portal technologies. Key improvements such as grammar authoring in Version 2.0 should iron out some of the shortcomings VoiceXML exhibits.'] "VoiceXML is a standard language for building interfaces between voice-recognition software and Web content. Just as HTML defines the display and delivery of text and images on the Internet, VoiceXML translates any XML-tagged Web content into a format that speech-recognition software can deliver by phone. VoiceXML 1.0 is a specification of the VoiceXML Forum, an industry organization founded by AT&T, IBM, Lucent Technologies, and Motorola and consisting of more than 300 companies. With the backing and technology contributions of its four world-class founders and the support of leading Internet industry players, the VoiceXML Forum has made speech-enabled applications on the Internet a reality through its mission to develop and promote VoiceXML. With VoiceXML, users can create a new class of Web sites using audio interfaces, which are not really Web sites in the normal sense because they provide Internet access with a standard telephone. These applications make online information available to users who do not have access to a computer but do have access to a telephone. Voice applications are useful for highly mobile users who need hands- and eyes-free interaction with Web applications, possibly while driving or carrying luggage through a busy airport...
Voice portals such as BeVocal, TellMe, and Shoptalk are already providing voice access to stock quotes, movie and restaurant listings, and daily news. The best-suited applications for VoiceXML are information retrieval, electronic commerce, personal services, and unified messaging. Several companies have already employed VoiceXML in information retrieval applications to great success. Hotels, car rental agencies, and airlines have implemented continuous voice access to allow customers to make or confirm reservations, buy tickets, find rates, get store hours and driving directions, and access loyalty programs. Voice automated services help reduce call-center costs and increase customer satisfaction... As the volume of information published using HTML grows and the range of Web services broadens, VoiceXML will become an increasingly attractive technology. VoiceXML increases the leverage under a company's Web investment by offering voice interpretation of HTML content." See "VoiceXML Forum." [altURL]

  • [May 22, 2001] "Web Services Flow Language (WSFL 1.0)." By Prof. Dr. Frank Leymann (Distinguished Engineer; Member IBM Academy of Technology, IBM Software Group). May 2001. 108 pages. [Summary: 'The Web services Flow Language guide (WSFL) describes how Web services may be composed into new Web services to support business processes. Composition comes in two types: The first type allows to specify the logic of a business process; the second type allows to define the mutual exploitation of Web services of participants in a business process. A brief concepts of composition sketch is provided in an introductory chapter of the document. A detailed discussion of the metamodel behind composition follows. The language proper is described and illustrated by code snippets, followed by an XML schema of the language.'] "The Web Services Flow Language (WSFL) is an XML language for the description of Web Services compositions. Flow Models: In the first case, a composition is created by describing how to use the functionality provided by the collection of composed Web Services. This is also known as flow composition, orchestration, or choreography of Web Services. WSFL models these compositions as specifications of the execution sequence of the functionality provided by the composed Web Services. Execution orders are specified by defining the flow of control and data between Web Services. For this reason, in this document, we will also use the term flow model to refer to the first type of Web Services compositions. Flow models can especially be used to model business processes or workflows based on Web Services... Global Models: In the second case, no specification of an execution sequence is provided. Instead, the composition provides a description of how the composed Web Services interact with each other.
The interactions are modeled as links between endpoints of the Web Services interfaces, each link corresponding to the interaction of one Web Service with an operation of another Web Service's interface. Because of the decentralized or distributed nature of these interactions, we will use the term global model in this document to refer to this type of Web Services composition. Recursive Composition: WSFL provides extensive support for the recursive composition of services. In WSFL, every Web Service composition (a flow model as well as a global model) can itself become a new Web Service, and can thus be used as a component of new compositions. The ability to do recursive composition of Web Services provides scalability to the language and support for top-down progressive refinement design as well as for bottom-up aggregation. For these reasons, recursive composition has been a central requirement in the design of the WSFL language. Hierarchical and Peer-to-Peer Interaction: WSFL compositions support a broad spectrum of interaction patterns between the partners participating in a business process. In particular, both hierarchical interactions and peer-to-peer interactions between partners are supported. Hierarchical interactions are often found in more stable, long-term relationships between partners, while peer-to-peer interactions reflect relationships that are often established dynamically on a per-instance basis... The guiding principle behind WSFL is to fit naturally into the Web Services computing stack. It is layered on top of the Web Services Description Language (WSDL). WSDL describes the service endpoints where individual business operations can be accessed. WSFL uses WSDL for the description of service interfaces and their protocol bindings. WSFL also relies on an envisioned 'endpoint description language' to describe non-operational characteristics of service endpoints, such as quality-of-service properties.
Here, we will refer to this language as the 'Web Services Endpoint Language' (WSEL)..." Section 5 'Appendix A: WSFL Schema' features the W3C XML schema for WSFL [2000/10 schema version]. Note: "The Web Services Flow Language is the result of a team effort: Francisco Curbera, Frank Leymann, Dieter Roller, and Marc-Thomas Schmidt created the language and its underlying concepts. Matthias Kloppmann and Frank Skrzypczak focused on its lifecycle aspects. Francis Parr worked on details of the example in the appendix. Many others helped by reviewing and discussing earlier versions of the document, most notably Sanjiva Weerawarana and Claudia Zentner." See the recent announcement for IBM's Web Services infrastructure plans. [cache]
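[Illustration: the flow model and recursive composition described above can be sketched in Python. WSFL itself is an XML language; nothing here is WSFL syntax. The class FlowModel, the activity names, and the single dictionary passed from activity to activity (a simplification of WSFL's separate control and data links) are assumptions of this sketch. Execution order is derived from the control links with the standard library's topological sorter.]

```python
from graphlib import TopologicalSorter  # Python 3.9 or later


class FlowModel:
    """Activities plus control links; calling the model runs the activities in order."""

    def __init__(self, activities, control_links):
        self.activities = activities        # name -> callable taking and returning a dict
        self.control_links = control_links  # list of (source, target) pairs

    def execution_order(self):
        predecessors = {name: set() for name in self.activities}
        for source, target in self.control_links:
            predecessors[target].add(source)
        return list(TopologicalSorter(predecessors).static_order())

    def __call__(self, data):
        # A flow model is itself callable, so it can serve as an activity elsewhere.
        for name in self.execution_order():
            data = self.activities[name](data)
        return data


# Hypothetical inner flow: look up a price, then add a fixed charge.
quote = FlowModel(
    {"price": lambda d: {**d, "price": 10},
     "charge": lambda d: {**d, "total": d["price"] + 2}},
    [("price", "charge")],
)

# Recursive composition: the whole `quote` flow is one activity of `order`.
order = FlowModel(
    {"quote": quote, "ship": lambda d: {**d, "shipped": True}},
    [("quote", "ship")],
)
result = order({})
```

[The outer flow does not need to know that "quote" is itself a composition, which is the property the document calls recursive composition.]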

  • [May 22, 2001] "[Schema Algebra.] Defining Logical Relationships Between Documents, Schemata, URIs, Resources, and Entities." By Jonathan Borden (The Open Healthcare Group). May 4, 2001 or later. "This paper forms the foundation for a schema independent type framework. The relationship between URIs, Resources and Entities are formally defined. XML Namespaces are defined using tuples. We define a schema generically through a validity predicate. This predicate tests an instance with respect to a schema. This predicate serves to define the set of Instances of a particular schema..." [Citation context: see reference in XML-DEV posting "Types and Context," 21-May-2001: 'In the Schema Algebra, statements [7-9] a "type" is the property of belonging to a class. The predicate "typeOf(x, c)" tests a node "x" for membership in the instance set of the class "c"...']
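
The abstract's central idea, a schema defined generically through a validity predicate whose true-set is the schema's set of instances, can be illustrated with a small sketch. This is not taken from the paper: the names `instances_of`, `type_of`, and `is_person`, and the use of Python dictionaries as stand-in "documents", are assumptions made purely for illustration.

```python
from typing import Callable, Iterable, List

# Assumption: a "schema" is modelled as nothing more than a validity
# predicate over candidate instances (here, plain dictionaries).
Schema = Callable[[dict], bool]


def instances_of(schema: Schema, candidates: Iterable[dict]) -> List[dict]:
    """Return the candidates for which the validity predicate holds."""
    return [doc for doc in candidates if schema(doc)]


def type_of(node: dict, cls: Schema) -> bool:
    """Analogue of typeOf(x, c): is node in the instance set of class c?"""
    return cls(node)


def is_person(doc: dict) -> bool:
    """Hypothetical schema: a string 'name' and an integer 'age'."""
    return isinstance(doc.get("name"), str) and isinstance(doc.get("age"), int)


docs = [{"name": "Ada", "age": 36}, {"name": "Bob"}, {"title": "x"}]
valid = instances_of(is_person, docs)
```

Because the schema is only a predicate, the same two functions work unchanged for any schema language that can answer "is this instance valid?", which is the sense in which the framework is schema independent.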

  • [May 22, 2001] "A Generic Fragment Identifier Syntax." By [Jonathan Borden]. 2001-05-14 (or later). "Frequently URI references, which may contain fragment identifiers, are used independent of their resolution into a particular document, or document fragment, at a particular point in time. A notable example is use of a URI reference as an XML Namespace name. In the current situation the syntax of the fragment identifier part of a URI reference is defined by the MIME media type of the referenced document as in an HTTP transaction. This media type is not fixed, and may change from time to time and from reference to reference, or according to request headers such as with content negotiation. It turns out that the fragment identifier syntax is often constant from media type to media type. In order to enable robust use of fragment identifiers, particularly outside a particular HTTP transaction, we propose a generic, media type independent, fragment identifier syntax. This fragment identifier syntax is compatible with current usage of fragment identifiers, and is generally compatible with future proposed syntaxes such as XPointer. This specification does not itself specify how user agents are to process or interpret fragment identifiers, such as may be specified with individual MIME media type registrations, rather provides a consistent syntax for fragment identifiers and a registration mechanism for schemes associated with fragment identifier syntaxes..."
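
As a rough illustration of handling a fragment identifier independently of any HTTP transaction or media type, the sketch below splits a URI reference with Python's standard `urllib.parse.urldefrag` and then applies a hypothetical `scheme(data)` convention. The `parse_fragment` helper, its regular expression, and the example URI are assumptions for illustration only; they are not the syntax defined in the proposal.

```python
import re
from urllib.parse import urldefrag

# Example URI reference (hypothetical).
ref = "http://example.org/ns/doc#section-2"

# urldefrag separates the fragment without resolving the resource,
# so no media type is needed to perform the split.
base, fragment = urldefrag(ref)

# Assumed convention: a fragment is either a bare name or "scheme(data)".
_SCHEME_FORM = re.compile(r"^([A-Za-z_][\w.-]*)\((.*)\)$")


def parse_fragment(frag: str):
    """Return (scheme, data); scheme is None for a bare-name fragment."""
    match = _SCHEME_FORM.match(frag)
    if match:
        return match.group(1), match.group(2)
    return None, frag


bare = parse_fragment(fragment)
schemed = parse_fragment("xpointer(/a/b)")
```

The split itself is purely syntactic; how a user agent interprets the resulting scheme and data is deliberately left out, in line with the abstract's statement that processing is outside the specification's scope.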

  • [May 22, 2001] "Augmenting UML with Fact-orientation." By Terry Halpin. Published in the Proceedings of the Hawai'i International Conference on System Sciences 2001, Section "Unified Modeling Language: A Critical Review and Suggested Future," [HICSS-34], January 3-6, 2001, Outrigger Wailea Resort, Island of Maui. "The Unified Modeling Language (UML) is more useful for object-oriented code design than conceptual information analysis. Its process-centric use-cases provide an inadequate basis for specifying class diagrams, and its graphical language is incomplete, inconsistent and unnecessarily complex. For example, multiplicity constraints on n-ary associations are problematic, the constraint primitives are weak and unorthogonal, and the graphical language impedes verbalization and multiple instantiation for model validation. This paper shows how to compensate for these defects by augmenting UML with concepts and techniques from the Object Role Modeling (ORM) approach. It exploits 'data use cases' to seed the data model, using verbalization of facts and rules with positive and negative examples to facilitate validation of business rules, and compares rule visualizations in UML and ORM. Three possible approaches are suggested: use ORM for conceptual analysis then map to UML; supplement UML with population diagrams and user-defined constraints; enhance the UML metamodel... The UML notation includes the following kinds of diagram for modeling different perspectives of an application: use case diagrams, class diagrams, object diagrams, statecharts, activity diagrams, sequence diagrams, collaboration diagrams, component diagrams and deployment diagrams. This paper focuses on conceptual data modeling, so considers only the static structure (class and object) diagrams. Class diagrams are used for the data model, and object diagrams for data populations. 
Although not yet widely used for designing database applications, UML class diagrams effectively provide an extended Entity-Relationship (ER) notation that can be annotated with database constructs (e.g., key declarations)... This paper identifies several weaknesses in the UML graphical language and discusses how fact-orientation can augment the object-oriented approach of UML. It shows how verbalization of facts and rules, with positive and negative examples, facilitates validation of business rules, and compares rule visualizations in UML and ORM on the basis of specified modeling language criteria... [Conclusion:] Fact-orientation, as exemplified by ORM, provides many advantages for conceptual data analysis, including expressibility, validation by verbalization and population at both fact and constraint levels, and semantic stability (e.g., avoiding changes caused by attributes evolving into associations). ORM also has a mature formal foundation that may be used to refine the semantics of UML. Object-orientation, as exemplified by UML, provides several advantages such as compactness, and the ability to drill down to detailed implementation levels for object-oriented code. If UML is to be used for conceptual analysis of data, some ORM features can be adapted for use in UML either as heuristic procedures or as reasonably straightforward extensions to the UML metamodel and syntax. These include mixfix verbalizations of associations and constraints for associations, and exploitation of data use cases by populating associations with tables of sample data using role names for the column headers. However there are some fundamental aspects that need drastic surgery to the semantics and syntax of UML if it is ever to cater adequately for non-binary associations and some commonly encountered business rules. This paper revealed some serious problems with multiplicity constraints on n-ary associations, especially concerning non-zero minimum multiplicities. 
For example, they cannot be used in general to capture mandatory and minimum occurrence frequency constraints on even single roles within n-aries, much less role combinations. Moreover, UML's treatment of set-comparison constraints is defective. Although it is possible to fix these problems by changing UML's metamodel to be closer to ORM's, such a drastic change to the metamodel may well be ruled out for pragmatic reasons (e.g., maintaining backward compatibility and getting the changes approved). In contrast to UML, ORM has only a small set of orthogonal concepts that are easily mastered. UML modelers willing to learn ORM can get the best of both approaches by using ORM as a front-end to their data analysis and then mapping the ORM models to UML, where the additional constraints can be captured in notes or textual constraints. Automatic transformation between ORM and UML is feasible, and is currently being researched." See: "Conceptual Modeling and Markup Languages." [alt URL, cache]

  • [May 22, 2001] "Chatting in Financial Messages." By Dmeetry Raizman (Maximel.com). May 2001. "IT people are still striving to bring their organizations to the promised land of straight through processing. But, according to this article, real-time chatting takes this idea one step beyond. In the business world today, most electronic messaging is asynchronous - that is, it goes one direction at a time, rather like an old-time telegraph system. Thus, while the transfer of the message itself can be quick, one system cannot talk to another in real-time. It must send a message and then wait for a response before speaking again. Now XML and Java are changing all that. Particularly in the case of financial messages, it will be possible for systems to 'chat' in real time - that is, to speak to one another in much the way that people now converse on the phone or in a group. Presenting a debatable approach to Straight Through Processing (STP), this article exposes the convergence point where the combined strength of Java and XML turns toward the major trend of the financial industry - enabling the metamorphosis of conventional STP to RTC - Real Time Chatting between involved parties. Indeed, technology contributes to the compression of the securities trade settlement cycle from the existing T+3 to T+1 enabling the upcoming drift of the financial industry..." See also: "Straight Through Processing Markup Language (STPML)."

  • [May 21, 2001] "Leveraging the Business Analyst: Object Role Modeling with Visual Studio.NET." From Microsoft. 2001-05-21. "Object role modeling (ORM) provides a conceptual, easy-to-understand method of modeling data. The ORM methodology is based on three core principles: (1) Simplicity. Data is modeled in the most elementary form possible. (2) Communicability. Database structures are documented by using language that can be easily understood by everyone. (3) Accuracy. A correctly normalized schema is created based on the data model. Typically, a modeler develops an information model by gathering requirements from people who are familiar with the application but are not skilled data modelers. The modeler must be able to communicate data structures at a conceptual level in terms that the non-technical business expert can understand. The modeler must also analyze the information in simple units and work with sample populations. ORM is specifically designed to improve this kind of communication. Rules Expression: ORM represents the application world as a set of objects (entities or values) that play roles (parts in relationships). ORM is sometimes called fact-based modeling, because it verbalizes the relevant data as elementary facts. These facts can't be split into smaller facts without losing information... Where ORM describes business facts in terms of simple objects and predicates, entity relationship methodologies describe the world in terms of entities that have attributes and participate in relationships... ORM not only provides a simple, direct way of describing relationships between different objects. From the example, we see that ORM also provides flexibility. Models created using ORM have a greater capacity to adapt to changes in the system than those created using other methodologies. In addition, ORM allows non-technical business experts to talk about the model in terms of sample populations, so they can validate the model using real-world data. 
Because ORM allows reuse of objects, the data model automatically maps to a correctly normalized database schema. The simplicity of the ORM model eases the database querying process. Using an ORM query tool, the user can access the desired data without having to understand the underlying structure of the database... Like any good modeling method, ORM is more than just a notation. It includes various design procedures to help modelers map conceptual and logical models, and to use reverse engineering to switch between those models. ORM models can also be automatically mapped to database schemas for implementation on most popular relational databases..." See (1) the announcement, "Microsoft Unveils Visual Studio.NET Enterprise Tools. Visual Studio.NET Enterprise Architect and Enterprise Developer to Lead Corporations Into New Age of XML Web Services", and (2) "Visual Studio.NET Enterprise Features." Visual Studio.NET is described as supporting "Testing XML Web Services and Applications."

  • [May 21, 2001] "Decryption Transform for XML Signature." 10-May-2001. W3C draft document edited by Takeshi Imamura and Hiroshi Maruyama as part of the W3C XML Encryption Working Group activity. See the note from Joseph Reagle Jr. "This document specifies the 'decryption transform', which enables XML Signature verification even if both signature and encryption operations are performed on an XML document." Status: "This is a proposal being staged for publication and has (as of yet [2001-05-21]) no W3C status or standing." Excerpt: "Since encryption operations applied to part of the signed content after a signature operation cause a signature not to be verifiable, it is necessary to decrypt the portions encrypted after signing before the signature is verified. The 'decryption transform' proposed in this document provides a mechanism: decrypting only signed-then-encrypted portions (and ignoring encrypted-then-signed ones). A signer can insert this transform in a transform sequence (e.g., before Canonical XML or XPath) if there is a possibility that someone will encrypt portions of the signature. The transform defined in this document is intended to propose a resolution to the decryption/verification ordering issue within signed resources. It is out of scope of this document to deal with the cases where the ordering can be derived from the context. For example, when a ds:DigestValue element or a (part of) ds:SignedInfo element is encrypted, the ordering is obvious (without decryption, signature verification is not possible) and there is no need to introduce a new transform..." See: (1) XML Encryption Working Group, and (2) "XML and Encryption."

  • [May 21, 2001] "XML in Enterprise Information Systems." By David Jackson (DSTC, Brisbane, Australia). Version: May 2001. ['Some of you might be interested in this document on the use of XML in enterprise information systems... There are two formats - (1) one big file and (2) a multi-file version. Links to both are found at http://www.dstc.edu.au/Jackson/system/. This is the second public draft of this document (first draft last November). It is a rather high level overview of a large topic and there is still a long way to go. Some sections of it need a lot of work. The section on exchange is especially undeveloped, I think in part due to the fact that this area of exchange and XML is still very much in flux. The document gives few if any answers but asks a lot of questions. There are many gaps in the content reflecting gaps in my understanding. No doubt some of what I've written also reveals gaps in my understanding. I would be glad to have any feedback, ideas or contributions. All contributions will of course be acknowledged. I would also be happy to include references to related work.'] From the initial section: "Building useful systems and knowledge structures in the enterprise is merely difficult. Still, in an enterprise, as opposed to either of the above alternatives, the knowledge domains are limited, and there is some chance of control, however small, over the data and document structures and work practices of the people who work there. Using XML in information systems is still quite new, which means that, except in the simplest cases, we are not even sure of all the problems to be solved in systems which use XML. Even less do we know how best to solve those problems. We may not even be sure what kinds of systems we want to build with XML, or of the kinds of systems that XML makes possible. In other words, new kinds of systems are possible, not just new kinds of technology to build the systems we already know about. 
To come to grips with the new possibilities will require a period of learning and gaining experience... Should we think of XML from a systems perspective, and focus on the benefits of improved IT architectures? Or is XML about information, requiring the development of improved business, information and data models before its benefits can be fully realised?... Enterprises are interested in internal sharing of their knowledge in whatever form, and this means developing the knowledge formats and information systems to do so. The real problem is not agreeing on what we want. After all, wishing is cheap, and who would wish for part when for the same low price they could wish for all? Much harder is the design and implementation of the systems, and designing and using the information resources. However, when it comes to fitting XML into this picture we often do not really know what we are doing, beyond some simple cases, generally involving data exchange between applications..."

  • [May 21, 2001] "XML: Deriving Applications from Information Web Publishing with XML Part II. [Tools of the Trade.]" By Wes Biggs. In Online Journalism Review (May 17, 2001). ['To fully capture the value of XML-based data architectures, we need software applications that act on XML data to do something useful, like publish a Web site.'] "If a given XML document in our system represents a single article or story, the most straightforward application for that document would be one that builds a Web page from it that can be displayed by a browser. While modern browsers like Internet Explorer 5.5 claim to be 'XML-enabled', an XML document does not detail how it should be displayed in the same way a Web-native HTML document does. It's possible to view a straight XML document in the browser, but it looks just about as awful as the text-only view. In order to build a useful Web site in real life, we need to prescribe a uniform method of translating our XML documents to HTML. Database-driven applications typically use a template file to describe the breakdown between static and dynamic content on a page. The most popular technique for translating XML to HTML takes the same basic approach, utilizing a "stylesheet" as the template. Stylesheets look a bit like HTML files, but include specialized tags that describe where to place data found in an XML document. An XSL (XML Stylesheet Language) processor is a generic software application that takes an XML document (data) and an XSL stylesheet (rules for transforming data), and generates another document. Many software programmers swear by an architecture called MVC, for Model/View/Controller. XSL's application to XML documents follows this design: an XML document models the data; the XSL processor and stylesheet serve as the controller, and the view is the generated document created by this process. By changing the controller, we can change the browser view without having to change the model XML document... 
To summarize this article, XML is an excellent concept for providing a layer of abstraction between the editorial version of your content and the form seen by the reader. Technologies like XSL can ease the pain of distributing to multiple platforms and varying devices, and can make site redesigns a less painful experience all around. And knowledgeable techies in almost any environment can find resources to put together and integrate an XML system or subsystem for a Web site." See also Part 1.
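
The separation the article describes, an XML document as the model and a swappable transformation producing the HTML view, can be sketched in a few lines. Python's standard library does not include an XSLT processor, so the sketch below substitutes a hand-written rendering function for the stylesheet; it is a stand-in for the idea, not XSL itself. The element names (`article`, `headline`, `byline`, `p`) and the `render_html` function are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical article document; the element names are assumptions.
ARTICLE_XML = """<article>
  <headline>Local Team Wins</headline>
  <byline>Pat Reporter</byline>
  <p>First paragraph.</p>
  <p>Second &amp; final paragraph.</p>
</article>"""


def render_html(xml_text: str) -> str:
    """Plays the role of the stylesheet: maps article data into an HTML page."""
    root = ET.fromstring(xml_text)
    headline = root.findtext("headline", default="")
    byline = root.findtext("byline", default="")
    # Each <p> in the source becomes an escaped <p> in the output.
    body = "\n".join(
        f"<p>{escape(p.text or '')}</p>" for p in root.findall("p")
    )
    return (
        f"<html><head><title>{escape(headline)}</title></head>\n"
        f"<body>\n<h1>{escape(headline)}</h1>\n"
        f'<p class="byline">{escape(byline)}</p>\n'
        f"{body}\n</body></html>"
    )


page = render_html(ARTICLE_XML)
```

Replacing `render_html` with a different function (for example, one producing a text-only or mobile layout) changes the view without touching `ARTICLE_XML`, which mirrors the article's point about changing the presentation while leaving the model document alone.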

  • [May 21, 2001] "Web Publishing with XML. Part I: Defining the definitions." By Wes Biggs. In Online Journalism Review (May 01, 2001). "Online media conferences are rife with talk of XML. Industry pundits proclaim how well it slices, dices and tenderizes your cherished Web sites. The term adorns headlines in all the weekly rags on the boss's desk, but no one can figure out how to translate the gloss into something of substance for your online presence. It seems to be everything to everyone -- but what is it? In this series, we'll take a look at some practical applications of XML-based technologies that have the potential, if executed correctly, to simplify the process of bringing content online... a number of groups have taken on the work of creating XML tag languages that describe news media documents. The frontrunners are two standards proposed by the International Press Telecommunications Council: NITF, the News Industry Text Format, and the medium-agnostic NewsML. In addition to the IPTC's own push, media organizations like the American Press Institute have officially recommended NITF in place of their own competing proposals, which included other acronym-heavy standards like NMF and XMLNews... While XML is not a replacement for the data integrity and searching capabilities of databases, it provides a means of conceptualizing data that is easily understood by software, that doesn't have to know the proprietary details about a company's mainframe or the way the database engine operates. Another benefit of the use of an XML standard like NITF is data interchange. If you have software tools that know how to process a NITF-formatted document, it doesn't matter whether the document itself came from your in-house reporting staff, your database, or straight off a wire service that happened to deliver its feed in NITF form. 
And even if NITF isn't the native language of a given service, news aggregator sites like ScreamingMedia can help bridge the gap by turning all kinds of source feeds into compliant NITF documents... With XML, you can automate the population of templates in the same way a mail merge program combines a form letter with a list of recipients. One of the most straightforward returns on an XML investment is often a reduction in the amount of time and manual labor it takes to ready an article for Web publication..."

  • [May 21, 2001] "The Model Of Object Primitives: Representation of Object Structures based on State Primitives and Behaviour Policies." By Nektarios Georgalas (BT Adastral Park, B54/Rm125, Martlesham Heath, IPSWICH, IP5 3RE). In Succeeding with Object Databases: A Practical Look at Today's Implementations with Java and XML, edited by Roberto Zicari and Akmal Chaudhri. John Wiley and Sons Publishers, 2000. ISBN 0471383848. "In contemporary business environments, different problems and a variety of diverse requirements compel designers to adopt numerous modelling methodologies that use semantics customised to suit ad-hoc needs. This fact hinders the unanimous acceptance of one modelling paradigm and lays the ground for the adoption of customised versions of some. Based on this principle, we devised and present the Model of Object Primitives which aims at providing a minimum as well as generic set of semantics without compromising expressive capability. It is a class-based model that accommodates the representation of static and dynamic characteristics, i.e., state and behaviour, of objects acting within the problem domain. It uses classes and policies to represent object state and behaviour and collections to collate them into complex structures. In this paper we introduce MOP and provide an insight into its semantics. We examine three case studies that use MOP to represent the XML, ODMG and Relational data-models and also schemata which are defined in these models. Subsequently, another two case studies illustrate practically how MOP can be used in distributed software environments to customise the behaviour or construct new components based on the powerful tool of behaviour policies... MOP, the Model of Object Primitives, is a class-based model that aims at analysing and modelling objects using their primitive constituents, i.e., state and behaviour. MOP contributes major advantages: (1) Minimal and rich. 
The semantics set includes only five basic representation mechanisms, namely, state, behaviour, collection, relationship and constraint. These suffice for the construction of highly complex schemata. MOP, additionally, incorporates policies through which it can express dynamic behaviour characteristics. (2) Cumulative and expandable concepts. The aggregation mechanism introduced by the Collection Class allows for the specification of classes that can incrementally expand. Since a class can participate in many collections, we can create Collections where one contains the other such that a complex structure is gradually produced including state as well as behaviour constituents. (3) Reusable concepts. A MOPClass can be included into more than one Collections. Therefore, concepts modelled in MOP are reusable. As such, any behaviour that is modelled as a combination of a Behaviour Class and a MOP Policy can be reusable. This provides for the usage of MOP in modelling software components. Reusability is a principal feature of components. (4) Extensible and customisable. MOP can be extended to support more semantics. Associating its main modelling constructs with constraints, more specialised representation mechanisms can be produced. (5) Use of graphs to represent MOP schemata and policies. MOP classes and relationships within a schema could potentially be visualised as nodes and edges of a graph. MOP policies are described to be graph-based as well. This provides for the development of CASE tools, similar to MOPper, which alleviate the design of MOP-based models. It is our belief that MOP can play a primary role in the integration of heterogeneous information systems both conceptually and practically. Conceptually, MOP performs as the connecting means for a variety of information models and can facilitate transformations among them. It was not within the paper's scope to study a formal framework for model transformations. 
However, the XML, ODMG and Relational data-model case studies give good evidence that MOP can be efficiently used to represent semantics of diverse modelling languages. This is a necessary condition before moving onto studying model transformations. Practically, MOP provides effective mechanisms to manage resources within a distributed software environment. Both practical case studies presented above show that MOP can assist in the construction of new components or in customising the behaviour of existing components. This is because MOP aids the representation and, therefore, the manipulation of context resources, state or behaviour, in a primitive form. Moreover, the adoption of policies as the means to describe the dynamic aspects of component behaviour, enhances MOP's role. Consequently, it is our overall conclusion that the Model of Object Primitives constitutes a useful paradigm capable of delivering efficient solutions in the worlds of data modelling and distributed information systems." See especially Section 4.1, "XML in MOP." Related references: "Conceptual Modeling and Markup Languages."

  • [May 19, 2001] "Standardizing XML Rules." By Benjamin N. Grosof (MIT Sloan School of Management, Cambridge, MA, USA. Email: bgrosof@mit.edu or grosof@cs.stanford.edu). Invited paper for the IJCAI 2001 Workshop on E-Business and the Intelligent Web [August 5 2001], part of the Seventeenth International Joint Conference on Artificial Intelligence. ['The author provides an overview of current efforts to standardize rules knowledge representation in XML, with special focus on the design approach and criteria of RuleML, an emerging standard. With Harold Boley of DFKI (Germany) and Said Tabet of Nisus Inc. (USA), Benjamin N. Grosof leads an early-phase standards effort on a markup language for exchange of rules in XML, called RuleML (Rule Markup Language); the goal of this effort is eventual adoption as a Web standard, e.g., via the World Wide Web Consortium'] "RuleML is, at its heart, an XML syntax for rule knowledge representation (KR), that is inter-operable among major commercial rule systems. It is especially oriented towards four commercially important families of rule systems: SQL (relational database), Prolog, production rules (cf. OPS5, CLIPS, Jess) and Event-Condition-Action rules (ECA). These kinds of rules today are especially found embedded in Object-Oriented (OO) systems, and are often used for business process connectors / workflow. These four families of rule systems all have common core abstraction: declarative logic programs (LP). 'Declarative' here means in the sense of KR theory. Note that this supports both backward inferencing and forward inferencing. RuleML is actually a family (lattice) of rule KR expressive classes: each with a DTD (syntax) and an associated KR semantics (KRsem). These expressive classes form a generalization hierarchy (lattice). The KRsem specifies what set of conclusions are sanctioned for any given set of premises. Being able to define an XML syntax is relatively straightforward. 
Crucial is the semantics (KRsem) and the choice of expressive features. The motivation to have syntax for several different expressive classes, rather than for one most general expressive class, is that: precision facilitates and maximizes effective interoperability, given heterogeneity of the rule systems/applications that are exchanging rules. The kernel representation in RuleML is: Horn declarative logic programs. Extensions to this representation are defined for several additional expressive features: (1) negation: negation-as-failure and classical negation; (2) prioritized conflict handling: e.g., cf. courteous logic programs; (3) disciplined procedural attachments for queries and actions: e.g., cf. situated logic programs; (4) equivalences, equations, and rewriting; (5) and other features as well. In addition, RuleML defines some useful expressive restrictions (e.g., Datalog, facts-only, binary-relations-only), not only expressive generalizations... In January 2001, we released a first public version of a family of DTDs for several flavors of rules in RuleML. This was presented at the W3C's Technical Plenary Meeting held February 26 to March 2, 2001. Especially since then, RuleML has attracted a considerable degree of interest in the R&D community. Meanwhile, the design has been evolving to further versions." See: "Rule Markup Language (RuleML)." [cache]
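
To make the notion of "what set of conclusions are sanctioned for any given set of premises" concrete, the following sketch runs forward inferencing over a tiny set of ground (variable-free) Horn rules. It is not RuleML syntax and is not drawn from the RuleML DTDs; the `forward_chain` function, the tuple encoding of rules, and the business-rule names are assumptions chosen for illustration.

```python
from typing import Iterable, Set, Tuple

# Assumed encoding: a Horn rule is (body atoms, head atom), all ground.
Rule = Tuple[Tuple[str, ...], str]


def forward_chain(facts: Iterable[str], rules: Iterable[Rule]) -> Set[str]:
    """Apply rules repeatedly until no new conclusion can be derived."""
    rules = list(rules)
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            # A rule fires when every body atom is already derived.
            if head not in derived and all(atom in derived for atom in body):
                derived.add(head)
                changed = True
    return derived


# Hypothetical business rules.
rules = [
    (("premium_customer", "large_order"), "discount"),
    (("discount",), "notify_sales"),
]
facts = {"premium_customer", "large_order"}
conclusions = forward_chain(facts, rules)
```

The result is the least fixed point of the rules over the given facts. A backward-inferencing engine answering the query `notify_sales` would be expected to sanction the same conclusion, which is the sense in which a declarative semantics is independent of inference direction.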

  • [May 18, 2001] "ebXML: It Ain't Over 'til it's Over." By Alan Kotok. From XML.com. May 16, 2001. ['The final meeting of the Electronic Business XML initiative in Vienna marked the 18-month deadline set for the project, yet there is still plenty left to do.'] "At the Electronic Business XML (ebXML) meeting in Vienna, Austria, 7-11 May 2001, the 150 participants approved the specifications and technical reports defining the ebXML technical architecture. The group also held its most complete proof-of-concept demonstrations at the midpoint of the meeting. But the session ended with ebXML's most promising features, interoperable business semantics, still incomplete. This meeting marked the end of ebXML's 18-month self-imposed deadline that began in November 1999, and the topic of ebXML's future direction took up much of the participants' time and energy. Until this meeting, the ebXML leadership put off any serious discussion of its post-Vienna future. In Vienna, participants got their first chance to see ebXML's new incarnation. EbXML is a joint initiative of Organization for the Advancement of Structured Information Standards (OASIS) and the UN's Centre for Trade Facilitation and Electronic Business (UN/CEFACT). Its goal is to develop a set of specifications to allow any business of any size in any industry to do business with any other entity in any other industry anywhere in the world. The group's work has focused particularly on making e-business possible for smaller companies, generally left out of electronic data interchange (EDI) in the past. In this partnership, OASIS brings XML knowledge and experience, while UN/CEFACT, the group that developed and manages the UN/EDIFACT EDI standard, offers the business expertise... 
At the opening general session, Ray Walker, of UN/CEFACT and one of the ebXML executives, said his organization and OASIS would divide up management of the technical teams, where UN/CEFACT would continue the work on business content and OASIS would handle further work on infrastructure. The groups would create a coordination committee to jointly publish the approved specifications, with further details released during the week...OASIS and UN/CEFACT, according to the new agreement, will jointly publish the ebXML documents, including specifications, technical reports, white papers, and reference documents. In response to an audience question, Walker said that the ebXML site would continue 'for the moment' to provide one source for all ebXML documentation. The Vienna meeting also provided a discussion of future implementation strategies for ebXML. At a briefing for visitors to the meeting, Jon Bosak of Sun Microsystems, and former co-chair of the W3C's XML working group, laid out a three-stage process for businesses to implement ebXML. (1) Standard infrastructure: ...as a result of its current work, ebXML can immediately offer a package of standard message structures, registries and repositories, company e-business profiles, and trading partner agreements. It would allow even the smallest businesses to send ebXML-compliant messages by e-mail. Larger companies can also take part in ebXML when it suits their purposes, for example, as a complement to their EDI transactions. Registries can start providing the message specifications, industry vocabularies, and profiles of potential trading partners. (2) Standard electronic messages: provide standardized messages defined by individuals or organizations. The standard messages would encourage the development of off-the-shelf software solutions and begin the process of replacing paper documents with electronic counterparts. 
By 2003, Bosak expects repositories to store and registries to index business process models and standard messages, with the models using UML, DTDs, or prose representations. (3) Single standard semantic framework: a standard electronic semantic framework would automatically generate standard schemas and messages. Business models would represent complete top-down analysis and allow for dynamic modification as new business relationships emerged..." See the Vienna announcement and the main entry, "Electronic Business XML Initiative (ebXML)."

  • [May 18, 2001] "XML Technologies: A Success Story." By J. David Eisenberg. From XML.com. May 16, 2001. ['XML's not just about big business. Read how XML technologies XSL-FO and SVG helped improve this year's California Central Coast Section High School wrestling tournament.'] "We've all heard stories of how new XML technologies have helped build immense corporate databases and complex, dynamic web sites. Well, this isn't one of those stories. This story is about how the Apache Software Foundation's XML tools helped improve this year's California Central Coast Section High School wrestling tournament... Since I'm using Linux and the CCS uses Windows, I needed a cross-platform solution. Adobe PDF format was the answer, and this is where Scalable Vector Graphics (SVG) and Formatting Objects to PDF (FOP) enter the story. Creating the Bout Sheet: The bout sheet is not a typical text document; it's mostly a set of lines, empty boxes and a circle with minimal text labeling. Thus, I decided to use Scalable Vector Graphics (SVG) to describe the form, and use FOP as a wrapper to produce the desired PDF output. I took a ruler and an old bout sheet, redrew the lines, and measured the widths and locations of the boxes and text, and created the formatting objects XML file by hand... So after all that work and trouble, was it worth it? Yes. It took me less time to produce the bout sheet with SVG than it would have taken to find a Windows machine, learn to use a drawing program, and produce a file that would have been in a proprietary format. The bracket printout was also worthwhile, mostly as a learning exercise and also as a proof of concept. The PDF output also looks better than the RTF. Again, there was a time savings; it was easier for me to learn the syntax for formatting objects than it would have been for me to learn the RTF to produce an equally good-looking result in that format. 
Finally, the fact that I was able to accomplish all of these tasks with open source software is the icing on the cake." See: "W3C Scalable Vector Graphics (SVG)."

  • [May 18, 2001] "Perl XML Quickstart: The Standard XML Interfaces. [Tutorial.]" By Kip Hampton. From XML.com. May 16, 2001. "This is the second part in a series of articles meant to quickly introduce some of the more popular Perl XML modules. This month we look at the Perl implementations of the standard XML APIs: The Document Object Model, The XPath language, and the Simple API for XML. As stated in part one, this series is not concerned with comparing the relative merits of the various XML modules. My only goal is to provide enough sample code to help you decide for yourself which module or approach is most appropriate for your situation by showing you how to achieve the same result with each module given two simple tasks. Those tasks are 1) extracting data from an XML document and 2) producing an XML document from a Perl hash... Up to this point each module we've looked at shares the common goal of providing a generic interface to the contents of any well-formed XML document. Next month we will depart from this pattern a bit by exploring some of the modules that, while perhaps less generically useful, seek to simplify the execution of some specific XML-related task..." See: "XML and Perl."
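
A rough Python analogue of the two tasks named in Hampton's tutorial (extracting data from an XML document, and producing an XML document from a hash) may help readers who do not use Perl. This is only a sketch and not the article's code: the sample document, element names, and function names are invented for illustration, and Python's standard `xml.etree.ElementTree` and `xml.sax` modules stand in for the Perl DOM, XPath, and SAX modules.

```python
import xml.etree.ElementTree as ET
import xml.sax

# Hypothetical sample document; not taken from the tutorial.
SAMPLE = """<camelids>
  <species name="Camelus dromedarius"><common-name>Dromedary</common-name></species>
  <species name="Lama glama"><common-name>Llama</common-name></species>
</camelids>"""


def extract_tree(xml_text):
    """Task 1, tree style: parse everything, then select with a path expression."""
    root = ET.fromstring(xml_text)
    return {s.get("name"): s.findtext("common-name")
            for s in root.findall("./species")}


class NameHandler(xml.sax.ContentHandler):
    """Task 1, event style: collect the same data from SAX callbacks."""

    def __init__(self):
        super().__init__()
        self.result = {}
        self._current = None
        self._in_common = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "species":
            self._current = attrs["name"]
        elif name == "common-name":
            self._in_common = True
            self._buf = []

    def characters(self, content):
        # Text may arrive in several chunks, so buffer it.
        if self._in_common:
            self._buf.append(content)

    def endElement(self, name):
        if name == "common-name":
            self.result[self._current] = "".join(self._buf)
            self._in_common = False


def extract_sax(xml_text):
    handler = NameHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.result


def build_document(data):
    """Task 2: produce an XML document from a dict (Python's hash)."""
    root = ET.Element("camelids")
    for latin, common in data.items():
        species = ET.SubElement(root, "species", name=latin)
        ET.SubElement(species, "common-name").text = common
    return ET.tostring(root, encoding="unicode")
```

Both extraction functions return the same dictionary for the same input; the tree version is shorter, while the event version never holds the whole document in memory.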

  • [May 17, 2001] "Working with XML: The Java API for XML Parsing (JAXP) Tutorial." By Eric Armstrong. Updates for May 16, 2001. "XML Tutorial Update: The XML overview section is now complete. In particular, the descriptions of the XML standards initiatives and the JAXP APIs have been rewritten, and are worth a cursory review. The rewritten pages include (1) 'XML and Related Specs: Digesting the Alphabet Soup' and (2) 'An Overview of the APIs'." See also the main web site for Java APIs for XML Processing (JAXP), and local references in "Java API for XML Parsing (JAXP)."

  • [May 16, 2001] "The Hype Stuff. [Web Technology: XML.]" By Scott Berinato. In CIO Magazine (May 15, 2001). ['Will XML be the ultimate platform? Or will it be the next EDI? Discover how companies are using XML to create business solutions today. Learn what CIOs must do to maintain the value and openness of XML.'] "'I hear it's going to cure cancer,' says Tim Bray, XML's cocreator. 'It's going to do my dishes, I hear,' says Anne-Marie Keane, Staples' vice president of B2B e-commerce. Behind the flip jokes lies XML -- a syntax that underpins a growing list of more than 300 nascent data standards. MathML, for instance, will make it possible to manipulate advanced mathematical equations on a webpage. Spacecraft Markup Language standardizes databases that operate telemetry and mission control. And then there's MeatXML, a comical name for a serious effort to create a universal meat and poultry supply chain standard. With XML going in so many directions at once, you can't blame CIOs for being confused. The hyperbole often makes XML sound like a salve for all pain. Even worse, the vendor hype is overwhelming. CommerceOne, for example, boasts that British Telecommunications will cut purchase order processing costs by 90 percent using XML-based procurement. Software and service provider JetForm claims developers can write programs in days that would have taken months without XML. Finding the truth behind the tales takes some digging. Technologically, XML is a giant leap for IT. It can drastically reduce development time while making data transfer over the Internet simple. If nurtured properly, it may even become the ASCII text of online business -- ubiquitous and assumed. Or it could become the next EDI, fractured under the pressure of vendor self-interest. 
One thing is certain: For XML to reach its full potential, CIOs will have to take an active role in forcing their partners, their vendors and even their competitors toward a radically more open computing model than what exists today... XML to work, each in one of the three areas most agree the technology will first permeate: (1) Business-to-business data sharing, where Alistair Duncan of Visa International has built his own XML vocabulary for sharing corporate credit card information. [Visa XML Invoice Specification] (2) Content management, where Gary Guilland of Safeco is using XML but is hardly ready to coronate it as the future of computing. [XFA - XML forms] (3) Application-to-application integration, where Steve Morin of TAC Worldwide sees XML revolutionizing the human resources industry -- if 90 competing vendors can agree to cooperate. [HR-XML]."

  • [May 16, 2001] "Export a Word Document to XML." By Kevin McDowell (Microsoft Corporation). From MSDN Online Library. May 2001. ['This solution allows you to export a Word document to an XML file. Microsoft Word 2000.'] "Converting any data to XML requires parsing the data and tagging it with descriptors. Within a Word document, text and hyperlinks are already tagged by their formatting. Most documents contain multiple structural elements, such as headings, bylines, footnotes, and quotations. All types of formatting can be applied to indicate what the elements are. For example, most headings are not the same size, weight, or even font as paragraph level text. Within a Word document, you alter text by one of two methods: by applying a style or by applying formatting manually. A style in Word is nothing more than a named set of specific instructions describing the formatting to apply. When you apply a style, you are basically tagging that text as something: a heading, a subheading, a code block, a quotation, or some other document element. When you apply formatting manually, you are tagging that text as something special, but that something is not defined. If you were to attempt to parse the document by formatting, you would know how the text appears in the document, but you wouldn't know what the text is. However, if you only apply formatting using styles, when you parse the document, not only do you know how the text appears in the document, but also you have a style name to describe what the text is. Creating a document in this manner requires that you know what your formatting represents. Instead of making text bold for emphasis, you apply a style that not only bolds the text but is descriptive of why the text is bold to begin with... After you author a document by using styles and then convert it to XML, it becomes a queryable data source. If you have a folder of XML documents, it is essentially a database. 
Using the FileSystemObject object in the Microsoft Scripting Runtime object model to loop through all of the files in the folder, you could apply an Extensible Stylesheet Language (XSL) query to pull out all of the headings, author information, quotations, or whatever you want, from each of the XML articles. Conclusion: This solution provides a starting point to build an XML parser for Word documents. In addition to the XML functionality, it discusses how to build custom objects to handle sequential instances of all styles and graphics and how to loop through tables and lists. Remember, documents shouldn't be converted to XML merely for the sake of putting them in XML. The best document to convert to XML is one that makes use of styles and will be reused in other ways." Available online: sample download.
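
The central idea in McDowell's article, that a style name can serve as the XML tag for the text it is applied to, can be illustrated without Word at all. The Python sketch below is an assumption-laden stand-in: it does not use the Word object model or the article's sample code, the list of `(style name, text)` pairs is a hypothetical substitute for a document's paragraphs, and the function names are invented.

```python
import re
import xml.etree.ElementTree as ET


def style_to_tag(style_name):
    """Turn a style name such as 'Heading 1' into a usable XML element name."""
    tag = re.sub(r"\W+", "", style_name)
    # XML names may not be empty or start with a digit.
    if not tag or tag[0].isdigit():
        tag = "_" + tag
    return tag


def paragraphs_to_xml(paragraphs):
    """Wrap each paragraph in an element named after its style."""
    root = ET.Element("document")
    for style, text in paragraphs:
        ET.SubElement(root, style_to_tag(style)).text = text
    return root


def texts_with_style(root, style_name):
    """Query the converted document: all text that carried a given style."""
    return [el.text for el in root.iter(style_to_tag(style_name))]
```

Once several documents have been converted this way, the same query function can be run over each file in a folder, which is the "folder as database" use the article describes.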

  • [May 16, 2001] XML for Analysis SDK. From MSDN. ['Download the SDK that provides for universal data access to analytical data sources residing over the Web, without the need to deploy a client component that exposes COM interfaces.'] "XML for Analysis SDK: msxainst.exe is a self-extracting download that contains the Microsoft XML for Analysis provider and sample client applications. The Microsoft XML for Analysis Provider supports data access to analytical data sources (OLAP and data mining) residing on the Web. This provider implements the XML for Analysis Specification, which provides for universal data access to analytical data sources residing over the Web, without the need to deploy a client component that exposes COM interfaces. The Microsoft Analysis Services server can be accessed with the provided download, from the web, without any COM components on the client..." See references in "XML for Analysis."

  • [May 16, 2001] "A simple SOAP client. A general-purpose Java SOAP client." By Bob DuCharme (VP of Corporate Documentation, UDICo). From IBM developerWorks. May 2001. ['Bob DuCharme introduces a simple, general purpose SOAP client (in Java) that uses no specialized SOAP libraries.'] "This article describes a simple, general purpose SOAP client in Java that uses no specialized SOAP libraries. Instead of creating the SOAP request XML document for you under the hood, this client lets you create your own request with any XML editor (or text editor). Instead of merely giving you the remote method's return values, the client shows you the actual SOAP response XML document. The short Java program shows exactly what SOAP is all about: opening up an HTTP connection, sending the appropriate XML to invoke a remote method, and then reading the XML response returned by the server. SOAP, the Simple Object Access Protocol, is an evolving W3C standard developed by representatives of IBM, Microsoft, DevelopMentor, and UserLand Software for the exchange of information over a network. As more SOAP servers become publicly available on the Web, SOAP is doing for programs written in nearly any language -- even short little programs written in popular, simple languages like Visual Basic, JavaScript, and perl -- what HTML does for Web browsers: It gives them a simple way to take advantage of an increasing number of information sources becoming available over the Web. Like HTML, SOAP provides a set of tags to indicate the roles of different pieces of information being sent over the Web using the HTTP transport protocol (and since SOAP 1.1, SMTP as well). SOAP, however, gives you much more power than HTML. With SOAP, your program sends a 'SOAP request' (a short XML document that describes a method to invoke on a remote machine and any parameters to pass to it) to a SOAP server. 
The SOAP server will try to execute that method with those parameters and send a SOAP response back to your program. The response is either the result of the execution or the appropriate error message. Public SOAP servers are available to provide stock prices, the latest currency conversion rates, FedEx package tracking information, solutions to algebraic expressions, and all kinds of information to any SOAP client that asks. Before SOAP existed, programs trying to use this kind of information had to pull down Web pages and 'scrape' the HTML to look for the appropriate text. A visual redesign of those Web pages (for example, putting the current stock price in a table's third column instead of its second column) was all it took to render these programs useless. The SOAP spec, along with its brief accompanying schemas for SOAP requests and responses, provides the framework for a contract between clients and servers that creates a foundation for much more robust information-gathering tools. There are plenty of SOAP clients available for most popular programming languages..." Article also available in PDF format. See "Simple Object Access Protocol (SOAP)."
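
DuCharme's description of what a SOAP client does (open an HTTP connection, send the request XML, read the response XML) can be sketched in a few lines of Python; his own client is written in Java and is not reproduced here. The envelope namespace below is the SOAP 1.1 one; everything else (function names, the `urn:example-quotes` namespace, the `getQuote` method used in testing) is hypothetical, and `send` is shown for completeness but needs a real endpoint to be of any use.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_request(method, method_ns, params):
    """Build the request document: Envelope > Body > method element > parameters."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{method_ns}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def send(endpoint, soap_action, request_bytes):
    """POST the request over HTTP and return the raw response bytes.

    The endpoint and SOAPAction value are supplied by the caller; none are
    assumed here.
    """
    req = urllib.request.Request(
        endpoint,
        data=request_bytes,
        headers={"Content-Type": "text/xml; charset=utf-8",
                 "SOAPAction": f'"{soap_action}"'},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        # Servers commonly report SOAP faults with an HTTP error status;
        # the fault document is still in the response body.
        return err.read()


def parse_response(response_bytes):
    """Return the children of the first Body element as a dict, or raise on a Fault."""
    envelope = ET.fromstring(response_bytes)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("response has no SOAP Body content")
    fault = body.find(f"{{{SOAP_ENV}}}Fault")
    if fault is not None:
        raise RuntimeError(fault.findtext("faultstring", default="SOAP fault"))
    return {child.tag: child.text for child in body[0]}
```

Because the request is ordinary XML, it could equally be written by hand in a text editor and passed to `send` as bytes, which is the approach the article takes.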

  • [May 16, 2001] "Groups get approval for ebXML specifications. More than 200 IT organizations, companies, and software vendors vote to approve standard." By Margret Johnston. In InfoWorld (May 15, 2001). "The Organization for the Advancement of Structured Information Standards (OASIS), a nonprofit, international consortium that creates interoperable industry specifications based on public standards, and the United Nations Center for Trade Facilitation and Electronic Business (UNCEFACT), announced the news Monday following last week's meeting, which took place in Vienna. EbXML is a modular suite of specifications designed to enable companies of any size and in any country to conduct business over the Internet through the exchange of XML-based messages. It is aimed at facilitating global trade by enabling XML to be used in a consistent manner to exchange business data electronically... OASIS and UNCEFACT joined forces in September 1999 and since then have been working to identify the technical basis on which the global implementation of XML could be standardized. The groups held proof-of-concept demonstrations in several cities around the world, and the Vienna meeting marked the culmination of that effort. The suite of specifications approved in Vienna are the ebXML Technical Architecture, Business Process Specification Schema, Registry Information Model, Registry Services, ebXML Requirements, Message Service, and Collaboration-Protocol Profile and Agreement. Implementations of ebXML already are being announced, and the rate of deployment is expected to accelerate, said Patrick Gannon, chairman of the OASIS board of directors. Gannon cited recent announcements of ebXML integration and support from industry groups, including RosettaNet, a consortium of more than 400 IT and electronics companies. 
RosettaNet plans to integrate support for the ebXML Message Service specification in future releases of RosettaNet's Implementation Framework, the consortium announced in April. The Global Commerce Initiative, which represents manufacturers and retailers of consumer goods, also has decided to base its new Internet protocol standard for trading exchanges and business-to-business communications on ebXML, Gannon said..." See the announcement.

  • [May 15, 2001] "Next-generation e-biz." By James R. Borck [Test Center Managing Analyst]. In InfoWorld (May 11, 2001). ['Web-services-oriented architectures are gearing up to inspire e-business efficiency and dynamic partner integration.'] "... By the second half of 2002, Web services will emerge as the definitive standard for the next phase of global e-business. In the interim, your company should plan a Web services strategy, and your developers should familiarize themselves with Web services frameworks and tools. What exactly are Web services? 'Web services' describes a service-oriented architecture in which self-contained, distributed applications, comprising very specific business functions, are enveloped in XML to facilitate integration intraenterprise and with business partners. Using a global publish and lookup mechanism, these task-specific software services can be described, published, located, and engaged, either directly or programmatically, over the Internet and private networks. With Web services, you, your suppliers, trading partners, and customers will be able to dynamically discover and call one another's published applications and chain them together to automate entire workflow processes. Private exchanges and marketplaces, procurement, billing and payment verification, and legacy application availability all will benefit from the streamlining mechanisms of Web services. But companies won't reap these benefits overnight. Many, still trying to catch up to the adoption of XML, are hard-pressed to devote the necessary resources to break existing applications into discrete Web services components. But what is lost in XML's transactional inefficiency is made up for in the reduction of the complexity inherent in today's e-business systems. 
Development times, and consequently time to market, can be reduced from months to just days, systems can be made easier to maintain, and new revenue streams can be tapped thanks to improved application accessibility. Better still, Web services deliver these benefits not by supplanting the distributed technologies in use but by extending their functionality. Web services suffer from several popular misconceptions. One is that Web services are simply hosted applications. Another is that they are merely a new way of interfacing with another company's software. But Web services do more than merely expose interfaces, for this discounts the impact of run-time discovery and binding (the process of determining how the applications will interface), the key capabilities that separate Web services from other application integration methods. In the evolution of enterprise computing, the generic, object-oriented software components of yesterday yielded to standards such as COM (Component Object Model), CORBA, Enterprise JavaBeans (EJB), and then distributed server-side computing. But each new integration scheme continued to demand tightly coupled, agreed-on preconfigurations for exchanging data requirements that needed addressing at the point of design. If an interface was changed, the system was broken. The next generation of software components will be more loosely coupled, binding applications at execution instead of during development. The process will make interoperability a word of the past. How Web services work: Web services frameworks use XML to envelop applications and facilitate messaging. The original executable can be coded in any language or run on any platform because Web services don't rely on object-model-specific protocols such as DCOM (Distributed Component Object Model), as do other component-based technologies. Services can be developed to offer multiple options for communication, selectable at the point of engagement. 
SOAP (Simple Object Access Protocol) is used to define distributed object communication procedures. The XML-based protocol carries additional instructions on how data should be processed and, like XML, is platform- and transport-neutral. WSDL (Web Services Description Language) provides an abstract for exchange by describing service-specific data including details on the interface, available protocols, and other implementation-specific particulars. And finally, the UDDI (Universal Description, Discovery, and Integration) specification at the repository level provides the indexing and lookup capability for services through DNS data and SOAP-based APIs. UDDI data contains information about businesses and their location, services they offer, billing information, and allowable protocols.... Although developers can see the great potential of Web services, CTOs can't cash in on the promise yet. Definitive standards, security, and QoS (quality of service) issues remain unanswered. Before CTOs can consider adoption seriously, problems of securing end-to-end data transport across multiple services and ensuring transactional committal, guaranteeing nonrepudiation, must be solved. The reality is that attempts at seamless interoperability will take some time..."

  • [May 14, 2001] "XML Schema becomes W3C Recommendation: What This Means. With the approval of the W3C and its 500+ members, XML is ready for the next big step to worldwide deployment." By Natalie Walker Whitlock (Casaflora Communications). From IBM developerWorks. May 2001. ['After more than two years of review and revision, the World Wide Web Consortium (W3C) announced on May 1 that it has embraced the XML Schema with a formal Recommendation. W3C Recommendation status is the final step in the consortium's standards approval process, indicating that the schema is a fully mature, stable standard backed by the 510 W3C member organizations.'] "Speaking at the 10th International World Wide Web Conference in Hong Kong, Web pioneer and W3C Director Tim Berners-Lee said that XML Schema (parts 0, 1 and 2) should now be considered as one of the foundations of XML, together with XML 1.0 and Namespaces in XML. He also stated that the specification provides 'an XML language for defining all XML languages.' The finalized Schema brings rich data descriptions to XML. Schema will solve the primary problem of B2B communication and interoperability that has held XML back from its full potential. The standardized Schema is expected to integrate data exchange across business, and ultimately realize the full promise of XML to facilitate and accelerate electronic business... Schema increases XML's power and utility to the developer by providing better integration with XML Namespaces. By introducing datatypes to XML, Schema makes it easier than ever to define the elements and attributes in a namespace, and to validate documents that use multiple namespaces defined by different schemas. XML Schema also introduces new levels of flexibility intended to speed its adoption for business use. According to [IBM's Noah] Mendelsohn, who also helped write the spec, XML Schema addresses a number of new issues and therefore has features for demanding apps. 
Yet, he says, developers can learn how to use XML Schema to do what they've been doing in XML with DTDs in 'about an hour or two.'... Berners-Lee added that XML Schema would need to be clarified and simplified after the many implementations and unexpected interpretations of the specification. Indeed, the cry of simplification has been one of the loudest heard from critics. The current complexity has been blamed for driving others to create alternative, lighter weight schemas, such as TREX and RELAX. Some have even said XML Schema is so complex that even some W3C insiders are calling for future versions to be incompatible with this first release so they do not repeat what critics say are the flaws of the first version... Despite the controversies, most groups have publicly stated that they will support and incorporate the W3C's XML Schema. These groups include IBM, Microsoft, Sun Microsystems, Commerce One, and Oracle. In a public statement, Oracle said its Oracle9i will be the first production database to implement the new Schema. In addition, both Microsoft's .Net initiative and Sun's SunOne Web services effort will take advantage of XML Schema..." For schema description and references, see "XML Schemas."

  • [May 12, 2001] "XRel: A Path-Based Approach to Storage and Retrieval of XML Documents Using Relational Databases." By Masatoshi Yoshikawa, Toshiyuki Amagasa, Takeyuki Shimura, and Shunsuke Uemura. In ACM Transactions on Internet Technology, Volume 1, Number 1 (June 2001). [Paper accepted for the inaugural issue. Edited by Won Kim.] "This paper describes XRel, a novel approach to storage and retrieval of XML documents using relational databases. In this approach, an XML document is decomposed into nodes based on its tree structure, and stored in relational tables according to the node type, with path information from the root to each node. XRel enables us to store XML documents using a fixed relational schema without any information about DTDs and element types, and also enables us to utilize indices such as the B+-tree and the R-tree supported by database management systems. For the processing of XML queries, we present an algorithm for translating a core subset of XPath expressions into SQL queries. Thus, XRel does not impose any extension of relational databases for storage of XML documents, and query retrieval based on XPath expressions can be realized in terms of a preprocessor for database query language. Finally, we demonstrate the effectiveness of this approach through several experiments using actual XML documents." See related slides in 'WebDB.html'.
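
The path-based idea summarized in the XRel abstract (store each node with its root-to-node path in a fixed relational schema, then turn path queries into SQL) can be tried out with Python's bundled SQLite. The sketch below is a deliberately simplified illustration and not the paper's schema or translation algorithm: it uses a single hypothetical `node` table, ignores attributes and namespaces, supports only `/` and `//` steps with plain element names, and uses SQLite's case-sensitive `GLOB` operator as the pattern match. The `#` written before each step is this sketch's own device to keep the wildcard from matching part of an element name.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET

# One XPath step: '/' or '//' followed by a plain element name.
STEP = re.compile(r"(//?)([A-Za-z_][\w.-]*)")


def store(conn, doc_id, xml_text):
    """Decompose a document into rows of (doc_id, ord, path, value)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node "
        "(doc_id INTEGER, ord INTEGER, path TEXT, value TEXT)"
    )
    rows = []

    def walk(element, parent_path):
        path = f"{parent_path}#/{element.tag}"
        # ord records document order so results can be returned in that order.
        rows.append((doc_id, len(rows), path, (element.text or "").strip()))
        for child in element:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", rows)


def xpath_to_glob(xpath):
    """Translate e.g. '/lib//title' into the GLOB pattern '#/lib#*/title'."""
    pos, pattern = 0, ""
    while pos < len(xpath):
        match = STEP.match(xpath, pos)
        if match is None:
            raise ValueError(f"unsupported path expression: {xpath!r}")
        separator, name = match.groups()
        pattern += ("#/" if separator == "/" else "#*/") + name
        pos = match.end()
    if not pattern:
        raise ValueError("empty path expression")
    return pattern


def query(conn, xpath):
    """Run a path query as plain SQL over the stored paths."""
    sql = "SELECT value FROM node WHERE path GLOB ? ORDER BY doc_id, ord"
    return [row[0] for row in conn.execute(sql, (xpath_to_glob(xpath),))]
```

A short usage example, with an invented document:

```python
conn = sqlite3.connect(":memory:")
store(conn, 1, "<lib><book><title>A</title></book><book><title>B</title></book></lib>")
titles = query(conn, "/lib/book/title")
```

The database itself needs no XML-specific extension; all XML knowledge sits in the loader and in the small translation function, which is the property the abstract emphasizes.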

  • [May 11, 2001] "Identification of Syntactically Similar DTD Elements for Schema Matching." By Hong Su, Sriram Padmanabhan, and Ming-Ling Lo (Computer Science Department, Worcester Polytechnic Institute). Paper presented at the Second International Conference on Web-Age Information Management (WAIM'2001), Xi'an, China, July 2001. 13 pages. "XML Document Type Definition (DTD) enforces the structure of XML documents. XML applications such as data translation, schema integration, and wrapper generation require DTD schema matching as a core procedure. While schema matching usually relies on a human arbiter, we are aiming at an automated system that can give the arbiter a starting point for designing a matching that can best meet the requirements of the given application. We present an approach that identifies the syntactically similar DTD elements that can be potential matching components. We first describe DTD element graph, a data model for the DTD elements. We then define the distance between two DTD element graphs. We introduce the concept of syntactically equivalent and syntactically similar graphs. Then, we describe the algorithm to detect both schema equivalent and similar DTD elements. We have implemented the matching detection algorithm and several heuristics which improve performance. Our experimental results show reasonable precision of the algorithm in terms of recognition of correct matches... We need a mechanism to discover those semantically equivalent or similar components (we say these components match) to help generate an integrated conceptually clean schema. In this paper, we present a mechanism for detecting possible matches between components across DTDs. It consists of four phases: (1) DTDs are modeled as DTD graphs and a series of simplification transformations are performed to normalize the DTDs. (2) Initial matches are set up based on a series of matching criteria. 
(3) The initial matches are propagated by computing the matching likelihood of other component pairs based on their structural properties. (4) A 'best' matching plan is selected from multiple matching plans based on the overall matching likelihood of the component pairs... Conclusion and Future Work: XML based schema matching is likely to play an important role in e-commerce applications which rely on data integration and dynamic data sharing across distributed services (e.g., Emarkets). We have studied the problem of providing an automated tool for initial schema matching among DTDs. Our experimental evaluation using industry standard DTDs shows that these algorithms are effective in identifying element and sub-tree matches. Renaming of elements reduces accuracy of the algorithms. However, this can be improved by using synonym dictionaries or domain-specific ontologies. We will be exploring these approaches as part of future work. Another issue that we would like to address is ambiguity. Ambiguity is likely to be very common when performing schema matching. For example, a one-to-one relationship can be modeled in several ways such as element and sub-element, or element and attribute, or element and an IDREF to another element. We would like to consider all alternatives as we perform the matching algorithm." Note description of Hong Su's research program: "A. Integration and management of semistructured web data. B. Complex model management, especially XML-based: (1) Underlying data publication: Export the data in non-XML data format to XML data format (2) Persist XML data in RDB, ODB databases (3) Evolution of XML. Many applications of database systems come across the problems of manipulation of models. What we mean by model is a complex structure that represents a design artifact, such as relational schema, object model, UML model, XML DTD. 
The manipulation of models involves managing changes in models and transformations of data from one model to another which has to be addressed by the practitioners in schema integration and translation. We are studying how to represent model management in a data model that is able to capture the semantics of models and model mappings. C. Schema evolution in object databases: Schema evolution is one of the fundamental aspects of information and database systems. In the field of object database, we are studying the problems of how to specify complex schema changes, ensure structural consistency, support transparent schema changes that provide continued interoperability to active applications (behavioral consistency)." [cache]

  • [May 11, 2001] "XEM: Managing the Evolution of XML Documents." By Hong Su, Diane Kramer, Kajal T. Claypool, Li Chen, and Elke A. Rundensteiner. Paper presented at the Eleventh International Workshop on Research Issues in Data Engineering (RIDE 2001): Document Management for Data Intensive Business and Scientific Applications [Heidelberg, Germany, April 1-2, 2001, Sponsored by the IEEE Computer Society, Held in conjunction with the 17th International Conference on Data Engineering - ICDE 2001]. [In Paper Session Four: XML Document Versioning and Change Management; Chair: Susanne Boll, University of Vienna, Austria.] In Proceedings Eleventh International Workshop, pages 103-110. "As information on the World Wide Web continues to proliferate at an astounding rate, the Extensible Markup Language (XML) has been emerging as a standard format for data representation on the Web. In many applications, specific document type definitions (DTDs) are designed to enforce a semantically agreed-upon structure of the XML documents for management. However, both the data and the structure of XML documents tend to change over time for a multitude of reasons, including to correct design errors in the DTD, to allow expansion of the application scope over time, or to account for the merging of several businesses into one. However, most of the current software tools that enable the use of XML do not provide explicit support for such data or schema changes. In this vein, we put forth the first solution framework, called XML Evolution Manager (XEM) to manage the evolution of XML. XEM provides a minimal yet complete taxonomy of basic change primitives. 
These primitives, classified as either data changes or schema changes, are consistency-preserving, i.e., for a data change, they ensure that the modified XML document conforms to its DTD both in structure and constraints; and for a schema change, they ensure that the new DTD is a valid DTD and all existing XML documents are transformed also to conform to the modified DTD. We prove the completeness of the taxonomy in terms of DTD transformation. To verify the feasibility of our XEM approach we have implemented a working prototype system using PSE Pro as our backend storage system... XML has become increasingly popular as the data exchange format over the Web. Although XML data is self-describing, most application domains tend to use Document Type Definitions (DTDs) to specify and enforce the structure of XML documents within their systems. DTDs thus assume a similar role as types in programming languages and schemata in database systems. Many systems, such as Oracle 8i, IBM DB2, and Excelon, have recently started to enhance their existing database technologies to accommodate and manage XML data. Many of them assume that the DTD is provided in advance and will not change over the life of the XML documents. They hence utilize the given DTD to construct a fixed relational (or object-relational) schema which then can serve as structure based on which to populate the XML documents that conform to this DTD. However, change is a fundamental aspect of persistent information and data-centric systems. Information over a period of time often needs to be modified to reflect perhaps a change in the real world, a change in the user's requirements, mistakes in the initial design or to allow for incremental maintenance. While these changes are inevitable during the life of an XML repository, most of the current XML management systems unfortunately do not provide enough (if any) support for these changes. Motivating Example of XML Changes. . . 
XML Evolution Manager (XEM) Approach: in this work we propose an XML Evolution Manager (XEM) as a middleware solution that provides uniform XML-centric data and schema evolution facilities. To the best of our knowledge, XEM is the first effort to provide such uniform evolution management for XML documents. In brief, the contributions of our work are: (1) We identify the lack of generic support for change in current XML management systems; (2) We propose a taxonomy of XML evolution primitives that provide a system independent way to specify changes both at the DTD and XML data level; (3) We analyze change semantics and introduce the notion of constraint checking to ensure structural consistency during the evolution; (4) We can show that our proposed change taxonomy is complete; (5) We describe a working XML Evolution Management prototype system we have implemented to verify the feasibility of our approach... System Implementation: To verify the feasibility of our approach, we have implemented the ideas presented in this paper in a prototype system. We have implemented Marrow, a working framework for XML management. In Marrow we use Excelon Inc.'s PSE Pro, a lightweight object database system repository, as the underlying persistent storage system for XML documents. We require that the DTDs with which the incoming XML documents will comply are entered first into the system. PSE Pro's schema repository has been enhanced to not only manage traditional OO schema but also DTD as metadata. The DTD-OO schema mapper generates an OO schema according to the DTD metadata. Then we load the XML documents into the just prepared schema. The mapping and loading details are given in [Kramer]. We implemented all the proposed change primitives. Comparison of the performance of using the primitives to achieve incremental change versus reloading from scratch can be found in [Kramer]... In this paper, we present the first of its kind: a taxonomy of XML evolution operations. 
These primitives assure the consistency of XML documents, both when DTD changes are made and XML documents have to conform to the changes; and also when individual XML documents are changed to ensure that the changed documents still correspond to the specified DTD. We have implemented an XEM prototype system. The performance analysis can be found in [Kramer: D. Kramer. XML Evolution Management, Masters Thesis, Worcester Polytechnic Institute, 2001]." Also in PDF format. [cache PS, cache PDF]
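
The idea of a consistency-preserving schema change described in the abstract above can be illustrated with a minimal Python sketch. It is not XEM's implementation or its actual primitive set: the DTD is simplified to a dictionary mapping each element name to its ordered list of required children, and the names `conforms` and `add_required_child` are hypothetical. The single primitive shown changes the DTD and, in the same step, updates the existing documents so that they still conform.

```python
import xml.etree.ElementTree as ET


def conforms(doc, dtd):
    """True if every element has exactly the children the simplified DTD lists, in order."""
    return all(
        [child.tag for child in elem] == dtd.get(elem.tag, [])
        for elem in doc.iter()
    )


def add_required_child(dtd, docs, parent, child, default_text):
    """Hypothetical schema-change primitive: declare a new required child of `parent`.

    Returns the new DTD and modifies `docs` in place, so that documents which
    conformed to the old DTD conform to the new one afterwards.
    """
    new_dtd = {name: list(children) for name, children in dtd.items()}
    new_dtd[parent] = new_dtd.get(parent, []) + [child]
    new_dtd.setdefault(child, [])
    for doc in docs:
        # Materialize the matches first so that appending does not disturb iteration.
        for elem in list(doc.iter(parent)):
            ET.SubElement(elem, child).text = default_text
    return new_dtd


# Illustrative data only.
dtd = {"book": ["title"], "title": []}
doc = ET.fromstring("<book><title>XML</title></book>")
new_dtd = add_required_child(dtd, [doc], "book", "year", "unknown")
print(ET.tostring(doc, encoding="unicode"))
print(conforms(doc, new_dtd))
```

A real DTD content model also covers optional and repeated children, choices, and attributes, all of which this sketch leaves out.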

  • [May 11, 2001] "Model Management: A Solution to Support Multiple Data Models, Their Mappings and Maintenance." By Kajal T. Claypool, Elke A. Rundensteiner, Xin Zhang, Su Hong, Harumi Kuno, Wang-chien Lee, and Gail Mitchell. In Proceedings of SIGMOD 2001 (Santa Barbara, CA, May 2001). 5 pages. "The growing accessibility of the Internet has brought about phenomenal growth in the publication of data. The data comes from heterogeneous distributed sources -- even within a company, similar data can be stored in a variety of formats. The growth of the Internet has also increased the need for applications that present a unified view of data from multiple sources, and thus the problem of how to integrate heterogeneous data is more important than ever. Although the problem of data exchange and integration has been well-studied for many years, in the era of electronic information exchange, data model impedance presents new critical challenges. Today, for example, we need to: 1) map an XML schema of one Web application to that of another to guide the exchange of XML instances between the applications; 2) map a web page wrapper to a database schema to guide the translation of queries on the schema to the underlying web sites; 3) map a web site's content to its page layout to drive the generation of web pages. Many projects in industry and academia have been and are continuing to struggle with this problem. However they typically tackle an individual slice of the overall problem, and develop and reinvent many (often very similar) tools for their specific domains. In this project, we present a model management system that provides a complete and integrated solution to the large problem, as well as an infrastructure for creating advanced tools and operators. 
Our system is capable of (1) describing different data models such as relational, OO and XML; (2) describing cross-model mappings to map, for example, an XML schema to a relational schema and to drive transformation of XML elements into rows of relational tables; (3) describing inter-model mappings, i.e., restructuring within one model, to map data sources into data warehouse tables; and (4) discovering mappings with the aid of pre-defined maps and additional domain knowledge between two given schemas to aid e-business to communicate with their own individual XML documents. With the aid of this middle-layer management tool, users can now describe the application schemas they are working with. Moreover, they can map from the application schema which may be in the relational model to an equivalent XML schema, can describe an XML, relational or OO view over it or can even discover the mapping between two pre-existing XML Schemas. A map or a discovered map are all represented in the model management system as first-class citizens thereby allowing users to operate and manipulate them. Generic tools and operators can now be built for the map models with a promise to provide an environment for re-use of technology and effort for meta-mappings and modeling of application schemas. As a proof of concept we provide a change management tool for our MM system. This allows modification of existing maps, i.e., maps in the MMS, to reflect schema changes in the source and/or the target irrespective of the type of map (cross-model or inter-model) and the data model of the source and target. Thus, the tool provides maintenance of a map transforming between the source and target schema when either one of them undergoes a schema change... In Section 2, we give a brief overview of our MM system and walk through the steps involved in developing a mapping. Section 3 presents highlights of our system in terms of its features. Section 4 gives a brief look at our plans for demo. ... 
Our MMS prototype has been written in Java JDK 1.2 and uses Oracle8i as the MMS persistent store. We use the Oracle8i triggers to develop the change management tool and utilize the IBM XML Parser 4J to extract XML data using the DOM API. Our examples for the demo focus on cross-model translations from XML to relational and vice versa; and on inter-model re-structuring of the relational model. The MMS incorporates a full range of tools that allow users to describe their application schemas, to re-structure their schemas, map from a schema in one model to a schema in another model, and maintain these maps once they are in place. We will demonstrate: (1) importing an application schema e.g., DTD from the application layer and representing it as data in the meta layer; (2) translating application schema (DTD) to an application schema in a different model (relational model) using several pre-defined maps; (3) re-structuring an application schema (DTD or relational) using pre-defined complex SPJ maps; (4) discovering maps between two application schemas in the same data model (DTD) using pre-defined maps and domain knowledge; (5) editing generated maps using a map editor; (6) generating code for the map to drive the schema translation as well as the data transformation using the specifications given in the map; (7) propagating schema change from source (relational) to target (relational) and vice versa by allowing in-place modification of the map." See also the Database Systems Research Group (DSRG) web site. [cache PS, cache PDF]
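
As a rough illustration of a map treated as first-class data, the following Python sketch drives the transformation of XML elements into relational rows from a plain dictionary and then edits that map to reflect a target-schema change. It is a sketch under stated assumptions, not the MMS itself (which the abstract describes as written in Java on Oracle8i): the map layout, `BOOK_MAP`, `apply_map`, and `rename_column` are hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical cross-model map: a repeating XML element becomes a table,
# and each listed child element becomes a column of that table.
BOOK_MAP = {
    "element": "book",
    "table": "BOOK",
    "columns": {"title": "TITLE", "year": "YEAR"},
}


def apply_map(xml_text, mapping):
    """Transform matching XML elements into rows (dicts keyed by column name)."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter(mapping["element"]):
        # findtext returns None when the child element is absent.
        rows.append({col: elem.findtext(child) for child, col in mapping["columns"].items()})
    return mapping["table"], rows


def rename_column(mapping, old, new):
    """Return a new map reflecting a column rename in the target schema."""
    columns = {child: (new if col == old else col) for child, col in mapping["columns"].items()}
    return {**mapping, "columns": columns}


# Illustrative data only.
SAMPLE = "<lib><book><title>XML</title><year>2001</year></book><book><title>SGML</title></book></lib>"
table, rows = apply_map(SAMPLE, BOOK_MAP)
changed_map = rename_column(BOOK_MAP, "YEAR", "PUB_YEAR")
_, changed_rows = apply_map(SAMPLE, changed_map)
print(table, rows)
print(changed_rows)
```

Because the map is ordinary data, the same `apply_map` routine keeps working after the map is modified, which is the property the change management tool described above relies on.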

  • [May 11, 2001] "Version Management of XML Documents: Copy-Based versus Edit-Based Schemes." By Shu-Yao Chien, Vassilis Tsotras (UCR), and Carlo Zaniolo (UCLA). Paper presented at the Eleventh International Workshop on Research Issues in Data Engineering (RIDE 2001): Document Management for Data Intensive Business and Scientific Applications [Heidelberg, Germany, April 1-2, 2001. Sponsored by the IEEE Computer Society; held in conjunction with the 17th International Conference on Data Engineering - ICDE 2001]. [In Paper Session Four: XML Document Versioning and Change Management; Chair: Susanne Boll, University of Vienna, Austria.] "Managing multiple versions of XML documents and semistructured data represents a problem of growing interest. Traditional version control methods, such as RCS, use edit scripts representing changes in the document to support the incremental reconstruction of different versions. The edit-based approaches have been recently enhanced with a replication scheme called UBCC. UBCC is based on the notion of page usefulness and ensures effective management for multi-version documents in terms of both retrieval and storage cost. These improvements notwithstanding, the edit-based representation suffers from limited generality and flexibility -- e.g., it cannot represent changes such as rearranging the document or duplicating parts of its content. To solve these problems, the paper proposes a copy-based UBCC versioning scheme, which also provides a simpler format for the electronic exchange of multi-version documents. With the objective of matching the performance of the edit-based UBCC technique, we develop algorithms that enhance the copy-based UBCC scheme with page usefulness management. 
We also present results of various experiments that test the storage and retrieval performance of the new copy-based approach, and compare it with that of the edit-based UBCC approach... The problem of managing multiple versions for XML and semistructured documents is of significant interest for content providers and cooperative work. The XML standard is considering this problem at the transport level. The WEBDAV working group is developing a standard extension to HTTP to support version control, meta data and name space management, and overwrite protection. Traditional document version management schemes, such as RCS and SCCS, are line-oriented and suffer from various limitations and performance problems. For instance, RCS stores the most current version intact while all other revisions are stored as reverse editing scripts. These scripts describe how to go backward in the document's development history. For any version except the current one, extra processing is needed to apply the reverse editing script to generate the old version. Instead of appending version differences at the end like RCS, SCCS interleaves editing operations among the original document source code and associates a pair of timestamps (version ids) with each document segment specifying the lifespan of that segment. Versions are retrieved from an SCCS file via scanning through the file and retrieving valid segments based on their timestamps... However, in spite of these improvements, the edit-based representation of versions suffers from limited generality and flexibility. For example, it cannot efficiently represent changes such as document content rearranging and document restructuring. To solve these problems, we propose a new copy-based versioning scheme, which gets rid of edit scripts and uses the concept of common segments to represent versions. 
This new scheme also provides a simpler, more flexible format which can be used for the electronic exchange of multi-version documents, WWW-based cooperative authoring and versioning activities. In addition, with the objective of matching the UBCC's performance, we develop algorithms that enhance the copy-based scheme with the usefulness-base page management method used in UBCC. After formalizing the algorithms used in the two methods, we present the results of various experiments to test and compare the performance of these two strategies... Due to the growing importance of versioned XML documents, we have been seeking strategies for optimizing their storage and retrieval. In a previous paper, we concentrated on edit-based representations and proposed a usefulness-based management technique (UBCC) that provides better overall performance and flexibility than more traditional version control methods such as RCS. As discussed, the edit-based UBCC for multi-version documents achieves performance levels that are typically better than those obtainable using techniques developed for transaction-time databases and persistent objects managers. In this paper we developed a copy-based representation technique that in terms of generality and flexibility of representation is superior to the edit-based representations favored by all previous authors. A main contribution of this paper has been to extend the usefulness based management to our new copy-based scheme, as to achieve the same level of performance on storage and retrieval as that obtained using the edit-based UBCC. Our copy based scheme stores and retrieves each version as a list of sublists without using edit scripts. 
This new scheme minimizes the version retrieval I/O overhead and offers the following advantages: (1) changes such as document reorganization, and replication of selected document objects are supported along with the traditional insertion, deletion and updates supported by the edit scripts; (2) list representation (unchanged segment records) are stored with actual objects, eliminating the need for a separate edit script. Only net effect of changes are used, and intermediate changes are factored out, (3) multiple concurrent versions can be supported along with successive temporal versions." See the previous work by S-Y. Chien, V.J. Tsotras, and C. Zaniolo: "Version Management of XML Documents." WebDB 2000 Workshop, Dallas, TX., alt URL. [cache]
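
The contrast between edit scripts and a copy-based representation can be made concrete with a small Python sketch. It follows only the general idea stated in the abstract (each version is a list of segments, some copied from earlier versions, with no edit script) and does not implement the authors' UBCC page-usefulness management or their storage layout. The record types `Copy` and `New` and the class `VersionStore` are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Copy:
    """Reference to objects [start, end) of an earlier version (hypothetical record)."""
    version: int
    start: int
    end: int


@dataclass(frozen=True)
class New:
    """Objects that first appear in this version."""
    objects: Tuple[str, ...]


Segment = Union[Copy, New]


class VersionStore:
    """Stores each version as a list of segments instead of an edit script."""

    def __init__(self) -> None:
        self._versions: List[List[Segment]] = []

    def commit(self, segments: List[Segment]) -> int:
        for seg in segments:
            if isinstance(seg, Copy) and not 0 <= seg.version < len(self._versions):
                raise ValueError("Copy must refer to an earlier version")
        self._versions.append(list(segments))
        return len(self._versions) - 1

    def retrieve(self, version: int) -> List[str]:
        out: List[str] = []
        for seg in self._versions[version]:
            if isinstance(seg, New):
                out.extend(seg.objects)
            else:
                # Resolve the reference by rebuilding the earlier version.
                out.extend(self.retrieve(seg.version)[seg.start:seg.end])
        return out


store = VersionStore()
v0 = store.commit([New(("<title/>", "<intro/>", "<body/>"))])
# Version 1 swaps intro and body, adds an appendix, and duplicates the title.
v1 = store.commit([
    Copy(v0, 0, 1), Copy(v0, 2, 3), Copy(v0, 1, 2),
    New(("<appendix/>",)), Copy(v0, 0, 1),
])
print(store.retrieve(v1))
```

Rearrangement and duplication, the two changes the abstract says edit scripts handle poorly, are expressed here simply as the order and repetition of `Copy` segments.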

  • [May 11, 2001] "Chemical Markup Language. A Position Paper." By Peter Murray-Rust (Peter.Murray-rust@nottingham.ac.uk) and Henry S. Rzepa (rzepa@ic.ac.uk). 2001-04-10. "This paper describes Chemical Markup Language and its relationship to IUPAC and other organisations... CML deliberately does not cover all chemistry but concentrates on 'molecules' (discrete entities representable by a formula and usually a connection table). It supports a hierarchy for compound molecules (clathrates, macromolecules, etc.). It also supports reactions, and macromolecular structures/sequences (though it can interoperate with other macromolecular XML languages as they are developed). It has no specific support for physicochemical concepts, but can support labelled numeric datatypes of several sorts which can cover a wide range of requirements. It allows quantities and properties to be specifically attached to molecules, atoms or bonds. CML is designed to interoperate with several leading MLs and XML protocols and we have demonstrated the following: (1) XHTML for text and images; (2) SVG for line diagrams, graphs, reaction schemes, phase diagrams, etc.; (3) PlotML for graphs, MathML for equations; (4) XLink for hypermedia (including atom-spectralPeak assignments, reaction mapping); (5) RDF and Dublin Core for metadata; (6) XML Schemas for numeric and other data types. There are other generic tools required in physical science including units, multidimensional arrays with varied datatypes, terminology and bibliography. There are no widely accepted MLs for these at present; we shall continue to develop our own to be used with CML but will use others if they become widespread. An example is physicochemical data held as SELF (Prof. Henry Kehiaian, IUPAC+CODATA) and now converted to SELFML (PMR+HK) as an IUPAC/CODATA project... Many different types of organisation have adopted, or are adopting CML. 
We list a few examples: (1) Governmental and global agencies (e.g., drug regulatory agencies through the International Committee on Harmonisation - ICH/M2). We have had additional meetings or discussions with several other agencies. (2) Non-profit research (government): National Cancer Institute, Developmental Therapeutics program (NCI/DTP); ca. 500K compounds are being converted to CML. (3) Non-profit research (academia): The University of California at San Diego (UCSD) has adopted CML as the chemical technology for its new terascale information and computing grid portals. This will also be used by the Protein Data Bank (PDB) at the same site..." For additional information, see (1) the Chemical Markup Language official web site, and (2) "Chemical Markup Language (CML)."

  • [May 11, 2001] "Chemical Markup Language 1.0 reference with examples." A Zvon resource. Written by Jiri Jirat. The indexes were extracted from the CML 1.0 specification. Main features: (1) Clickable indexes; (2) Graphical representation of examples: PNG and SVG created from CML using XSLT, both molecules and spectra; (3) Clicking on an atom in the example leads to the relevant part of the CML source.

  • [May 11, 2001] "The Role of Private UDDI Nodes in Web Services, Part 1. Six species of UDDI." By Steve Graham (Web Services Architect, IBM Emerging Internet Technologies; previously: faculty member in the Department of Computer Science, the University of Waterloo). From IBM developerWorks. May 2001. ['Steve Graham introduces the concepts behind Web services discovery and gives a brief overview of UDDI (Universal Description Discovery and Integration). He examines six variants of UDDI registries, highlighting the role each of these plays in a service-oriented architecture.'] "In service-oriented architectures, service descriptions and metadata play a central role in maintaining a loose coupling between service requestors and service providers. The service description, published by the service provider, allows service requestors to bind to the service provider. The service requestor obtains service descriptions through a variety of techniques, from the simple 'e-mail me the service description' approach and the ever popular sneaker-net approach, to techniques such as Microsoft's DISCO and sophisticated service registries like the Universal Description, Discovery and Integration (UDDI), which is what I am going to examine here. UDDI defines four basic data elements within the data model in version 1.0: businessEntity (modeling business information), businessService (high level service description), tModel (modeling a technology type or service type), and bindingTemplate (mapping between businessService and tModels). I won't discuss the intricacies behind these elements, so if you need more basic information on UDDI, please visit the UDDI web site before continuing (see Resources). The set of operator nodes known as the UDDI business registry, or UDDI operator cloud, implies a particular programming model characterized by design-time service discovery. 
We need design-time discovery since it is often not feasible to implement dynamic discovery at run-time due to overwhelming complexity. However, the just-in-time integration value proposition of the IBM Web Services Initiative allows organizations to provide dynamic discovery and binding of Web services at run time. To do this, API characteristics and other non-functional requirements are specified as business policies at design time. This flexibility has important characteristics for loosely-coupled enterprise application integration, both within and between organizations. The role of the UDDI cloud to support a dynamic style of Web services binding is currently limited. However, the UDDI API and data model standard can still play a role in a service-oriented architecture. The notion of a private or non-operator UDDI node is critical to the emergence of a dynamic style of a service-oriented architecture... We have briefly examined the discovery role played by UDDI within a service-oriented architecture and enumerated six species of UDDI, each supporting different uses of a service-oriented architecture. In the next installment of this article, I will contrast the programming models that use private and operator UDDI nodes, and review requirements for functionality to make private UDDI nodes easier to use." Article also in PDF format. See: "Universal Description, Discovery, and Integration (UDDI)." [cache]

  • [May 11, 2001] "XML Meets Semantics. Meet the new kids on the block, and one more from the old neighborhood. [Thinking XML #2.]" By Uche Ogbuji (CEO and principal consultant, Fourthought, Inc.). From IBM developerWorks. May 2001. ['In this column, Uche Ogbuji completes his introduction to XML and semantics, setting the stage for the more practical columns that will follow. Thinking XML addresses knowledge management aspects of XML, including metadata, semantics, Resource Description Framework (RDF), Topic Maps, and autonomous agents. Approaching the topic from a practical perspective, the column aims to reach programmers rather than philosophers.'] "In my first Thinking XML column, I introduced the idea of semantic transparency and its importance to XML-related developments. Because semantic transparency is so important, there has been a flurry of activity in the area lately -- more than I could cover in one installment. In this installment, I introduce some of the emerging players in XML and semantics. But first, I'll cover an interesting play by the old guard which I omitted from the first installment... XML markup for EDI standards: The Implementation Guide Mark Up (IgML) working group is an effort by a group of electronic-data interchange (EDI) vendors to represent EDI implementation guidelines and standards in XML format. They are developing a DTD (document type definition) for this representation, with the goal of providing a high degree of structure to the normative text and directing the implementation path to EDI for maximum interoperability. From the IgML Web site you can download the current draft of the DTD, as well as samples of various subsets of ANSI X12 and UN/EDIFACT (the two main 'dialects' of EDI). 
While IgML does not itself provide a framework for semantic transparency, it will provide a useful tool for those implementing XML business-to-business systems that either work with EDI or just take advantage of the semantic infrastructure provided by EDI standards... [also covered: ebXML, eCo registries, RosettaNet, RDF...] Now that I've outlined the importance of semantic frameworks as a layer above XML, future columns will move on to examining practical ways to manage the knowledge represented by these high-level frameworks. The next article will discuss the use of RDF to develop inexpensive search and reference systems for XML data repositories." Article also in PDF format. See: "XML and 'The Semantic Web'."

  • [May 11, 2001] "ebXML: E-Business Language of Choice? How an XML Specification Strives to Create One Global Market." By Don Kiely. In InformationWeek Issue 836 (May 07, 2001), pages 79-84. "The grassroots ebXML effort has created a set of standard business processes using XML to create a global online marketplace. The challenge is making a framework that's generic, yet sweeping enough to accommodate large and small firms around the world. ebXML, the United Nations-backed standard for E-business, aims to create a single online marketplace where companies of any size or nationality can collaborate and conduct business around the globe. By creating a standard way for companies to carry out common business practices, ebXML promoters hope to lower entry barriers and let small and midsize companies from the far corners of the globe join in the economic advances that their larger brethren already enjoy. It's ironic that this grassroots effort had its genesis in NATO, the North Atlantic Treaty Organization -- a pan-governmental bureaucracy. But the initiative seems to be resonating around the world, with supporters ranging from IBM and Sun Microsystems to government agencies such as the Saudi Export Development Corp. and small businesses like Martin's Famous Pastry Shoppe. The goal of ebXML -- being undertaken by about 1,000 participating organizations -- is to create a set of standards that will let companies use XML for E-business. The underlying tenet of ebXML is business workflow and common business processes that every business should be able to understand and use. You could think of ebXML as the successor to electronic data interchange. Where EDI delineated standard E-business documents such as purchase orders, ebXML specifies common business processes and an architecture for carrying out those processes over the Internet. 
EbXML is being spearheaded by the Organization for the Advancement of Structured Information Standards, as well as the United Nations Economic Commission for Europe's Centre for Trade Facilitation and Electronic Business. EbXML is nearing the end of its planned 18-month gestation period this month, with the publication of a complete set of specifications for using XML as the communication format for global business. The fate of ebXML after May is still undecided, but if the roster of computer vendors and consulting companies participating in the work is any indication, ebXML is likely to make its way into software and service offerings during the coming months. The draft ebXML architecture specifications describe a complex infrastructure for interactions between trading partners and a repository of XML documents from which business processes can be modeled. The architecture provides: (1) A way to define business processes and their associated messages and content; (2) A way to register and discover business process sequences with related message exchanges; (3) A way to define company profiles; (4) A way to define trading-partner agreements; (5) A uniform message transport mechanism... Several major security requirements must be addressed for ebXML to be accepted: (1) Confidentiality: Only the sender and receiver can interpret a document's contents; (2) Authentication of sender: Assurance of the sender's identity; (3) Authentication of receiver: Assurance of the receiver's identity; (4) Integrity: Assurance that the message contents haven't been altered while en route; (5) Nonrepudiation of origin: The sender cannot deny having sent the message; (6) Nonrepudiation of receipt: The receiver can't deny having received the message; and (7) Archiving: It must be possible to reconstruct the semantic intent of a document several years after the creation of the document. 
The biggest challenge of ebXML is to create a framework for automating trading-partner interactions that's both generic enough for implementation across the entire range of business processes and expressive enough to be more effective than ad hoc implementations between trading partners. The ebXML specification for the application of XML-based assembly and context rules describes how business rules are formed. Given that companies around the world operate in many different ways, it's unlikely that any single standard could possibly incorporate those many variations. No matter where ebXML heads after May, there will be some useful designs for global business interchange. Even if ebXML fizzles, it will have been a useful exercise that can make the world even smaller than it already is." In the same context: Differences between ebXML and UDDI. See references in "Electronic Business XML Initiative (ebXML)."

  • [May 11, 2001] "XML Databases Offer Greater Search Capabilities." By Charles Babcock. In Interactive Week Volume 8, Number 18 (May 01, 2001), pages 11-13. "The Extensible Markup Language is emerging not only as a Web page markup standard, but as a database technology with the potential to simplify and speed future Web operations. With databases that store whole documents in their native XML format, an archive becomes easier to search by title, author, keywords or other attributes. The development will broaden information that is available over the Web and make speedy content serving more practical, database experts said. The World Wide Web Consortium (W3C) last week released its XML Schema specification, which defines how to use XML -- a larger and more useful tagging language than its predecessor, HTML. At the same time, pioneering efforts to implement XML in database systems for managing XML documents are gaining steam. Software AG leads the field with its Tamino XML Database, and 9-month-old start-up Ipedo announced its own XML Database System last week. In the meantime, relational database vendors IBM, Oracle and Sybase continue to upgrade their products to give them more XML-handling capabilities... Both Ipedo and Software AG implement their own versions of the W3C's proposed specification for the XML Query language, now known as XQuery for short. The XQuery draft specification was released Feb. 16, 2001. Once it becomes a released specification, the use of XML documents and XML databases will proliferate, experts predicted. Ipedo is trying to capitalize on speed by urging its customers to equip their database servers with a gigabyte or more of memory. The Ipedo XML Database System dispenses with many of the time-consuming input/output operations of traditional databases by having the database engine and much of the data it works with reside in main memory. 
The move adds $1,500 or more to the cost of the server on which the database resides, but augments the speed already inherent in serving XML documents from an XML database, Matthews said. Software AG of Darmstadt, Germany has sold 300 copies of its mainframe-style Tamino product since the system was launched in 1999. 'Content delivery is one of our greatest strengths,' said John Taylor, Software AG's director of product marketing for Tamino. He conceded that customers wouldn't buy an XML database primarily to manage large financial accounts. On the other hand, Taylor added, emerging query languages such as XQuery, which was co-authored by IBM and Software AG, will make it possible to query the XML database using 'keys' and retrieve related information from a variety of documents. Just as Structured Query Language queries the relational database, pulling out data related to a primary key or identifier, XQuery will be able to query a large set of documents based on the name of an author, date filed, subject or keywords in the document, Taylor said..."

  • [May 11, 2001] "Efficient Evaluation of XML Middle-ware Queries." By Mary Fernández (AT&T Labs), Atsuyuki Morishima (University of Tsukuba), and Dan Suciu (University of Washington). Paper presented at ACM SIGMOD/PODS 2001, Santa Barbara, California, May 21-24, 2001. 12 pages. "We address the problem of efficiently constructing materialized XML views of relational databases. In our setting, the XML view is specified by a query in the declarative query language of a middle-ware system, called SilkRoute. The middle-ware system evaluates a query by sending one or more SQL queries to the target relational database, integrating the resulting tuple streams, and adding the XML tags. We focus on how to best choose the SQL queries, without having control over the target RDBMS... XML is the universal data-exchange format between applications on the Web. Most existing data, however, is stored in non-XML database systems, so applications typically convert data into XML for exchange purposes. When received by a target application, XML data can be re-mapped into the application's data structures or target database system. Thus, XML often serves as a language for defining a view of non-XML data. We are interested in the case when the source data is relational, and the exchange of XML data is between separate organizations or businesses on the Web. This scenario is common, because an important use of XML is in business-to-business (B2B) applications, and most business-critical data is stored in relational database systems (RDBMS). This scenario is also challenging, because the mapping from the relational model to XML is inherently complex and may be difficult to compute efficiently. Relational data is flat, normalized (3NF), and its schema is often proprietary. For example, relation and attribute names may refer to a company's internal organization, and this information should not be exposed in the exported XML data. 
In contrast, XML data is nested, unnormalized, and its schema (e.g., a DTD or XML Schema) is public. The mapping from the relational data to XML, therefore, usually requires nested queries, joins of multiple relations, and possibly integration of disparate databases. In this work, we address the problem of evaluating efficiently an XML view in the context of SilkRoute, a relational to XML middle-ware system. In SilkRoute, a relational to XML view is specified in the declarative query language RXL. An RXL query has constructs for data extraction and for XML construction. We are interested in the special case of materializing large RXL views. In practice, large, materialized views may be atypical: often the XML view is kept virtual, and users' queries extract small fragments of the entire XML view. For example, SilkRoute supports composition of user-defined queries in XML-QL and virtual RXL views and translates the composed queries into SQL. SilkRoute's query composition algorithm is described elsewhere. Our goal is to support data-export or warehousing applications, which require a large XML view of the entire database. In this case, computing the XML view may be costly, and query optimization can yield dramatic improvements...In our scenario, the XML document defined by an RXL view typically exceeds the size of main memory, therefore, the sorted, outer-union approach best suits our needs. This approach constructs one large, SQL query from the view query; reads the SQL query's resulting tuple stream; and then adds XML tags. The SQL query consists of several left-outer joins, which are combined in outer unions. The resulting tuples are sorted by the XML element in which they occur, so that the XML tagging algorithm can execute in constant space. SilkRoute initially used a more naive approach, in which the view query was decomposed into multiple SQL queries that do not contain outer joins or outer unions. 
Each result is sorted to permit merging and tagging of the tuples in constant space. We call this the fully partitioned strategy. This work makes two contributions. First, we show experimentally that neither of the above approaches is optimal. This is surprising for the sorted outer-union strategy, because only one SQL query is generated, and therefore has the greatest potential for optimization by the RDBMS. In experiments on a 100MB database, we found that the outer-union query was slower than the queries produced by the fully-partitioned strategy. We found that the optimal strategy generates multiple SQL queries, but fewer than the fully partitioned strategy, therefore the optimal SQL queries may contain outer joins and outer unions. XML tagging still uses constant space, because it merges sorted tuple streams. The optimal strategy executes 2.5 to 5 times faster than the sorted outer-union and fully-partitioned strategies... Generating SQL queries from an XML view definition is a tedious task, and as we have shown, different SQL-generation strategies dramatically affect query-evaluation time. These observations indicate that the user of a relational-to-XML publishing system should not be responsible for choosing SQL queries. To better support large XML views, we presented a method that decomposes the XML view definition into several, smaller SQL queries and submits the decomposed SQL queries to the target database. Our greedy algorithm for decomposing an XML view definition relies on query-cost estimates from the target query optimizer. This method works well in practice and generates execution plans that are near optimal. Although particularly effective in an XML middle-ware system, our view-tree representation can encompass the view-definition languages of commercial relational-to-XML systems. Commercial systems typically generate XML in-engine, because the cost of binding application variables to the tuples dominates execution time. 
Our decomposition method could be applied within a relational query optimizer as a preprocessing step to XML publishing of relational data in-engine. This work is focussed on publishing large XML documents in an environment in which the middle-ware system has no control over the physical environment or query optimizer of the target database. Given these constraints, our greedy algorithm for searching for optimal query plans is necessary and effective. The simpler outer-union strategy, however, might be adequate when the middle-ware system has more control over the target database. SilkRoute's generated optimal plans do better than the unified outer-union plan, because each individual query is smaller than the outer-union plan. Small queries are less likely to stress the query optimizer; they sort smaller result relations and therefore are less likely to spill tuples to disk; and they typically have many fewer null values than a unified query. An outer-union plan can be reduced by hand, which would provide the same benefits as automatic view-tree reduction. Assuming that the target database has plentiful memory and/or multiple disks, and efficiently supports null values, the resulting outer-union plan is likely to be comparable to SilkRoute's generated optimal plans. Finally, the outer-union plan may also be appropriate when a user query requests only a subset of the XML view, and the result document is small. In this scenario, the outer-union strategy should work well, because the resulting SQL query is usually simple. This scenario is considered in [SilkRoute: Trading], where the XML view of the database is virtual, and users query it using XML-QL." See also: Mary Fernández, Dan Suciu, and Wang-Chiew Tan: "SilkRoute: Trading Between Relations and XML," in Proceedings of WWW9 (2000). On XML query: "XML and Query Languages."
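The constant-space tagging step described in the abstract above can be illustrated with a minimal Python sketch. It is not SilkRoute's implementation: the row layout (department id, department name, employee name), the element names, and the function `tag_sorted_rows` are all hypothetical, and the input is assumed to arrive already sorted by department, with `None` standing in for the nulls an outer join produces.

```python
from itertools import groupby
from xml.sax.saxutils import escape


def tag_sorted_rows(rows):
    """Yield XML fragments from rows that are already sorted by department.

    Each row is (dept_id, dept_name, emp_name). emp_name is None when the
    outer join found no employee for that department. Only the current row
    is held in memory, so tagging needs constant space.
    """
    yield "<departments>"
    for (dept_id, dept_name), group in groupby(rows, key=lambda r: (r[0], r[1])):
        yield f'<dept id="{dept_id}"><name>{escape(dept_name)}</name>'
        for _, _, emp_name in group:
            if emp_name is not None:  # skip null padding from the outer join
                yield f"<emp>{escape(emp_name)}</emp>"
        yield "</dept>"
    yield "</departments>"


# Hypothetical sorted tuple stream, as a database cursor might deliver it.
rows = [
    (1, "R&D", "Ada"),
    (1, "R&D", "Grace"),
    (2, "Sales", None),
]
xml_text = "".join(tag_sorted_rows(rows))
```

Because the generator emits each fragment as soon as its row is seen, the same loop would work over a cursor whose result is far larger than main memory, which is the situation the abstract describes.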

  • [May 11, 2001] "DTDs and XML Documents from SQL Queries. [XML Matters #9.]" By David Mertz, Ph.D. (Bricolateur, Gnosis Software, Inc.) From IBM developerWorks. May 2001. ['This column discusses the public-domain sql2dtd and sql2xml utilities that allow RDBMS-independent generation of portable XML result sets. SQL queries that extract data from relational databases can provide very practical ad hoc document-type information for the representation of query results in XML.'] "The previous "XML Matters" column discussed some of the theory and advantages underlying various data models. One conclusion of that column was that RDBMSs are here to stay (with good reasons), and that XML is best seen in this context as a means of transporting data between various DBMSs, rather than as something to replace them. XPath and XSLT are useful for certain "data querying" purposes, but their application is far less broad and general than that of RDBMSs, and SQL, in particular. However, for lack of space, I am deferring a discussion of the specific capabilities (and limits) of XPath and XSLT until a later column. A number of recent RDBMSs, including at least DB2, Oracle, and probably others, come with built-in (or at least optional) tools for exporting XML. However, the tools discussed in this column are intended to be generic; in particular, the DTDs generated by these tools will remain identical for the same query performed against different RDBMSs. I hope this will further goals of data transparency. Simplifying too much What you might imagine as the most obvious way to convert relational database data to XML is also generally a bad idea. That is, it would be simple enough -- conceptually and practically -- to do a table-by-table dump of all the contents of an RDBMS into corresponding XML documents... Suppose that A and B each has its own internal data storage strategy (for example, in different RDBMSs). 
Each maintains all sort of related information that is not relevant to the interaction between A and B, but they also both have some information they would like to share. Suppose, along these lines, that A needs to communicate a particular kind of data set to B on a recurrent basis. One thing A and B can do is agree that A will periodically send B a set of XML documents, each of which will conform to a DTD agreed to in advance. The specific data in one transmission will vary with time, but the validity rules have been specified in advance. Both A and B can carry out their programming, knowing the protocol between them. One way to develop this communication between A and B is to develop DTDs (or schemas) that match the specific needs of A and B. Then A will need to develop custom code to export data into the agreed DTDs from A's current RDBMS; and B will need to develop custom code to import the same data (into a differently structured database). Then, finally, the communication channel can be opened. However, a quicker way -- a way that is likely to leverage existing export/import procedures -- usually exists. The Standard Query Language (SQL) is a wonderfully compact means of expressing exactly what data interests you within an RDBMS database. Trying to bolt XML native techniques like XPath or XSLT onto a relational model will probably feel unnatural, although they can certainly express querying functions within XML's basically hierarchical model. Many organizations have already developed well-tested sets of SQL statements for achieving known tasks. Often, in fact, RDBMSs provide means for optimizing stored queries. While there are certainly cases where designing rich DTDs for data exchanges makes sense, in many or most cases, using the structuring information implicit in SQL queries as an (automatic) basis for XML data transmissions can be a good solution. 
While SQL queries can combine table data in complex ways, the result from any SQL query is a rather simple row-and-column arrangement. Query output has a fixed number of columns, with each row filling in values for every fixed column. (That is, as well as not changing in number, neither the value type nor the names of columns change within a SQL result -- even though both these things could change in XML documents.) The potential of XML to represent complex nesting patterns of elements is just simply not going to be deeply exercised in representing SQL results. Nonetheless, several important aspects of an SQL query can and should be represented in an XML DTD beyond simply row/column positions... In general, sql2dtd can generate the DTD from an SQL query but does not itself query any database. sql2xml performs queries via ODBC and optionally utilizes sql2dtd to get a DTD (or it can generate DTD-less XML documents). These tools help with only approximately half the process contemplated between A and B. A and B can quickly arrive at DTDs using these tools, and A can equally quickly generate the output XML documents conforming with these DTDs. But B, at its end, still needs to do all the work involved in parsing, storing and processing these received documents. Later columns will discuss B's job in some more detail." See references in "XML and Databases."
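The idea of letting a query's row-and-column shape drive both the DTD and the XML output can be sketched as follows. This is not the sql2dtd or sql2xml code: the function names, the `results`/`row` element names, and the sample table are hypothetical, and where sql2dtd works from the query text alone, this sketch reads column names from a live `sqlite3` cursor. It also assumes every column name is a valid XML element name.

```python
import sqlite3
from xml.sax.saxutils import escape


def dtd_from_columns(columns, root="results", row="row"):
    """Build a flat DTD: a root holding zero or more rows, one element per column."""
    lines = [
        f"<!ELEMENT {root} ({row}*)>",
        f"<!ELEMENT {row} ({', '.join(columns)})>",
    ]
    lines += [f"<!ELEMENT {col} (#PCDATA)>" for col in columns]
    return "\n".join(lines)


def xml_from_cursor(cursor, columns, root="results", row="row"):
    """Serialize each result row as a <row> with one child element per column."""
    parts = [f"<{root}>"]
    for record in cursor:
        cells = "".join(
            f"<{col}>{escape('' if value is None else str(value))}</{col}>"
            for col, value in zip(columns, record)
        )
        parts.append(f"<{row}>{cells}</{row}>")
    parts.append(f"</{root}>")
    return "".join(parts)


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (name TEXT, city TEXT)")
conn.execute("INSERT INTO author VALUES ('A. Author', 'Springfield')")

cursor = conn.execute("SELECT name, city FROM author")
columns = [d[0] for d in cursor.description]  # column names of the result set
dtd = dtd_from_columns(columns)
xml_text = xml_from_cursor(cursor, columns)
```

The DTD depends only on the column list, so the same query run against a different RDBMS would yield the same DTD, which is the portability property the column emphasizes.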

  • [May 11, 2001] "Tutorial: Mapping DTDs to Databases." By Ronald Bourret. From XML.com. May 09, 2001. ['XML and database expert Ron Bourret discusses mapping DTDs to database schemas, and vice versa. In his in-depth article, Bourret discusses both table-based and object-relational mappings. The article describes best practices.'] "A common question in the XML community is how to map XML to databases. This article discusses two mappings: a table-based mapping and an object-relational (object-based) mapping. Both mappings model the data in XML documents rather than the documents themselves. This makes the mappings a good choice for data-centric documents and a poor choice for document-centric documents. The table-based mapping can't handle mixed content at all, and the object-relational mapping of mixed content is extremely inefficient. Both mappings are commonly used as the basis for software that transfers data between XML documents and databases, especially relational databases. An important characteristic in this respect is that they are bidirectional. That is, they can be used to transfer data both from XML documents to the database and from the database to XML documents. One consequence is that they are likely to be used as canonical mappings on top of which XML query languages can be built over non-XML databases. The canonical mappings will define virtual XML documents that can be queried with something like XQuery. In addition to being used to transfer data between XML documents and databases, the first part of the object-relational mapping is used in "data binding", the marshalling and unmarshalling of data between XML documents and objects... Most XML schema languages can be mapped to databases with an object-relational mapping. The exact mappings depend on the language. DDML, DCD, and XML Data Reduced schemas can be mapped in a manner almost identical to DTDs. The mappings for W3C Schemas, Relax, TREX, and SOX appear to be somewhat more complex. 
It is not clear to me that Schematron can be mapped. In the case of W3C Schemas, a complete mapping to object schemas and then to database schemas is available. Briefly, this maps complex types to classes (with complex type extension mapped to inheritance) and maps simple types to scalar data types (although many facets are lost). "All" groups are treated like unordered sequences and substitution groups are treated like choices. Finally, most identity constraints are mapped to keys. For complete details, see http://www.rpbourret.com/xml/SchemaMap.htm." See also by Ronald Bourret: (1) "XML and Databases," and (2) "XML Database Products." Reference list: "XML and Databases."
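A bidirectional table-based mapping of the kind the tutorial describes might look like the Python sketch below. It is an illustration under stated assumptions, not Bourret's code: the document is assumed to be a single table element whose children are rows and whose grandchildren are columns, every row is assumed to have the same columns, and the names `xml_to_table`, `table_to_xml`, `orders`, and `order` are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical data-centric document: table element > row elements > columns.
DOC = (
    "<orders>"
    "<order><number>1</number><customer>ACME</customer></order>"
    "<order><number>2</number><customer>Initech</customer></order>"
    "</orders>"
)


def xml_to_table(conn, xml_text):
    """Create a table named after the root element and insert one row per child."""
    table = ET.fromstring(xml_text)
    columns = [col.tag for col in table[0]]  # assumes all rows share these columns
    col_list = ", ".join(f'"{c}"' for c in columns)
    conn.execute(f'CREATE TABLE "{table.tag}" ({col_list})')
    marks = ", ".join("?" for _ in columns)
    for row in table:
        values = [row.findtext(c) for c in columns]
        conn.execute(f'INSERT INTO "{table.tag}" VALUES ({marks})', values)


def table_to_xml(conn, table, row_tag):
    """Rebuild the document from the table: one row element per database row."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cursor.description]
    root = ET.Element(table)
    for record in cursor:
        row = ET.SubElement(root, row_tag)
        for col, value in zip(columns, record):
            ET.SubElement(row, col).text = value
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
xml_to_table(conn, DOC)
round_trip = table_to_xml(conn, "orders", "order")
```

The sketch also shows why the tutorial calls this mapping a poor fit for document-centric XML: mixed content, comments, and sibling order between differently named elements have no place in the row-and-column model and would be lost on the way in.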

  • [May 11, 2001] "Reports from WWW10." By Edd Dumbill. From XML.com. May 09, 2001. ['Highlights from the 10th International World Wide Web conference, which took place last week in Hong Kong. The reports feature Tim Berners-Lee's keynote, Web multimedia, the problems of deploying XHTML, and web annotations with Annotea.'] "Opening the conference on Wednesday, Tim Berners-Lee told the attendees they could congratulate themselves for the progress made so far on the Web, but that they weren't finished building yet. Announcing the release of a landmark XML specification, W3C XML Schema, Berners-Lee explained that the three specifications -- XML 1.0, XML Namespaces, and XML Schema -- formed the new foundation of XML. XML Schema allows the description, in XML, of XML languages, such as SVG or XHTML, and it's designed to replace DTDs, which served the same purpose in XML 1.0. The development of the XML Schema specification has been characterized by controversy and criticism, since the early concerns in late 1999 as to whether Microsoft would support it. Berners-Lee praised the Schema working group for their perseverance in difficult circumstances. Though many in the XML developer community still have reservations about the specification, most agree that XML Schema will, indeed, has to succeed. So now, over three years since the XML 1.0 Recommendation was first published, the W3C has built a foundation for XML that its member companies think can be used in today's applications. However, there's more to the total XML architecture than the foundation. Berners-Lee noted that a key technology, the XML Query language, a kind of SQL for XML data, is still in development, as are XLink and XPointer, XML technologies for linking documents together..."

  • [May 11, 2001] "Can XML Help Write the Law?" By Alan Kotok. From XML.com. May 09, 2001. ['A report from the Conference on Congressional Organizations' Application of XML, where both the mechanics and the public benefits of making legislation available in XML were discussed.'] "XML has spawned a number of new initiatives to improve the way enterprises, including government and not-for-profit organizations, do business. A meeting held on 24 April 2001 on Capitol Hill in Washington, D.C. focused on applying XML to the process of crafting legislation, with the potential at least of transforming the basic relationship between citizens and their elected representatives. The meeting, organized by LegalXML and the House Committee on Administration, had speakers on the current ways of generating legislative documents and turning them into full-fledged laws and regulations. However, the meeting also discussed ways that the public and political process could benefit from the wealth of data in government databases, when linked to legislation made available in XML documents. The few uses of XML in legislation so far have shown some impressive results. Brian Breneman of the Breneman Group, talked about the State of Michigan's experiences applying XML to its legislative documents. Breneman served as the contractor that developed the Michigan system. In Michigan, the state legislature converts its compiled law to XML, which makes it easier to offer the documents online in HTML and PDF formats..." See the COAX invitation letter, and other references in the events page.

  • [May 11, 2001] "ICE Keeps Data Fresh. Protocol for content exchange catches on slowly." By Chuck Moozakis. In InternetWeek (May 07, 2001), pages 17-18. ['ICE addresses the thorny issue of how a content provider manages the flow of information sent to users -- ensuring that the freshest information is sent to the correct audience at the right time.'] "German software maker Intershop was looking to pump up its Enfinity catalog content e-commerce application back in 1999, and was seeking a technology that would ensure that the right data was being pushed to the right audience. [CTO] Bassiri could have assigned Intershop's programmers the arduous task of writing code to support the management of supplier data. Instead, he was able to avoid that task by building Enfinity around the Information and Content Exchange protocol, a standard developed to help content providers direct information to a wide variety of users. 'No other standard addresses this area directly,' Bassiri said about ICE, an XML-based protocol developed in late 1998 to help companies route their content to disparate audiences. 'Without it we would have to write our own code, and that code would only be specific to a certain type of content.' In a nutshell, ICE addresses the thorny issue of how a content provider manages the flow of information sent to users--ensuring that the freshest information is sent to the correct audience at the right time. The standard also lets companies code their content so that it's sent to user sites during times when bandwidth is most prevalent -- for example, in the middle of the night--to avoid backbone bottlenecks. Dianne Kennedy, founder of consultancy XMLXperts, describes ICE as the 'data pump used to make sure content is where it needs to be when it needs to be there.' The protocol's great value is based on three primary attributes, according to ICE proponents. The first is that it lets users tag ICE-encoded information with effective dates and expiration dates. 
This means that content can be sent to users early and marked with the date on which it can be redistributed to users' customers. Similarly, ICE lets information be marked as "valid" only up until a specified expiration date. A second ICE attribute is that it lets users integrate their own syndicated content with a customer's existing information. Tribune Media Services, for example, is evaluating ICE to permit the syndicator of newspaper content to mesh its cartoon and entertainment information with news packages created by its member newspapers. A third important attribute of ICE, supporters say, is that the protocol supports a wide variety of delivery guarantees, assuring syndicators that content was delivered as promised. In this case, a company could be notified if critical content it's providing hasn't been delivered. For less time-sensitive information, however, that same company might choose not to be notified if its content has been delayed... Despite all of ICE's potential benefits, backers conceded adoption has been slower than many would have liked..." In the same article: "ICE Explained". For references, see "Information and Content Exchange (ICE) Protocol."

  • [May 11, 2001] "Breaking New Ground In Metro Interconnection." By Rebecca Wetzel. In Interactive Week (May 01, 2001). "It's finally getting easier to interconnect carriers and feed content into local and backbone pipes within metropolitan areas. Last week, MediaCenters, a Chantilly, Va., start-up, announced a set of services designed to solve what is becoming a metropolitan service interconnection crisis. Jim Greenberg, the company's chief technology officer and co-founder, says that the fact that content and applications are moving away from the backbone is driving the need for better, faster, easier and cheaper 'meet me' options at the edge of the network. The types of companies that need to meet in such metropolitan exchanges include long-haul service providers, local access providers of all stripes, content and application hosters, and content accelerators - such as Akamai Technologies. Analyst Peter Sevcik says these 'four horsemen of the Internet' currently require about 1,500 interconnections per metropolitan center... MediaCenters' networks have two components, a physical network called eXpressNet, and an eXtensible Markup Language (XML)-based operations support system called eXchangeNet. The network and OSS provide a carrier-neutral, optical, any-to-any, metropolitan network, enabling terabit-per-second interconnectivity among partners. The resulting service allows companies to instantly interconnect once they are physically hooked into the same eXpressNet network. MediaCenters also touts what it calls its 'e-bonding' tool, which allows service providers to link back-office systems so they can jointly deliver and bill for services. In addition, the XML format allows service providers to submit and receive order requests and trouble-ticket information between each other's systems. Andy Baer, MediaCenters' chief information officer, masterminded the eXchangeNet OSS. As he explains it, eXchangeNet provides service creation, assurance and billing..."

  • [May 11, 2001] "How Web Services Mean Business." By Whit Andrews, Daryl Plummer, and David Smith. From Gartner. 9-May-2001. ['Whether as a tool or a goal, Web services are poised to have a dramatic effect on business -- even enterprises that think of themselves as independent of technology trends.'] "Business must be forgiven its profound skepticism when the boosters of IT trumpet the benefits of any given innovation, but avoid acknowledging the inevitable organizational and technical challenges that technology brings as baggage. 'The Next Big Thing' is now a term of derision as often as it is a promise of innovation. But this understandable attitude has also had a surprising side effect. This time, a next big thing -- the Web services revolution in the continuum of technology evolution -- has, at its heart, the realistic possibility that it will bring fewer challenges than any previous generation. Simplicity is both the Web services concept's promise and strength. Businesses that ignore its potential, or decide to sit out its early stages, will find themselves outpaced by rivals that take advantage of Web services to improve their agility and even to transform themselves into new kinds of enterprises. Because of their inherent ease of use, dynamism and flexibility, Web services will permeate business from the executive suite to the IS 'clean room.' Enterprises of all sizes will find that Web services offer a more cost-effective way to perform agilely on the Supranet and in the other environments... Web services are software components that interact with one another dynamically and use standard Internet technologies, making it possible to build bridges between systems that otherwise would require extensive development efforts. One of the tenets of Web services is that systems can advertise the presence of business processes, information or tasks that can be consumed by other systems. 
Web services can be delivered to any customer device -- e.g., cell phone, personal digital assistant (PDA) and PC -- and can be created or transformed from existing applications. More important, Web services use repositories of services that can be searched to locate the desired function to create a dynamic value chain. New specifications -- such as the Universal Description, Discovery and Integration specification -- allow the extension of business interaction by locating new processes or information, examining the description of what those processes do and binding to the new processes while the system runs. Bottom line: Web services will serve as an attractive means through which enterprises can gain access to software and business services. Through 2H02, 75 percent of enterprises with greater than $100 million in revenue will interface periodically with Web services (0.8 probability). Through 1H03, 50 percent of enterprises with less than $100 million in revenue will interface periodically with Web services (0.8 probability). This 'next big thing' will fulfill on many of the broken promises of the past and present a compelling opportunity for enterprises of all sizes."

  • [May 11, 2001] "IBM Set to Launch Major Web Services Initiative." By Jaikumar Vijayan. In ComputerWorld (May 03, 2001). "IBM on Monday [2001-05-14] plans to launch an e-business initiative aimed at helping users dynamically connect multiple enterprise applications and systems using a standards-based Web services architecture, according to sources familiar with the announcement. The effort is said to encompass all four of IBM's major software technologies below the operating system level -- its WebSphere application server and DB2 database, plus subsidiary Tivoli Systems Inc.'s management tools and the groupware and collaboration products made by the company's Lotus Development Corp. unit. As part of the initiative, the sources said, IBM will develop new tools and software components that are supposed to let the different technologies interact with one another more efficiently. IBM declined to comment on the announcement, which is due to take place at an event in New York. Among the products expected to be announced are WebSphere Studio tools for developing Web-based computing services and a WebSphere Business Integrator, which reportedly will provide integration, transaction and workflow services between different internal applications and between systems running at multiple companies. The new WebSphere products are scheduled to start shipping later this quarter and will incorporate support for standards such as the Simple Object Access Protocol [SOAP]; the Universal Description, Discovery and Integration [UDDI] directory; and the Web Services Description Language [WSDL], the sources said. Also in the works is a Lotus Web services enablement kit supporting that unit's software products, they added. Those tools are expected to become available in the second half of this year and will include a knowledge-discovery management module that can be used to capture information about various Web services. 
Meanwhile, the sources said a DB2 XML Extender is being added to bring Web services to IBM's relational database. The technology has already been integrated into IBM's recently announced DB2 Version 7.2 release and will enable applications built on top of that software to access information stored in databases made by other vendors..."

  • [May 10, 2001] "Unicode Character Database (UCD) in XML Format." Prepared by Mark Davis. From the posting to 'unicode@unicode.org' 2001-05-10, 'Subject: UCD in XML': "Several people asked me over the last month about the XML version of the Unicode character database that I presented at last November's UTC meeting. I posted it at http://www.macchiato.com/utc/UCD.zip, containing two files: UCD.xml and UCD-Notes.htm. Caveats: (1) I regenerated the data with Unicode 3.1 data. However, (a) I haven't done more than spot-check the results, and (b) the format differs somewhat from what is documented in the notes; (2) I still have to comment out characters FFF9..FFFD, and all surrogates, so that people can read the file with Internet Explorer (I do wish they would use a conformant XML parser). Also, note that IE takes quite a while to load the file... Format: The Unicode blocks are provided as a list of <block .../> elements, with attributes providing the start, end, and name. Each assigned code point is a <e .../> element, with attributes supplying specific properties. The meaning of the attributes is specified below. There is one exception: large ranges of code points  for characters such as Hangul Syllables are abbreviated by indicating the start and end of the range. Because of the volume of data, the attribute names are abbreviated. A key explains the abbreviations, and relates them to the fields and values of the original UCD semicolon-delimited files. With few exceptions, the values in the XML are directly copied from data in the original UCD semicolon-delimited files. Those exceptions are described below... Numeric character references (NCRs) are used to encode the Unicode code points. Some Unicode code points cannot be transmitted in XML, even as NCRs (see http://www.w3.org/TR/REC-xml#charsets), or would not be visibly distinct (TAB, CR, LF) in the data. Such code points are represented by '#xX;', where X is a hex number. 
Attribute Abbreviations: To reduce the size of the document, the following attribute abbreviations are used. If an attribute is missing, that means it gets a default value. The defaults are listed in parentheses below. If there is no specific default, then a missing attribute should be read as N/A (not applicable). A default with '=' means the default is the value of another field (recursively!). Thus if the titlecase attribute is missing, then the value is the same as the uppercase. If that in turn is missing, then the value is the same as the code point itself. For a description of the source files, see UnicodeCharacterDatabase.html. That file also has links to the descriptions of the fields within the files. Since the PropList values are so long, they will probably also be abbreviated in the future." See "XML and Unicode." [cache]
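The recursive '=' default rule quoted above (titlecase falls back to uppercase, which falls back to the code point) can be sketched in Python. The attribute names `c`, `uc`, and `tc` and the `DEFAULTS` table are hypothetical stand-ins; the real abbreviations are defined in the key inside UCD-Notes.htm, which is not reproduced here.

```python
# Hypothetical abbreviations: "c" = code point, "uc" = uppercase, "tc" = titlecase.
# A default beginning with "=" means "use the value of that other attribute".
DEFAULTS = {
    "tc": "=uc",
    "uc": "=c",
}


def resolve(attrs, name, defaults=DEFAULTS):
    """Return the effective value of attribute `name` for one element.

    Follows '=' defaults recursively. Returns None (read as N/A) when the
    attribute is missing and has no default.
    """
    while name not in attrs:
        fallback = defaults.get(name)
        if fallback is None:
            return None            # no specific default: not applicable
        if not fallback.startswith("="):
            return fallback        # a literal default value
        name = fallback[1:]        # default is another attribute's value
    return attrs[name]


# Attributes as they might be read from two elements of the XML file.
small_a = {"c": "0061", "uc": "0041"}   # titlecase omitted, uppercase given
capital_a = {"c": "0041"}               # both case mappings omitted

title_of_small_a = resolve(small_a, "tc")
title_of_capital_a = resolve(capital_a, "tc")
```

The loop terminates as long as the default chain ends in an attribute that is always present, which the sketch assumes is true of the code point itself.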

  • [May 10, 2001] "Summary of the XML Family of W3C Languages." By Airi Salminen [Email: asalminen@db.uwaterloo.ca]. 28-March-2001. Latest version URL: http://www.cs.jyu.fi/~airi/xmlfamily.html. "XML is a markup language for presenting information as structured documents. The language has been developed from SGML as an activity of the World Wide Web Consortium (W3C). Within W3C there is going on a number of other XML-related language development activities where the intent is to specify syntactic and semantic rules either for some specific kind of XML data or for data to be used together with XML data for a specific purpose. In this report the term XML family of W3C languages refers to XML and those XML-related languages. The purpose of the report is to give a concise overview of the current state of the development of the languages... In this summary the XML family of W3C languages has been divided into four groups: XML, XML Accessories, XML Transducers, and XML Applications. (1) XML Accessories are languages which are intended for wide use to extend the capabilities specified in XML. Examples of XML accessories are the XML Schema language extending the definition capability of XML DTDs and the XML Names extending the naming mechanism to allow in a single XML document element and attribute names that are defined for and used by multiple software modules. (2) XML Transducers are languages which are intended for transducing some input XML data into some output form. Examples of XML transducers are the style sheet languages CSS2 and XSL intended to produce an external presentation from some XML data and XSLT intended for transforming XML documents into other XML documents. A transducer language is associated with some kind of processing model which defines the way output is derived from input. (3) XML Applications are languages which define constraints for a class of XML data for some special application area, often by means of a DTD. 
Examples of XML applications are MathML defined for mathematical data or SMIL intended for multimedia documents... This report has been created as part of the X Group activities at the University of Waterloo in Canada." [cache]

  • [May 10, 2001] "Updating XML." By Igor Tatarinov, Zachary G. Ives, Alon Y. Halevy, and Daniel S. Weld. Paper presented at ACM SIGMOD/PODS 2001, Santa Barbara, California, May 21-24, 2001. 12 pages. The authors propose a set of operations for both ordered and unordered XML data, and describe extensions to the proposed W3C XML Query language (XQuery) to incorporate the update operations. They conclude that updates to an XML document can be expressed in a concise and natural way, even with support for ordering. They show that the basic set of constructs can be efficiently implemented over a relational database. Note: Zach Ives and Igor Tatarinov work on the Tukwila data integration system. "Zack Ives is responsible for the Tukwila execution engine and its adaptive operation, as well as the optimizer for previewing query results. His work largely relates to adaptive query processing, processing of XML data, XML and zero-knowledge query optimization, and XML query languages. Igor Tatarinov is developing the next-generation Tukwila query optimizer, focusing on high-level optimization for data integration." For related XML research, see (1) the publications listing of Zachary Ives and Daniel Weld (Professor of Computer Science and Engineering, University of Washington); (2) "Tukwila Data Integration System (University of Washington)." [cache]
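To give a feel for what order-aware updates on an XML document involve, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It is not the XQuery extension syntax proposed in the paper; the sample document and the three operations shown (positional insert, content replacement, delete) are illustrative assumptions only.

```python
import xml.etree.ElementTree as ET

# A small sample document, invented for this example.
doc = ET.fromstring("<book><title>Old</title><author>A</author></book>")

# Positional insert: child order matters, so the index is part of the update.
year = ET.Element("year")
year.text = "2001"
doc.insert(1, year)

# Replace the content of an existing element.
doc.find("title").text = "New"

# Delete a child element.
doc.remove(doc.find("author"))

result = ET.tostring(doc, encoding="unicode")
```

After these three updates, `result` holds `<book><title>New</title><year>2001</year></book>`.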

  • [May 09, 2001] "Model-Driven Architecture: Vision, Standards And Emerging Technologies." By John Poole (Hyperion Solutions Corp). April 2001. ['A paper submitted to ECOOP 2001 Workshop on Metamodeling and Adaptive Object Models. It discusses the MDA standards (including CWM), current Java platform initiatives, and how they could ultimately be used to build totally dynamic systems.'] "Recently, the Object Management Group introduced the Model-Driven Architecture (MDA) initiative as an approach to system-specification and interoperability based on the use of formal models. In MDA, platform-independent models (PIMs) are initially expressed in a platform-independent modeling language, such as UML. The platform-independent model is subsequently translated to a platform-specific model (PSM) by mapping the PIM to some implementation language or platform (e.g., Java) using formal rules. At the core of the MDA concept are a number of important OMG standards: The Unified Modeling Language (UML), Meta Object Facility (MOF), XML Metadata Interchange (XMI), and the Common Warehouse Metamodel (CWM). These standards define the core infrastructure of the MDA, and have greatly contributed to the current state-of-the-art of systems modeling. As an OMG process, the MDA represents a major evolutionary step in the way the OMG defines interoperability standards. For a very long time, interoperability had been based largely on CORBA standards and services. Heterogeneous software systems inter-operate at the level of standard component interfaces. The MDA process, on the other hand, places formal system models at the core of the interoperability problem. What is most significant about this approach is the independence of the system specification from the implementation technology or platform. The system definition exists independently of any implementation model and has formal mappings to many possible platform infrastructures (e.g., Java, XML, SOAP). 
The MDA has significant implications for the disciplines of Metamodeling and Adaptive Object Models (AOMs). Metamodeling is the primary activity in the specification, or modeling, of metadata. Interoperability in heterogeneous environments is ultimately achieved via shared metadata and the overall strategy for sharing and understanding metadata consists of the automated development, publishing, management, and interpretation of models. AOM technology provides dynamic system behavior based on run-time interpretation of such models. Architectures based on AOMs are highly interoperable, easily extended at run-time, and completely dynamic in terms of their overall behavioral specifications (i.e., their range of behavior is not bound by hard-coded logic). The core standards of the MDA (UML, MOF, XMI, CWM) form the basis for building coherent schemes for authoring, publishing, and managing models within a model-driven architecture. There is also a highly complementary trend currently building within the industry toward the realization of these MDA standards in the Java platform (i.e., standard mappings of platform-independent models to platform-dependent models, where the platform-dependent model is the Java platform). This is a sensible implementation strategy, since development and integration is greatly facilitated through common platform services and programming models (interfaces or APIs), provided as part of the Java platform. Java 2 Platform, Enterprise Edition (J2EE), has become a leading industry standard for implementing and deploying component-based, distributed applications in multi-tier, Web-centric environments. Current efforts within the Java Community Process to develop pure Java programming models realizing OMG standards in the form of J2EE standard APIs (i.e., JMI, JOLAP and JDMAPI) further enhance the metadata-based interoperability of distributed applications. 
This paper surveys the core OMG MDA standards (i.e., UML, MOF, XMI and CWM) and discusses the current attempts at mapping these standards to J2EE, as examples of PIM-to-PSM translations that are currently under development. These forthcoming APIs will provide the initial building blocks for a new generation of systems based on the model-driven architecture concept. The progression of these initial MDA realizations to AOMs is the next logical step in this evolution." See: "OMG Model Driven Architecture (MDA)." [cache]
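The PIM-to-PSM translation described in this abstract can be pictured with a toy generator. The sketch below assumes a hypothetical dictionary-based model format and an invented `TYPE_MAP` of mapping rules; it is not based on UML, MOF, XMI, JMI, or any other OMG or Java Community Process specification.

```python
# Hypothetical mapping rules from platform-independent types to Java types.
TYPE_MAP = {"String": "String", "Integer": "int", "Boolean": "boolean"}


def pim_to_java(model):
    """Translate a toy platform-independent model into Java class source."""
    lines = [f"public class {model['name']} {{"]
    for attr, pim_type in model["attributes"]:
        # Each attribute becomes a private field of the mapped Java type.
        lines.append(f"    private {TYPE_MAP[pim_type]} {attr};")
    lines.append("}")
    return "\n".join(lines)


customer = {"name": "Customer",
            "attributes": [("id", "Integer"), ("name", "String")]}
java_source = pim_to_java(customer)
```

The same `customer` model could be fed to a second generator with different rules (for example, one emitting an XML schema), which is the sense in which the model stays independent of the platform.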

  • [May 09, 2001] "Data Warehousing Industry Weaves a Meta Data Standard. [Business Intelligence.]" By David Marco. In Application Development Trends Volume 8, Number 5 (May 2001), page 17. "The issue of meta data integration is one of the chief mitigating factors that have prevented most organizations from achieving successful data warehouse, e-business, Customer Relationship Management (CRM) and Enterprise Resource Planning (ERP) implementations. This column focuses on the Object Management Group (OMG) meta model standard Common Warehouse Metamodel (CWM), the impact this standard will have on the industry and its promise to aid in this task of meta data integration... the OMG CWM is a standard that offers the promise of improving these meta data integration processes. But what is a meta model? It is a fancy phrase for a physical data model that stores meta data. The CWM has initially focused on the data warehousing arena and is broadly supported by the vast majority of data warehouse vendors, meaning that they have integrated CWM into their tools' meta model or they are looking to provide an interface that will transfer their meta data into CWM. This capability will allow data warehousing products from different vendors to share technical meta data. The CWM specification can be downloaded from www.omg.org. For many years all of us in the meta data arena have desired a global meta model standard. A year ago we had two competing standards, CWM and the Open Information Model (OIM), which was being moved forward by the Meta Data Coalition (MDC). Unfortunately for the industry, two standards were one too many. On September 25, 2000 the MDC merged with the OMG with the goal of consolidating the separate initiatives into one meta data standard under which all vendors can unify..." 
[From the OMG web site, 'Data Warehousing, CWM And MOF Resource Page': "The Common Warehouse Metamodel (CWM) is a specification that describes metadata interchange among data warehousing, business intelligence, knowledge management and portal technologies. The OMG Meta-Object Facility (MOF) bridges the gap between dissimilar meta-models by providing a common basis for meta-models. If two different meta-models are both MOF-conformant, then models based on them can reside in the same repository."] See: (1) "OMG Common Warehouse Metadata Interchange (CWMI) Specification", and (2) now absorbed/merged, "MDC Open Information Model (OIM)."

  • [May 09, 2001] "EJBs to the Rescue. [EJB Update.]" By Peter Fischer (Quantum Enterprise Solutions Inc.) and Stephen Reckford (Concept Five Technologies). In Application Development Trends Volume 8, Number 5 (May 2001), pages 29-37. [' As corporate IT's integration activities continue to accelerate and consume an increasingly large piece of the budgetary pie, EJBs can offer a more rapid component-based integration solution for the J2EE environment.'] "According to Forrester Research, 30 to 40% of corporate IT budgets are typically spent on integration activities. According to a GartnerGroup estimate, by 2005 e-business initiatives and infrastructure will consume 30 to 50% of enterprise IT spending. Based on these forecasts, there will continue to be strong budget and financial incentives to leverage successful integration strategies... The J2EE platform is a solid platform upon which component-based integration solutions targeted to e-business can be built. Java technologies fit into the J2EE platform and provide a platform for creating apps that combine elements in the client tier with applications in the EIS tier via a middle tier that is comprised of presentation and business logic entities... Combining point-to-point integration solutions with other middleware technologies, such as message or integration brokers, opens up new horizons in integration and provides a robust, scalable and extensible integration platform that provides message-based integration among multiple application systems. The power of this approach lies in replacing a potentially chaotic and disorganized set of point-to-point integrations with a coordinated set of interoperable connections which can be reused and serve multiple purposes. In this integration architecture, the integration broker provides a single interface for accessing legacy, CRM or ERP assets, replacing the point integration solutions such as JDBC, ECI, JMS and MQSeries with a single API set. 
Adapters provided by the integration broker vendor allow these systems to plug into the integration architecture and EJB components transparently... A significant advantage of this approach is the ability to leverage XML as a standard message format. A number of products available on the market today can be used to create Java classes from XML constructs, which eliminates the need to write Java code that utilizes XML parsers and the Document Object Model (DOM). One such tool is the Breeze XML Studio from The Breeze Factor LLC in Encinitas, Calif., which allows developers to create a JavaBeans class graph that encapsulates XML parsing and validation and has methods that map directly to the XML data elements and attributes. At runtime, these JavaBeans are populated with the appropriate data from the XML document. The beans can then be packaged as an EJB, called EJB A, enabling the XML information to become an integral component of an integration architecture. B2B integration brokers provide the ability to integrate processes and information external to an organization. They provide connectivity between supply chain partners, customers and exchanges with application components via data exchange using XML messaging (XML/HTTPS). Combining the capabilities of these brokers with a tool like Breeze allows the creation of integration components that can become an integral part of the B2B integration architecture... One particular pattern that we have implemented successfully in component integration frameworks is an extension of the classic Model-View-Controller (MVC) pattern. The MVC design pattern divides an interactive application into three discrete functional components. The model component contains the core functionality and data; the view component provides information to the user; and the controller component ties the model and view together to create the new transaction or process. 
In this approach the interactive application is the transaction or business process initiator, which can be a servlet, a Java Server Page or another presentation layer component. Multiple resource adapters (one resource adapter per type of EIS) can be plugged into an application server. This capability enables EJB and other J2EE components that are deployed in the application server to access the underlying EISs. The resource adapter is used by an application server or client to connect to an EIS. The resource adapter 'plugs into' the application server and collaborates with it via a set of standard interfaces to provide underlying security and transaction support in a manner similar to EJB containers. Several ERP vendors are getting a jump on the new architecture by releasing connectors that are compatible with popular application servers. Two examples are PeopleSoft and SAP. By using PeopleSoft's Component Interfaces, third-party systems can synchronously invoke PeopleSoft business logic via EJB. SAP validates third-party products for SAP's Business Technology that support development of business logic in Java and data transfer using XML..." Also in this issue of ADTMag: "Designing a scalable dot.com architecture using J2EE."

  • [May 09, 2001] "Integration: This Decade's Theme." By John D. Williams. In Application Development Trends Volume 8, Number 5 (May 2001), pages 63-64. "... Now that we are at the beginning of a new decade, I believe that its theme will be integration. The integration issue is at the heart of the way companies choose to do business, and it is a defining characteristic of e-business. This is why the deployment of technologies such as Enterprise Application Integration (EAI) is really a business issue and not a technology issue. The forces driving business integration are rapidly changing markets, new business opportunities and customer expectations. These forces of change are working their way into the IT organization, driving budgets and projects. An aspect of these forces at work is the growth of EAI spending. In 1999, IT organizations spent $500 million on EAI tools. In 2000, they spent $900 million. Analysts predict that by 2005, companies will spend $7.3 billion on EAI... I think it is helpful to use a framework to understand different needs in EAI and how different tools meet those needs. Imagine a four-layer framework describing EAI capabilities. The lowest layer is the Transportation layer, which has five communication models describing how one system communicates with another. These models, as GartnerGroup defines them, are Conversational; Request/Reply; Message Passing; Message Queuing; and Publish and Subscribe. The next layer up is Data Transformation. This layer describes the mechanisms for taking information from one database and transforming it before putting it into another database. The third layer is Business Rules. The Business Rules layer describes the method of taking select information from one system and transforming it into use by others. This may be a one-to-one transformation or a one-to-many transformation. The top layer is the Business Process layer, which coordinates the flow of information throughout a complete business process. 
It describes the workflow and transformation of information across multiple systems. In our framework, we also see that meta data lies across all these layers. As we move from the lower levels of our framework to the top, we typically see the business value of integration increase. Most middleware tools, such as those for messaging and data warehousing, provide capabilities for the lowest two layers: Transportation and Data Transformation. Most EAI-specific tools are focused on the Business Rules layer, while some venture into the Business Process layer. EAI tools also often integrate with or provide support for tools working in the lower layers... XML has sparked much interest in this area of meta data exchange. It provides a mechanism for the dynamic interchange of meaningful information. In particular, there are components of XML that have tremendous value in the support of system integration. The most useful components are DTDs, XML Schema (and related variations), XSLT and XMI. DTDs and XML Schema define the structure of a document or information interchange. DTDs are the standard today. They do not use XML syntax and have some important limitations. For example, DTDs do not support the automatic validation of values. On the other hand, XML Schema does support the automatic validation of values. It also has the ability to define recurring blocks of elements or attributes once. Unfortunately, it is not yet a standard, though that should change soon. There are other non-standard alternatives available. XSLT allows you to transform one document type into another. XMI is the XML Meta data Interchange format from the Object Management Group (OMG). XMI is currently a proposal for the open interchange of application components and assets. But do we really have all the tools we need to develop robust e-business systems? I don't think so. I've mentioned that EAI breaks down the stovepipes when you integrate systems. This can have serious implications. 
Higher levels of integration can lead to higher levels of unintended interaction..."

  • [May 09, 2001] "Borland Enters Web Services Fray." By Tom Sullivan. In Infoworld (May 07, 2001). "When Borland announces the latest incarnation of its Delphi RAD (rapid application development) environment this week, the company will focus on the toolkit's tighter integration with its Kylix Linux tools and on the product's cross-platform interoperability. But Delphi's more important enhancements clearly are its support for Web services standards. Delphi 6.0 will feature compiler-level support for SOAP (Simple Object Access Protocol) and WSDL (Web Services Description Language). That means programmers will be able to Web-enable their applications without writing extra code, according to Michael Swindell, director of product management at Borland, in Scotts Valley, California. 'Delphi programmers don't have to do anything differently; they just select to expose the code from a menu,' Swindell said. The software itself adds the SOAP and WSDL functionality. Also new to Delphi 6.0 are BizSnap, a Web-services platform for building and integrating components; WebSnap, a Web application design tool; and DataSnap, a tool for creating Web-enabled database middleware. Borland is not the only tools vendor looking to help developers build Web services. WebGain, in Santa Clara, Calif., is also equipping its toolbox for Web services, and is componentizing the development process in preparation for Web services, according to CTO Ted Farrell. Analysts expect other tools vendors, such as Merant and Rational, to release products designed to help developers build Web services. All the major Web services vendors, including Microsoft, IBM, Sun Microsystems, and Oracle, have tools in various stages of development as well. 'We're at the point right now where people are starting to build Web services,' said Rikki Kirzner, an analyst at Framingham, Mass.-based IDC. 
'They're not really building mission-critical apps, but they are making things such as e-business applications that use the same functions over and over.' One such customer, Hewitt Associates, a Lincolnshire, Ill.-based management consulting firm specializing in human resource solutions, is using IBM's WebSphere application server and the WebSphere suite of tools to move toward Web services..." See also (1) the Borland announcement, and (2) "Simple Object Access Protocol (SOAP)."

  • [May 09, 2001] "Borland Aims to Make Web Services a 'Snap'." By Peter Coffee. In eWEEK (April 26, 2001). "Within the next few weeks, Borland hopes to dilute the dominance of Microsoft's mind share in setting the course for Web services. While Microsoft's .Net development tools slog through a surprisingly volatile beta program, Borland will unveil in May a services-oriented tool kit that could unleash a surge of standards-based Web application deployment. I found the foundation-level DataSnap framework a logical next step along the XML-based data-handling path blazed by Borland's Linux-hosted Kylix tool set, launched earlier this year. Developers using DataSnap APIs will be able to publish any broadly supported SQL relational database via XML syntax, manipulated by SOAP (Simple Object Access Protocol) messages. Developers will be able to offer database access to a wide range of clients, including thin client browsers and 'headless' Web services, without costly development and maintenance of parallel code bases. Most strategic for Borland is the top-level BizSnap framework that fully integrates Web services into an object-oriented development environment. Anticipating the rapid adoption of XML by many enterprise developers, outpacing the emergence of standard XML schema, BizSnap tools streamline the definition of modular XML transforms that let a single application interact with XML data streams of similar content but differing structure. I found the transform creation tools intuitive and powerful. Meanwhile, SOAP bindings to Borland's integrated development environment will help developers follow the learning curve by automating syntax checking and offering intelligent auto-completion of XML-manipulating expressions, just as for conventional application code..." [Website description: "Delphi 6 radically simplifies building next-generation eBusiness applications on the Internet with complete SOAP based Web Services and XML data exchange support. 
The seamless integration of XML and Web Services technologies with Delphi 6 delivers the only Rapid Application Development for industry standard Web Services and B2B, B2C, and P2P integration over the Internet... DataSnap delivers high-performance, Web Service-enabled database middleware that enables any client application or service to easily connect with any major database over the Internet. DataSnap supports all major database servers such as Oracle, MS-SQL Server, Informix, IBM DB2, Sybase and InterBase. Client applications connect to high-performance DataSnap servers through industry standard SOAP/XML HTTP connections over the Internet without bulky database client drivers and complex configuration requirements. DCOM, CORBA, and TCP/IP connections are also supported... Connect any Delphi 6 application or Web Service with Borland AppServer/EJBs using new SIDL (Simple IDL). Easily build ultra high-performance rich GUI Windows clients for EJB based AppServer applications. Publish AppServer EJB functionality to the world over the Internet as industry standard SOAP/XML Web Services."] See also (1) the Borland announcement, and (2) "Simple Object Access Protocol (SOAP)."

  • [May 08, 2001] "ebXML and the Road to Universal Standards." By Dave Carr. In InternetWorld (May 08, 2001). "Another chapter in the quest for universal electronic business standards will end this week in Vienna, Austria, where the ebXML organization is meeting to wrap up its 18-month project. It's still probably an early chapter, with major plot twists yet to come, but it does move the story forward. One encouraging sign: the ripple effect of independently developed XML specifications' being reconciled. The ebXML group recently agreed to incorporate the SOAP protocol, which is popular with many XML Web services enthusiasts, into the ebXML messaging specification, and the version that's going up for a vote this week is based on an extended version of SOAP. Those extensions, in turn, could wind up being incorporated into the World Wide Web Consortium's XML Protocol (XP) effort, which is supposed to produce the successor to SOAP. Previously, this had been shaping up as a typical industry battle, with Sun Microsystems favoring ebXML and Microsoft talking up SOAP (the Simple Object Access Protocol), which is fundamental to its .Net initiative. Then RosettaNet, the group behind the electronics industry's highly advanced electronic commerce standards, said it would incorporate ebXML messaging into the next revision of its standards, rather than continuing to develop its own messaging specification. Carry this convergence forward another couple of steps, and we should see agreement on messaging standards among Microsoft's BizTalk, RosettaNet, ebXML, and other electronic-commerce frameworks. On the other hand, messaging is just one component of ebXML, and there are other areas in which it still needs to be reconciled with competing initiatives. For example, there's overlap between the ebXML registry and repository specifications and UDDI (Universal Description, Discovery, and Integration), another widely supported Web services technology. 
The reason the ebXML organization was formed in the first place was to bring together electronic-commerce initiatives from OASIS, an industry consortium, and UN/CEFACT, the United Nations organization that created the international standards for Electronic Data Interchange (EDI). CEFACT, the Centre for Trade Facilitation and Electronic Business, was looking at addressing the demand for a modernized version of EDI that would use XML and the Internet while OASIS was trying to solve essentially the same problems from an XML-centric worldview... What ebXML tries to do is establish a baseline framework that can be used to solve problems that may cross industry boundaries. It also aspires to promote a generalized model that those vertical industry groups can build on. To implement ebXML, you're supposed to model your business processes using the Unified Modeling Language, a standard supported by object modeling tools such as Rational Rose, and the UN/CEFACT Modeling Methodology. Business partners exchange Collaboration Protocol Profiles and use them to forge Collaborative Protocol Agreements (an extension of IBM's Trading Partner Agreement specifications). Further, it tries to specify some common business processes for interaction that can be used within this scheme..." See: (1) "Electronic Business XML Initiative (ebXML)", and (2) the announcement, "UN/CEFACT and OASIS Meeting Showcases ebXML for Healthcare and B2B."

  • [May 08, 2001] "XML Group to Create Specifications For Voting Systems." By Todd R. Weiss. In ComputerWorld (May 04, 2001). "Six months after the tumultuous presidential balloting in Florida, a nonprofit technical consortium yesterday announced that it has formed a committee to develop a specialized XML standard aimed at improving the accuracy and efficiency of elections. The Billerica, Mass.-based Organization for the Advancement of Structured Information Standards (OASIS) said the new technical committee will work to develop an Election Markup Language (EML) based on XML technology. The EML proposal would include specifications for exchanging data between election and voter registration systems developed by different hardware, software and IT services vendors. Karl Best, director of technical operations for OASIS, said last November's voting brouhaha in Florida graphically showed the need for more accurate elections using modern technology. The improvements envisioned by OASIS could impact public and even private elections around the world, including those held by private groups and companies, he said. The EML committee will look at a wide range of possible implementations for the new specifications, including voter registration, change of address tracking, redistricting, requests for absentee ballots, polling place management, election notification, ballot delivery and tabulation and reporting of election results. While OASIS will only create the specifications and leave it up to technology vendors to implement them, Best said he's confident that the international consortium's standing in the XML world would encourage the adoption of EML by a wide range of companies that offer voting systems and software. Gregg McGilvray, chairman of the new Election and Voter Services Technical Committee within OASIS, said the EML standard will be applicable to far more than just Web-based voting systems. 
He envisions the standard allowing different platforms, including touchscreen voting machines and even telephone-based systems, to share data regardless of how the information is collected or what operating system is being used. But Steve Weissman, legislative representative for the Washington-based watchdog group Public Citizen, said it's too early to support the EML effort or any other specific ideas for how to improve elections..." See (1) "Election Markup Language (EML)", and (2) "XML and Voting (Ballots, Elections, Polls)."

  • [May 08, 2001] "The Electrified Supply Chain." By Rajeev Kasturi. In Intelligent ERP (May 03, 2001). ['RosettaNet is delivering on the promise of extensible B2B integration.'] "RosettaNet, a self-funded, nonprofit consortium of over 250 IT, EC, and SM businesses, has been working since 1998 to establish and implement industrywide standards for e-business. Trading partners adopting RosettaNet standards will benefit from a common language and communication protocols based on Internet and XML technologies. Using the standards also will result in reduced transaction turnaround times, greater transparency in translation and integration with backend systems, reduced costs, and increased efficiency. RosettaNet wants to be the 'lingua franca of e-business'... RosettaNet standards address four aspects of transactions between trading partners: business processes, data elements, communication protocols, and product/partner codification. In a nutshell, these four components encapsulate the exchange of information among trading partners. RosettaNet's Partner Interface Processes (PIPs) are elements that define business processes among supply-chain partners, such as pricing and availability requests, purchase orders, and order acknowledgements. PIPs are system-to-system, XML-based dialogs carried out based on certain specifications and guidelines. PIPs lie at the bottom of a hierarchy headed by clusters and segments. Clusters represent fundamental business process groups. Clusters are further broken down into segments, which represent interenterprise processes involving different types of trading partners. Segments consist of PIPs that define specific processes. For example, Cluster 3 is for order management, and it includes a Segment A that pertains to quotes and order entry. This segment has seven published PIPs, including 3A1 (Request Quote), 3A2 (Request Price and Availability), and 3A3 (Transfer Shopping Cart). 
Each PIP comes with a message guideline and XML document type definition (DTD). Dictionaries, which define data elements, come in three flavors: Business Dictionary, IT Dictionary, and EC Dictionary. Business data entities and properties are defined in the Business Dictionary, the IT Dictionary defines IT products and properties, and the EC dictionary defines components and their properties. All these elements are mapped to codification standards such as UN/SPSC. One of the fundamental requirements for meaningful data exchange and efficient information processing for products and services is commonly accepted codification standards. Fortunately, RosettaNet supports three widely accepted codification standards. The Data Universal Numbering System (DUNS) is maintained by Dun & Bradstreet and identifies a business and its location. The Global Trade Item Number (GTIN) identifies products, and the United Nations/Standard Products and Services Code (UN/SPSC) robustly and comprehensively classifies products and services. Another fundamental requirement for meaningful data exchange and information processing is a communications protocol. The RosettaNet Implementation Framework (RNIF) adequately covers this need for communication standards. The framework defines open exchange protocols and guidelines for communications between applications on networks. These specifications encompass various requirements such as message packing and the transfer of PIP objects between Web or browser servers; they incorporate protocols such as Common Gateway Interface (CGI), HTTP, and Secure Sockets Layer (SSL). The RNIF also supports digital signatures, digital certificates, and SSL to ensure business transactions are secure..." See "RosettaNet."
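The cluster/segment/PIP hierarchy described above is visible in identifiers such as 3A1 and 3A2. The following sketch splits such a code into its three parts; the pattern (digits, one uppercase letter, digits) is inferred only from the examples quoted in the article, and `PipCode` and `parse_pip_code` are hypothetical names rather than anything defined by RosettaNet.

```python
import re
from typing import NamedTuple


class PipCode(NamedTuple):
    cluster: int   # e.g. 3 = order management (per the article)
    segment: str   # e.g. "A" = quotes and order entry (per the article)
    pip: int       # sequence number of the PIP within the segment


# Assumed shape of a PIP identifier, based on the article's examples.
_PIP_RE = re.compile(r"^(\d+)([A-Z])(\d+)$")


def parse_pip_code(code):
    """Split an identifier like '3A2' into cluster, segment, and PIP number."""
    match = _PIP_RE.match(code)
    if match is None:
        raise ValueError(f"not a recognizable PIP code: {code!r}")
    return PipCode(int(match.group(1)), match.group(2), int(match.group(3)))
```

For instance, `parse_pip_code("3A2")` yields cluster 3, segment "A", PIP 2, matching the article's "Request Price and Availability" example.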

  • [May 08, 2001] "Slicing the Enterprise Pie. Portal Developers Partnering With Integration Vendors to Make Software More Transaction-Oriented." By John S. McCright. In eWEEK (May 07, 2001). "Portal developers Octopus Software Inc. and DataChannel Inc. are making their software more transaction-oriented through upgrades and partnerships with EAI vendors. The goal of products coming from both companies is to provide users with a front-end presentation layer with which to view and manipulate a broader slice of corporate applications. Octopus last week introduced its namesake platform for building so-called Meta Applications, which enable nontechnical business users to create customized views of data from multiple sources. The Octopus Platform uses specialized adapters and a drag-and-drop user interface to view fine-grained data in Extensible Markup Language, messaging systems, enterprise resource planning applications, and other enterprise and legacy software. The platform also gives users the ability to write business logic and rules to create dynamic relationships between data coming from various systems, said Octopus CEO Stephen Douty, in Palo Alto, Calif. In this way, Meta Applications enable users to weave together data and processes from existing applications to form new applications. Separately, DataChannel, of Bellevue, Wash., this week will introduce its DCS (DataChannel Server) Extension Kit for EAI (Enterprise Application Integration). The SDK (software development kit) lets companies tap into a broader range of enterprise applications through its DCS portal. The SDK enables DCS to integrate with any platform that supports asynchronous messaging through adapters. New adapters from SeeBeyond Technology Corp. and Vitria Technology Inc. will extend DCS to 125 more applications and databases, DataChannel officials said. 
Although the extension kit will be used by IT managers, DCS 5.0, due late this summer, will feature a new user interface that will allow nontechnical people to do drag-and-drop editing of their portal Web pages. Version 5.0 will also add an application server, additional EAI adapters, stronger versioning and workflow for document management, and the ability for users to have multiple virtual workspaces for collaboration, officials said. A Shared Object Repository will enhance Version 5.0's process integration capabilities, officials said..."

  • [May 08, 2001] "Enabling Access to Online Digital Services: IMS Digital Repositories Technical Specifications Group." By Kevin Riley. In Syllabus Magazine Volume 14, Number 10 (May 2001), pages 16-18. ['A look at the process of setting standards and specifications to support interoperability of digital repositories.'] The author surveys the goals of the IMS Digital Repositories Group, discusses the IMS specification process, and summarizes the key IMS specifications. The IMS Digital Repositories Work Group was established in February 2001, and scheduled its first meeting for May 7-9 in Lund, Sweden. The article provides a table listing the seven (7) IMS specifications already published and three (3) specifications under development. "The group spans user communities, server-side technology providers, publishers, and middleware infrastructure vendors. Group members include EdNA (representing DETYA in Australia), Fretwell-Downing, GIUNTI (Italy), IOS Press (Netherlands), Oracle, Sun, TEMASEK (Singapore), UKOLN (participants in the UK Distributed Network of Electronic Resources Program), and the University of California at Berkeley and University of Wisconsin from the NSDL program. Others are coming on board as the group gets under way. The work of the group falls into two categories: (1) Integration of e-learning with existing online digital services; (2) Development of novel repository technology to support the configuration, presentation, and delivery of learning objects required for learner-centric learning to become a reality. The diversity of offerings under the umbrella of online digital services reflects a wide range of content formats, existing implemented systems, technologies, and established practice. However, given the investment made in their development, it is impractical even to consider a solution that requires their re-implementation on a short-to-medium-term timeframe. 
Rather, the group will focus on common functions, which can be used across services to enable them to present a common interface. These common functions encompass desirable and necessary features such as authentication, authorization, enrollment, search, location and retrieval, IPR management, user preferences, and profiling, payment, and search gateways across services. Learning Object Repositories share all of the above (either directly or via the LMS they serve), but also have the added dimension of supporting contextualized sequencing and navigation -- and potentially, dynamic branding of objects to a service at runtime. The group intends to construct a generic functional architecture and then define specific application profiles through that architecture to meet the needs of each of the services identified above. The functions will then be prioritized to identify the order in which they will be put through the IMS specification process. In addition to the specification work, linked R&D projects are being set up across Australia, Europe, and the U.S. that will support pilot implementations of the technology adopted, both as initial proof of concept and testing of the robustness of the emerging specifications." ["IMS Global Learning Consortium, Inc. (IMS) is developing and promoting open specifications for facilitating online distributed learning activities such as locating and using educational content, tracking learner progress, reporting learner performance, and exchanging student records between administrative systems. IMS has two key goals: (1) Defining the technical specifications for interoperability of applications and services in distributed learning, and (2) supporting the incorporation of the IMS specifications into products and services worldwide. 
IMS endeavors to promote the widespread adoption of specifications that will allow distributed learning environments and content from multiple authors to work together (in technical parlance, 'interoperate'). IMS uses XML as its current binding, and XML-Schema as its primary XML control document language. The IMS XML Bindings and the list of IMS specifications are available for download. Specifications materials include: IMS Content Packaging Specification, IMS Learning Resource Meta-data Specification, IMS Question and Test Specification, IMS Enterprise Specification, IMS Meta-data Specification, IMS Reusable Competencies Definition Information Model Specification, IMS Learner Information Package Specification, etc."] See: (1) "IMS Metadata Specification", and (2) the recent IMS announcement.

  • [May 08, 2001] "Pushing the SCORM Envelope. The Role of XML, Content Management Systems, And Dynamic Delivery in ADL-SCORM." By Jeff Larsen, Jeff Katzman, and Jeff Caton. Peer3 company white paper. December 12, 2000. 12 pages. "The Advanced Distributed Learning initiative (ADL) emerged this year as a focal point for eLearning standards. Its Shareable Content Object Reference Model (SCORM) 1.0 technical specifications gained widespread acceptance and implementation among government, commercial, and academic circles. SCORM represents the integration of all leading eLearning standards (AICC, IMS, IEEE, and soon Microsoft's LRN) to create a unified standard. SCORM seeks to enable reuse of Web-based content across multiple environments and products, as well as provide a means for individualized eLearning. The goals of ADL are laudable. By promoting a digital knowledge network based on reusable objects and individualized learning, ADL believes it can help reduce the cost of instruction by 30-60%; reduce the time of instruction by 20-40%; increase the effectiveness of instruction by 30%; increase student knowledge and performance by 10-30%; and improve organization efficiency and productivity. Further, the vision of ADL is consistent with that of many thought leaders in the eLearning and Knowledge Management industries - mainly, that true interchange of learning objects across disparate Learning Management Systems (LMS) will require adherence to accepted standards for describing learning taxonomies, course information, and course packaging. However, we believe that SCORM must address three fundamental issues before the goals of ADL can be fully realized. These issues can be posed as the following three questions: (1) Will XML be prescribed as the data format for learning content itself? (2) Will a standard methodology be specified for integrating Content Management Systems with Learning Management Systems? (3) Will dynamic delivery of content objects be supported? 
True reusability of learning objects requires a data format that separates content from its presentation; this fundamental requirement is met by XML. Learning Management Systems (LMS) provide only part of the solution for eLearning; XML authoring, Content Management Systems (CMS), and dynamic delivery round out the technologies necessary to complete the ADL vision. As participants in the Technical Working Group for SCORM, Peer3 remains committed to supporting the ADL and the evolution of these important standards. Peer3 was the only vendor to present a commercially available eLearning solution for XML authoring, content management, and dynamic delivery at the first ADL PlugFest earlier this year. Now Peer3, in collaboration with other eLearning-oriented CMS vendors, is promoting the recognition of this distinct product category as well as changes to the SCORM that will result in open standards for XML-based eLearning content..." See: (1) "Shareable Courseware Object Reference Model Initiative (SCORM)", and (2) Advanced Distributed Learning Initiative.

  • [May 08, 2001] "Converting from XML Schema data types to SQL data types." By Jasmin Wason. 2001-05-08 or later. [XML-DEV post: 'Here is a link to a table of possible mappings between XML Schema and SQL data types. This is based on the idea that a relational database schema has been generated from an XML Schema. The appropriate SQL data types should be used in the database so that data from conforming instance documents can be stored. The table is very much under construction and any comments or criticisms would be most welcome.'] "The XML Schema definition language is the new W3C Recommendation for describing the structure of XML documents. The specification consists of two parts, XML Schema Part 1: Structures and XML Schema Part 2: Datatypes. The rich data type and structural support of XML Schema makes it a good candidate for automatic conversion to a database schema, and the original XML Schema Requirements document specifies a type system adequate for import/export from database systems. Other features such as the ability to define default values, scoped unique values, keys and relationships can also be employed for use with relational databases. The following table describes a possible mapping between XML Schema and SQL data types. The table is still under construction. Any comments concerning its content are most welcome and should be sent to Jasmin Wason..."
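
The kind of type mapping described in this abstract might be sketched as a lookup table. The pairs below are illustrative assumptions only and do not reproduce Wason's table, which is not quoted here; the fallback type, the `VARCHAR(255)` length, and the names `XSD_TO_SQL`, `sql_type_for`, and `column_definition` are likewise hypothetical.

```python
# Assumed, simplified mapping from XML Schema built-in datatypes (local names)
# to generic SQL column types. Real mappings depend on the target database
# and on facets such as maxLength or totalDigits.
XSD_TO_SQL = {
    "string": "VARCHAR(255)",
    "boolean": "BOOLEAN",
    "short": "SMALLINT",
    "int": "INTEGER",
    "long": "BIGINT",
    "decimal": "DECIMAL",
    "float": "REAL",
    "double": "DOUBLE PRECISION",
    "date": "DATE",
    "time": "TIME",
    "dateTime": "TIMESTAMP",
}

# Assumption: unmapped types are stored as text rather than rejected.
FALLBACK_SQL_TYPE = "VARCHAR(255)"


def sql_type_for(xsd_type: str) -> str:
    """Map a type name such as 'xs:dateTime' (prefix optional) to a SQL type."""
    local_name = xsd_type.split(":", 1)[-1]  # drop any namespace prefix
    return XSD_TO_SQL.get(local_name, FALLBACK_SQL_TYPE)


def column_definition(name: str, xsd_type: str, required: bool = True) -> str:
    """Build one column clause; an optional element becomes a nullable column."""
    null_clause = " NOT NULL" if required else ""
    return f"{name} {sql_type_for(xsd_type)}{null_clause}"
```

For example, `column_definition("price", "xs:decimal")` produces the clause `price DECIMAL NOT NULL`. A fuller treatment would also translate the defaults, keys, and uniqueness constraints the abstract mentions into the corresponding SQL constraints.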

  • [May 08, 2001] "XML Databases Gain Momentum." By L. Scott Tillett. In InternetWeek (May 07, 2001), pages 10-11. "As companies turn to XML as a common language for conducting intercompany business and as organizations publish more content using XML, IT shops are warming up to using specialized XML databases to manage content. When XML database developer Ipedo launches this week with a repository for XML content, it will join a host of such vendors that have emerged in recent months, including B-Bop, Ixia and X-Hive. Longtime vendor Software AG has offered a native XML database product since 1999. IT services firm ProLogic Inc. began testing Ipedo's XML Database to manage content for a Defense Department project. The project focuses on digitizing technical manuals, such as those used to repair helicopters. The manuals, called interactive electronic technical manuals (IETMs), enable repair technicians to take notebook computers instead of thick repair books with them to the hangars when they work on aircraft... Storing commonly used documents in an XML database saves having to translate documents from their native formats as they're needed. That usually requires custom JavaScript code. XML databases could also help users overcome a fundamenta