Cover Pages: Markup Languages: Theory and Practice. Volume 3, Number 1: Table of Contents

This document contains an annotated Table of Contents for Markup Languages: Theory and Practice, Volume 3, Number 1 ('Winter 2001'). Published 2002. Markup Languages: Theory and Practice is published by the MIT Press and edited by B. Tommie Usdin and C.M. Sperberg-McQueen. It is a "peer-reviewed journal devoted to research, development, and practical applications of text markup for computer processing, management, manipulation, and display. Specific areas of interest include: new syntaxes for generic markup languages; refinements to existing markup languages; theory of formal languages as applied to document markup; systems for mark-up; uses of markup for printing, hypertext, electronic display, content analysis, information reuse and repurposing, search and retrieval, and interchange; shared applications of markup languages; and techniques and methodologies for developing markup languages and applications of markup languages."

See further information on MLTP in: (1) the journal publication statement, (2) the overview in the serials document, Markup Languages: Theory & Practice; and in (3) the journal description document. Current subscription information is also available on the MIT Press web site, where updated information on forthcoming issues is provided.

Listing of contributions in MLTP issue 3/1:

Markup's Current Imbalance [Paul Caton]
More Than One DTD [Robin Cover]
The Relationship Between General and Specific DTDs: Criticizing TEI Critical Editions [David J. Birnbaum]
SGML: The Next Generation (Forecast #1) [Arnold M. Slotnik]
The Death of XML Editors -- And The Next-Generation Client [Jan Christian Herlitz]
SGML: The Next Generation (Forecast #2) [Arnold M. Slotnik]
OASIS XSLT/XPath Conformance Testing [David Marston]
SGML: An Historical Perspective [Arnold M. Slotnik]
A Simple Property Set for Contract Architectural Forms [Sam Hunting]
Path Predicate Calculus: Towards a Logic Formalism for Multimedia XML Query Languages [Peiya Liu, Amit Chakraborty, and Liang H. Hsu]
Complexity of Context-Free Grammars with Exceptions and the Inadequacy of Grammars as Models for XML and SGML [Romeo Rizzi]
Review of Building Oracle XML Applications [Lauren Wood]

[CR: 20020305]

Caton, Paul. "Markup's Current Imbalance." [COMMENTARY AND OPINION] Markup Languages: Theory & Practice 3/1 (Winter 2001) 1-13 (with 19 references). ISSN: 1099-6622, E-ISSN: 1537-2626 [MIT Press]. Author's affiliation: Project Analyst and Electronic Publications Editor, Brown University Scholarly Technology Group; Email: paul@mail.stg.brown.edu; WWW.

Abstract: "Every text realizes a communicative act, but most contemporary text encoding ignores this. Betraying its unwitting subjection to a persuasive formalist ideology, current descriptive markup offers a static, document-oriented view of texts that occludes their temporal and performative nature."

Excerpt: "One way of looking at text has shaped contemporary encoding more than any other, the so-called OHCO (ordered hierarchy of content objects) view: text as an ordered hierarchy of content objects. I'm going to discuss the reasons for OHCO's persuasiveness, point out some of its limitations, and suggest an alternative view that challenges us to encode documents with a greater appreciation of their communicative function and a broader conception of our mediating role... OHCO-1 [formula #1] markup captures mainly transactional aspects of the text. If we are dealing with a document where the overall intent is a given (like a purchase order) and only the factual details are important (part number, quantity, etc.), then transactional markup is sufficient. Encoding interactional aspects of written texts, however, demands an account of intention. As Bach and Harnish, following Grice and, later, Strawson, point out, in any communicative act 'part of the speaker's intention is that the hearer identify the very act the speaker intends to be performing, and successful communication requires fulfillment of that intention'. Encoders looking to mark up signs of the message sender's intentions could, for example, mark the sequence of speech acts, using a taxonomy like that created by Bach and Harnish. At a more detailed linguistic level, encoders could map the mood structure of clauses to reveal what Halliday refers to as the 'interpersonal function'. Register might be another linguistic marker to note. Whatever analytical perspective they choose, encoders must accept that by encoding they intervene in the transmission of the message. Indeed, they should not just accept it, but draw attention to their interpretive role and the value it adds... We needn't abandon OHCO-1 encoding in particular or SGML-based encoding in general to do these things.
At one point in their revision of OHCO-1 Renear, Mylonas and Durand assert that while there are a number of possible analytical perspectives rather than a single OHCO, 'the objects that are determined by these various analytical perspectives seem to organize themselves, without exception, into hierarchies'. Although that turns out not to be strictly true, hierarchical classification is still a powerful hermeneutic tool which SGML syntax represents well. Again, the problem lies not with the form of encoding but with the narrowness of our focus. We have let the attractions and benefits of one approach lull us into complacency, and complicity with its notion of text... In the world of humanities text encoding projects Claus Huitfeldt has eloquently expressed dissatisfaction with the OHCO encoding model and developed for the Wittgenstein Archives an encoding system, mecs (Multi-Element Code System), which he characterises 'in contrast to SGML, as a descriptive, bottom-up approach to text analysis, not presupposing a hierarchical structure of texts'. The bottom-up metaphor is apt. When OHCO-1 encourages encoders to see a written text as a thing, they stay above the content and only drop down to engage with the text as message to identify the occasional editorial object whose nature is not obvious from its appearance. But when encoders see the written text as a communicative act, they must participate in the act: take on the role of hearer, attend to what the text says, and identify the speaker's intentions not just from the words' semantics but also from the attitudes conveyed. Metaphorically, encoders must be down at what would be the lowest level of an OHCO tree, completely immersed in the #PCDATA, because content generates interactional encoding far more than any content object. As its practitioners well know, all encoding interprets, all encoding mediates. There is no 'pure' reading experience to sully.
We don't carry messages, we reproduce them -- a very different kind of involvement. We are not neutral; by encoding a written text we become part of the communicative act it represents. I believe we can play a more active, more interesting part than we currently do..."

Related: "Markup's Current Imbalance," presented at Extreme Markup Languages 2000, Montréal, August 2000. "Broadly, markup schemes create two kinds of elements for textual content: those which are 'structural' and those which capture facts such as names, dates, etc. The theoretical justification of this approach lies in the claim of DeRose et al. (1990) that text is an ordered hierarchy of content objects (OHCO). But OHCO concentrates on the visible part of the textual iceberg; there is a lurking danger in what is kept from view. The more we train ourselves to describe texts as inert structures, the less we train ourselves to recognize and analyze rhetorical strategies. The more we use one particular approach to markup without exploring alternatives, the greater the risk that we end up thinking we know an elephant because we can see its tail..."

Some background documents:

"Markup Systems and the Future of Scholarly Text Processing." By James H. Coombs (Brown University), Allen H. Renear (Brown University), and Steven J. DeRose (Electronic Book Technologies). In Communications of the ACM 30 (November 1987), pages 933-947. [Bibliographic Reference]
"What is Text, Really?" By Steve DeRose, David Durand, Elli Mylonas, and Allen Renear. In Journal of Computing in Higher Education 1/2 (Winter 1990), pages 3-26.
"Refining our Notion of What Text Really Is: The Problem of Overlapping Hierarchies." By Allen Renear (Brown University), Elli Mylonas (Harvard University), and David Durand (Boston University). January 6, 1993. "...The counterexamples to the different versions of the OHCO thesis also arise in actual encoding projects..." [Bibliographic Reference]
"Markup Systems in the Present." By Steven J. DeRose. Pages 119-135 in The Digital Word: Text-Based Computing in the Humanities (1993).
"What Should Markup Really Be? Applying Theories of Text to the Design of Markup Systems." By David G. Durand, Steven J. DeRose, and Elli Mylonas. Paper presented at ALLC/ACH '96 (June 25 - 29, 1996, University of Bergen, Norway). "Perspectives explain various implicit presuppositions of the simple hierarchical approach. In this paper, we use these theoretical results to examine how the basic notions of hierarchical markup should be extended to allow a more expressive and accurate approach to document markup." [Bibliographic Reference, cache]
"Theory and Metatheory in the Development of Text Encoding" [Philosophy and Electronic Publishing]. In Monist Interactive 1996.
"Renear's Target Paper" on theoretical/philosophical issues in text encoding, from Forum 'Philosophy and Electronic Publishing: Theory and Metatheory in the Development of Text Encoding', with Michael Biggs, and Claus Huitfeldt. Monist Interactive 80:3 (1997).
"Meaning and Interpretation of Markup." By C. Michael Sperberg-McQueen, Claus Huitfeldt, and Allen Renear. In Markup Languages: Theory & Practice 2/3 (Summer 2000) 215-234 (with 17 references).
"The Descriptive/Procedural Distinction is Flawed." By Allen Renear. In Markup Languages: Theory and Practice 2/4 (Fall 2000), pages 411-420. Related presentation at Extreme Markup Languages 2000, Montréal, August 2000. ["The traditional distinction between descriptive and procedural markup is flawed. It conflates questions of mood (indicative vs. imperative statements about a document) and domain (the kinds of objects named in those statements). It also fails to describe adequately the use of markup by authors rather than by later encoders. An adequate markup taxonomy must, among other things, incorporate distinctions such as those developed in contemporary 'speech-act theory'."]

[CR: 20020305]

Cover, Robin. "More Than One DTD." [SQUIB] Markup Languages: Theory & Practice 3/1 (Winter 2001) 14-16. ISSN: 1099-6622, E-ISSN: 1537-2626 [MIT Press]. Author's affiliation: OASIS and ISOGEN International, LLC; Email: robin@isogen.com.

A semi-serious note written in MLTP's Squibical genre reminding authors that no style manual whatever authorizes the use of the apostrophe as a generalized orthographic tool for constructing plurals in the English language. The note offers several theories to explain why authors sometimes write DTD's for the plural absolute instead of DTDs. The form DTD's would have to be singular possessive or a contraction for "DTD is." Some people fear that the gratuitous use of the apostrophe will bring civilization to a cataclysmic end through "Apostrolypse" (alternately, "Apostroluge"), as explained in the online note "Infrequently Asked Questions Concerning the Proper Spelling of 'DTD' in its Plural Form."

[CR: 20020305]

Birnbaum, David J. "The Relationship Between General and Specific DTDs: Criticizing TEI Critical Editions." [ARTICLE] Markup Languages: Theory & Practice 3/1 (Winter 2001) 17-53 (with 13 references). ISSN: 1099-6622, E-ISSN: 1537-2626 [MIT Press]. Author's affiliation: Associate Professor and Chair of the Department of Slavic Languages and Literatures, University of Pittsburgh. Email: djbpitt+@pitt.edu; WWW.

Abstract: "The present study discusses the advantages and disadvantages of general vs specific DTDs at different stages in the life of an SGML document based on the example of support for textual critical editions in the TEI. These issues are related to the question of when to use elements, attributes, or data content to represent information in SGML and XML documents, and the article identifies several ways in which these decisions control both the degree of structural control and validation during authoring and the generality of the DTDs. It then offers three strategies for reconciling the need for general DTDs for some purposes and specific DTDs for others. All three strategies require no non-SGML structural validation and ultimately produce fully TEI-conformant output. The issues under consideration are relevant not only for the preparation of textual critical editions, but also for other element-vs-attribute decisions and general design issues pertaining to broad and flexible DTDs, such as those employed by the TEI."

Conclusion: "Any of the three strategies discussed above (processing a modified TEI DTD with respect to TEIform attribute values, transformation of a custom DTD to a TEI structure, and architectural forms) provides a solution to the issues posed by a score-like edition. Specifically, these strategies all permit much greater structural control than is available in the standard TEI DTDs, rely entirely on SGML for all validation, and produce a final document that is fully TEI-conformant."

Related: An earlier version of this paper was presented at Extreme Markup Languages 2000, Montréal, August 2000; this paper is available online.

[CR: 20020305]

Slotnik, Arnold M. "SGML: The Next Generation [Forecast #1]." [SQUIB] Markup Languages: Theory & Practice 3/1 (Winter 2001) 54. ISSN: 1099-6622, E-ISSN: 1537-2626 [MIT Press].

Results of a survey asking "What do you see in the future for Standard Generalized Markup Language?"

[CR: 20020305]

Herlitz, Jan Christian. "The Death of XML Editors -- And The Next-Generation Client." [COMMENTARY AND OPINION] Markup Languages: Theory & Practice 3/1 (Winter 2001) 55-63. ISSN: 1099-6622, E-ISSN: 1537-2626 [MIT Press]. Author's affiliation: R&D Manager, Excosoft AB. Email: herlitz@excosoft.se.

Abstract: "This paper explores three questions that are of interest to software companies developing XML editors. First, do we need XML editors? It can be argued that a spreadsheet tool like Microsoft Excel will be an XML editor when it saves data in XML. The corollary of this argument is that the current crop of commercially available XML editors -- for example, Adept, XMetaL, and Documentor -- will be spreadsheet tools when they can read Excel files. Such arguments are not sensible and quickly lead to the conclusion that when XML becomes the standard storage format it will no longer be useful to talk about XML editors. The next generation of XML editors will be clients! Second, does standardization limit development of XML clients or is it a driving force? If XML clients are compelled to handle paragraphs, tables, and equations (and only these), and if styling is limited to CSS and XSL, how will different client manufacturers compete? Will there be any incentive for creativity and innovation? Vendor-specific styling is a prerequisite for market-driven development of XML clients. Third, should XML clients be free? No, customers will pay for services and they don't care whether those services are performed by a client or a server. In fact, they cannot tell."

Conclusion: "We believe XML will become the standard storage format. XML editors will die, and be replaced by clients. Innovation and creativity will be delivered through vendor specific style sheets. Clients will be sold on a pay-per-view or subscription basis."

Related: A presentation given at Markup Technologies '99. "The death of XML editors: And the birth of useful editors." - "What is an XML editor? When Excel saves its data in XML format, will it be an XML editor? When Adept, XMetaL, and Documentor can read Excel files, will they have become spreadsheet tools? My point here is that we don't really have a good name for the XML editors of today, and it will become more and more confusing as XML becomes the standard storage format. An editor should be described in other terms! Will editors be free? Is there a way for (not huge) product development companies to earn their living? Is standardization limiting development or is it a driving force? If all XML editors must handle paragraphs, tables and equations (and only that) and the styling is limited to CSS and XSL, how can they compete, and what will be the place for creative ideas? Are there things that should be standardized and things that shouldn't?"

The Excosoft XML Client was previously sold under the product name Documentor. Excosoft XML Client is a "tool that enables non-technical business users to create and update XML content; it offers support for open standards such as XML, XSLT, and WebDAV."

[CR: 20020305]

Slotnik, Arnold M. "SGML: The Next Generation [Forecast #2]." [SQUIB] Markup Languages: Theory & Practice 3/1 (Winter 2001) 64. ISSN: 1099-6622, E-ISSN: 1537-2626 [MIT Press].

QDA (Quantum Document Architecture) is an application of the concepts of SGML to the problem of document flexibility and reuse.

[CR: 20020305]

Marston, David. "OASIS XSLT/XPath Conformance Testing." [STANDARDS REPORT] Markup Languages: Theory & Practice 3/1 (Winter 2001) 65-71. ISSN: 1099-6622, E-ISSN: 1537-2626 [MIT Press]. Author's affiliation: IBM/Lotus Development Corporation. Email: David_Marston@lotus.com.

Abstract: "This article describes the work of the OASIS Technical Committee on XSLT/XPath Conformance Testing. It examines the verbiage of two W3C (World Wide Web Consortium) Recommendations and describes some complexities that arise in interpretation. The deliberate grants of developer discretion are acknowledged, and the impact on testing is discussed. Expectations for contributed test cases are given."

Conclusion: "The W3C Working Group that wrote the Recommendations was careful to state both lower and upper limits on acceptable processor behavior and the XSLT instruction set. XSLT stylesheets should be portable across different XSLT processors, and the creator of a stylesheet, whether software or human, should be assured of this portability. Vendor-neutral conformance testing enforces the independence of stylesheet content and processor behavior that is essential if XSLT is to realize its potential."

See references to W3C specifications for XPath and XSLT. Also for the OASIS XSLT/XPath Conformance Technical Committee, the XSLT/XPath Test Suite Review Tracking Mechanism (Tests Sorted by Status), and the mailing list archives for 'xslt-conformance'.

[CR: 20020305]

Slotnik, Arnold M. "SGML: An Historical Perspective." [SQUIB] Markup Languages: Theory & Practice 3/1 (Winter 2001) 72. ISSN: 1099-6622, E-ISSN: 1537-2626 [MIT Press].

Selection from an ongoing research project to identify the true roots and genealogy of ISO 8879:1986 (Standard Generalized Markup Language).

[CR: 20020305]

Hunting, Sam. "A Simple Property Set for Contract Architectural Forms." [ARTICLE] Markup Languages: Theory & Practice 3/1 (Winter 2001) 73-92 (with 14 references). ISSN: 1099-6622, E-ISSN: 1537-2626 [MIT Press]. Author's affiliation: Email sam_hunting@yahoo.com; also info@etopicality.com.

Abstract: "The contract can be represented by a property set (ISO 10744) that is historically grounded, results- rather than process-oriented, and meets requirements for equity, portability, and verifiability for all notation processors (sentient or not). A set of architectural forms is presented that conforms to the property set."

Because the contract is ubiquitous in commercial life (and thus in life), applications for a contract property set are almost too numerous to be worth mentioning. Therefore, I will simply list a few here: (1) On-line, ready-to-use, boilerplate contracts; (2) Specification for conversion operations; (3) Lending equity to XSL transforms; (4) Electronic commerce; (5) Enterprise modeling; and (6) Semantic overlays to legacy procedural code. These applications may well require different architectures conforming to the contract property set.

Conclusion: "Contracts, because of their power and ubiquity, seem a natural target for an international standards effort using property sets. Property sets provide a simple and very powerful mechanism for representing such complex, real-world relationships."

This paper was originally presented under the title "Architectural Forms in Legal Contracts" at the Metastructures 1998 Conference in Montréal. Abstract: "Legal agreements are ubiquitous and powerful documents that exhibit endless variations on common themes. In complex machine-mediated transactions, such as may occur on the Web, contracts need to express the relationships and contingencies they establish in a machine-readable fashion. SGML and XML information architectures for contracts will be needed, and architectural forms can be used as metastructures that will provide both flexibility and unambiguously interchangeable semantics. Following both the modern 'programming by contract' movement and 19th Century formalisms in contract law, the essential components of a contract are found to be: client, supplier, preconditions, postconditions and invariants. For the sake of simplicity, these components are developed and elaborated using a Property Set. Potential applications include electronic commerce, workflow, and the potential disintermediation of the legal community from routine transactions. Limitations and extensions of the formalism are discussed. The appropriate forum for development of a simple contract architecture would be transnational and non-profit." [August 18, 1998]
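
The five components named in this abstract (client, supplier, preconditions, postconditions, invariants) can be pictured with a small sketch. The Python below is purely illustrative: the `Contract` class, its field names, the `deliver` action, and the purchase example are assumptions made for this note, and they are not the property set or the architectural forms defined in the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, float]
Condition = Callable[[State], bool]


@dataclass
class Contract:
    """Hypothetical model holding the five components listed in the abstract."""
    client: str
    supplier: str
    preconditions: List[Condition] = field(default_factory=list)
    postconditions: List[Condition] = field(default_factory=list)
    invariants: List[Condition] = field(default_factory=list)

    def perform(self, state: State, action: Callable[[State], State]) -> State:
        """Run the supplier's action, checking conditions before and after it."""
        if not all(check(state) for check in self.preconditions + self.invariants):
            raise ValueError("preconditions or invariants not met before the action")
        result = action(dict(state))  # work on a copy so the input is untouched
        if not all(check(result) for check in self.postconditions + self.invariants):
            raise ValueError("postconditions or invariants not met after the action")
        return result


def deliver(state: State) -> State:
    """Illustrative supplier action: exchange payment for goods."""
    state["client_funds"] -= state["price"]
    state["goods_delivered"] = 1
    return state


order = Contract(
    client="Buyer",
    supplier="Seller",
    preconditions=[lambda s: s["client_funds"] >= s["price"]],
    postconditions=[lambda s: s["goods_delivered"] == 1],
    invariants=[lambda s: s["client_funds"] >= 0],
)

final = order.perform(
    {"client_funds": 100.0, "price": 40.0, "goods_delivered": 0}, deliver
)
```

In this sketch the client is responsible for the preconditions and the supplier for the postconditions, which mirrors the "programming by contract" reading mentioned in the abstract; how the paper's property set distributes these obligations is not shown here.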

General references: "Architectural Forms and SGML/XML Architectures."

[CR: 20020305]

Liu, Peiya; Chakraborty, Amit; Hsu, Liang H. "Path Predicate Calculus: Towards a Logic Formalism for Multimedia XML Query Languages." [ARTICLE] Markup Languages: Theory & Practice 3/1 (Winter 2001) 93-106 (with 22 references). ISSN: 1099-6622, E-ISSN: 1537-2626 [MIT Press]. Authors' affiliation: Siemens Corporate Research, Inc.

Abstract: "Many document query languages are currently proposed for specifying document retrieval. But the formalisms for document query languages are still underdeveloped. An adequate formalism is critical for query language development and standardization. Classical formalisms, relational algebra and relational calculus, are used to evaluate the expressive power and completeness of relational query languages. Most relational query languages embed within them either one or a combination of these classical formalisms. However, these formalisms cannot be directly used for tree document query languages due to different underlying data models. In this paper, we propose a logic formalism, called path predicate calculus, based on a tree document model and paths for querying XML. In the path predicate calculus, the atomic logic formulas are element predicates rather than relation predicates as in relational calculus. In this path predicate calculus, queries are equivalent to finding all proofs of the existential closure of logical assertions in the form of path predicates that document elements must satisfy."

Excerpt:

Many useful XML applications require an expressive document query language to support structured multimedia information retrieval. These document retrieval applications occur in many industries. In structured multimedia document queries, not only document content but also document structures must be available for retrieval. The document content may include both static/spatial media (such as text, graphics, drawings, images, etc.) and timebased media (such as video, audio, animation, etc.). The content can be further organized into three major document structures: hierarchical, hyperlinked, and scheduled (including both temporal and spatial). Designing an expressive document query language to cover these many aspects is becoming a challenge to the document community. Query language formalisms are essential for evaluating the expressive power of any proposed query language and for advancing query language design.

In the past, two formalisms have often been used for describing query languages in relational models: (1) an algebraic formalism, called relational algebra, and (2) a logic formalism, called relational calculus, including tuple relational calculus and domain relational calculus. In the algebraic formalism, the queries are expressed by applying special algebraic operators on relations. In the logic formalism, the queries are expressed by describing predicates that relation tuples in the answer must satisfy. A formalism could also be used to serve as a vehicle for evaluating the expressive power and limitations of proposed query languages. Most modern relational query languages embed within them one of these formalisms to specify queries. Calculus-based relational query languages often provide higher-level declarative characteristics than algebraic languages. However, because their underlying data models differ from the document model, these formalisms for relational query languages cannot be directly used as formalisms for tree document query languages...

The queries specified in path predicate calculus can be applied directly on the XML document model. They are expressed by writing a logical formula that document elements must satisfy. In path predicate calculus, the atomic logic formulae are element predicates for asserting logic statements about document elements in a document tree. This paper will show that many document query operations, such as tree selection, tree join, spatial/temporal operations, etc., can be expressed in such a logic formalism. Thus, document-centered queries can be expressed and studied. The relational calculus is a special case of this logic form, which applies to 'flat' data-oriented documents when element predicates are degenerated into relational predicates, as in relational models. The logic approach has several advantages. It provides 'non-procedurability' of document queries. Algebraic approaches often need to explicitly describe the order of operations on underlying data models to express the queries. The logic formalism provides a higher level notion to express queries since it is based on logical computation in query processing to find all proofs of the existential closure of logic query statements. The path predicate approach can also directly work on the XML document model rather than a specific data model of documents.

The main contributions of this paper are (1) to provide a logic-based formalism, called path predicate calculus, for document query languages. We feel that this direction of research is important to advance query language design, development and standardization. Historically, relational query languages have a similar development path. Calculus-based relational query languages can provide more declarative characteristics than algebraic languages. (2) To formalize document paths and element address in a first-order predicate form for logic-based query languages. Element predicates and path formulas are used to assert logical truth statements about document elements in a document tree for specifying logic-based queries. These predicates and path formulas can also be viewed as a generalized version of relational predicates in relational models when applying to XML tree document model.
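
As a rough intuition for queries stated as predicates that document elements and their paths must satisfy, the Python sketch below enumerates every root-to-element path in a small XML tree and keeps the paths for which all predicates hold. It is not the authors' calculus or notation: the sample document, the `paths` and `query` helpers, and the three predicates are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterator, List, Tuple

Path = Tuple[ET.Element, ...]          # root ... target element
Predicate = Callable[[Path], bool]

# Hypothetical sample document mixing text and time-based media.
DOC = """<manual>
  <section id="s1"><title>Pump</title><video dur="30"/></section>
  <section id="s2"><title>Valve</title><video dur="90"/><image/></section>
</manual>"""


def paths(node: ET.Element, prefix: Path = ()) -> Iterator[Path]:
    """Yield the path from the root to every element in document order."""
    current = prefix + (node,)
    yield current
    for child in node:
        yield from paths(child, current)


def query(root: ET.Element, *predicates: Predicate) -> List[Path]:
    """Return every path that satisfies the conjunction of the predicates."""
    return [p for p in paths(root) if all(pred(p) for pred in predicates)]


def is_video(path: Path) -> bool:
    return path[-1].tag == "video"


def inside_section(path: Path) -> bool:
    return any(e.tag == "section" for e in path[:-1])


def longer_than_60(path: Path) -> bool:
    return int(path[-1].get("dur", "0")) > 60


root = ET.fromstring(DOC)
hits = query(root, is_video, inside_section, longer_than_60)
# Report the enclosing section id and the duration for each match.
result = [(p[-2].get("id"), p[-1].get("dur")) for p in hits]
```

The query states what must be true of an answer and leaves the traversal order to `query`, which is the declarative flavour the excerpt contrasts with algebraic approaches; the exhaustive enumeration used here is only the simplest way to evaluate it.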

Related: "Path predicate calculus: Towards a logic formalism for multimedia XML query languages." By Peiya Liu, Amit Chakraborty, and Liang H. Hsu (Siemens Corporate Research, Inc.) Presented at Extreme Markup Languages 2000, Montréal, August 2000. General references: "XML and Query Languages."

[CR: 20020305]

Rizzi, Romeo. "Complexity of Context-Free Grammars with Exceptions and the Inadequacy of Grammars as Models for XML and SGML." [ARTICLE] Markup Languages: Theory & Practice 3/1 (Winter 2001) 107-116 (with 19 references). ISSN: 1099-6622, E-ISSN: 1537-2626 [MIT Press]. Author's affiliation: Facoltà di Scienze, Dipartimento di Informatica e Telecomunicazioni, Università degli Studi di Trento. Email: romeo@science.unitn.it; WWW.

Abstract: "The Standard Generalized Markup Language (SGML) and the Extensible Markup Language (XML) allow authors to better transmit the semantics in their documents by explicitly specifying the relevant structures in a document or class of documents by means of document type definitions (DTDs). Several authors have proposed to regard DTDs as extended context-free grammars expressed in a notation similar to extended Backus-Naur form. In addition, the SGML standard allows the semantics of content models (the right-hand side of productions) to be modified by exceptions. Inclusion exceptions allow named elements to appear anywhere within the content of a content model, and exclusion exceptions preclude named elements from appearing in the content of a content model. Since XML does not allow exceptions, the problem of exception removal has received much interest recently. Motivated by this, Kilpeläinen and Wood have proved that exceptions do not increase the expressive power of extended context-free grammars and that for each DTD with exceptions, we can obtain a structurally equivalent extended context-free grammar. Since their argument was based on an exponential simulation, they also conjectured that an exponential blow-up in the size of the grammar is a necessary devil when purging exceptions away. We prove their conjecture under the most realistic assumption that NP-complete problems do not admit non-uniform polynomial-time algorithms. Kilpeläinen and Wood also asked whether the parsing problem for extended context-free grammars with exceptions admits efficient algorithmic solution. We show the NP-completeness of the very basic problem: given a string w and a context-free grammar G (not even extended) with exclusion exceptions (no inclusion exceptions needed), decide whether w belongs to the language generated by G.
Our results and arguments point up the limitations of using extended context-free grammars as a model of SGML, especially when one is interested in understanding issues related to exceptions."
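
To make the notion of an exclusion exception concrete, the Python sketch below checks an already-parsed element tree against a table saying which element names are barred anywhere inside a given element. The `EXCLUSIONS` table, the `violations` helper, and the sample document are hypothetical. The sketch only illustrates how an exclusion is inherited by nested content; it does not address the membership problem for strings that the abstract shows to be NP-complete.

```python
import xml.etree.ElementTree as ET
from typing import Dict, FrozenSet, List

# Hypothetical exclusion table: element name -> names barred anywhere in its content.
EXCLUSIONS: Dict[str, FrozenSet[str]] = {
    "footnote": frozenset({"footnote"}),
    "title": frozenset({"table", "footnote"}),
}


def violations(
    node: ET.Element,
    barred: FrozenSet[str] = frozenset(),
    trail: str = "",
) -> List[str]:
    """Return the paths of elements that appear where an ancestor excludes them."""
    here = f"{trail}/{node.tag}"
    found = [here] if node.tag in barred else []
    # Exclusions declared on this element apply to all of its descendants.
    inherited = barred | EXCLUSIONS.get(node.tag, frozenset())
    for child in node:
        found.extend(violations(child, inherited, here))
    return found


doc = ET.fromstring(
    "<doc>"
    "<title>A<footnote>n</footnote></title>"
    "<p><footnote><p><footnote/></p></footnote></p>"
    "</doc>"
)
problems = violations(doc)
```

The second violation is nested two levels below the excluding `footnote`, which shows why an exclusion cannot be checked by looking at a single content model in isolation.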

Final remarks: "The role of XML is to allow documents to be served, received, and processed on the Web. Even though it is now clear that context-free grammars with exceptions have their limits in modeling SGML, the ideas in the proof of Theorem 3 [above] can possibly help in understanding which aspects or ingredients of the exception mechanism should definitely not be included, for efficiency reasons, into the XML standard. Or, at least, we do not know of other formal steps or partial results on this front. Indeed, the use of exceptions begins to appear controversial even for authors, in that, although exceptions are useful and even handy at first, they add significantly to the complexity of authoring DTDs as their size and complexity grows. Even if this phenomenon had been sometimes denied, or attributed to poor style in the use of exceptions, it is shown in [Matzen 1999], on the basis of an empirical analysis, that the complexity of some DTDs is approaching (or has exceeded) manageable limits given existing tools for designing and understanding them, and it is nowadays commonly believed that only partial solutions to this problem can be attempted; see Matzen 1999, Matzen/Hedrick 1998, and Matzen/Hedrick 1997. Our results now provide a formal justification of the occurrence of these problems which are known to imply high costs for DTD design and corresponding problems with quality."

A related paper was published as IRST Technical Report 0101-05, Istituto Trentino di Cultura, January 2001 (December 2000: Centro per La Ricerca Scientifica e Tecnologica, Istituto Trentino di Cultura). See the original Postscript and the online abstract. [cache]

[CR: 20020305]

Wood, Lauren. "Review of Building Oracle XML Applications." [REVIEW] Markup Languages: Theory & Practice 3/1 (Winter 2001) 117-118. ISSN: 1099-6622, E-ISSN: 1537-2626 [MIT Press]. Author's affiliation: Director of Product Technology, SoftQuad Software Inc. Email: lauren@sqwest.bc.ca.

"Steve Muench has written a very useful book for developers who need to understand how to work with XML and an Oracle database. Oracle provides several XML tools, and the book goes into them in some detail (including lots of examples). This is not a book about the pros and cons of using relational databases to store XML; it is, however, a book that developers who work with XML in the context of a relational database will often turn to, at least when they are starting to put it all together. XML is not just the 30-page specification it started out to be; the family of XML-related specifications has been extended to include specifications for transformation, for disambiguating element type names, and for a standard API to the various objects that make up an XML document or datum. Oracle has software tools that implement several XML-related specifications, and these tools are discussed in the book. Oracle has also defined its own XML-related technologies, such as XSQL, which are given equally detailed treatment. Java is the programming language most heavily used for the examples (along with PL/SQL, Oracle's proprietary language)... I would strongly recommend that developers who use Oracle's XML-related tools have a copy of this book to hand. Developers working with other relational databases would probably also benefit from having some access."

The book: Building Oracle XML Applications, by Steve Muench. Includes a CD-ROM with Oracle JDeveloper 3.1 for Windows NT/2000. Published by O'Reilly & Associates, Inc. ISBN: 1-56592-691-9. Price US$ 44.95.



Published Issues of MLTP - Tables of Contents