Markup Languages: Theory and Practice. Volume 1, Number 1: Table of Contents
This document contains an annotated Table of Contents for Markup Languages: Theory and Practice, Volume 1, Number 1 (Winter 1999). The inaugural issue of MLTP begins with an introductory commentary by the editors C. Michael Sperberg-McQueen and B. Tommie Usdin, welcoming the readership and outlining the objectives of the new journal. The lead feature article by Steve DeRose and Andries van Dam charts the course for a future generation of intelligent hypertext software - by taking a look back at the FRESS system and some other computer-aided hypertext systems built in the 1960s. An article by Richard Matzen analyzes the problems of DTD complexity in the case of DTDs that use exceptions, proposing models and software tools to deal with these issues. An article by José Carlos Ramalho and colleagues at the University of Minho, Portugal, addresses what is now recognized as a serious limitation in SGML(-based) markup languages: that content validity in marked-up documents is at risk because these markup languages can neither formally express nor validate primitive 'ontological' and relational semantics governing the encoded information. Alan Karben presents a wonderful case study and success story involving the content-reuse system of The Wall Street Journal Interactive Edition, which makes extensive use of SGML and XML. Lauren Wood 'describes the motivation behind the work on the W3C DOM, as well as the rationale behind some of the design decisions.' In a book review essay entitled "Structure Rules!," Chet Ensign weighs in on the side of DTDs that 'Matter After All.' Shorter review articles and squibs include Tony Graham's "Character Set Refresher" and annotated Tables of Contents by Deborah A. Lapeyre supporting Chet Ensign's book review essay.
[March 09, 1999] Another review: "A Tool for Building Digital Libraries." By Martha Anderson. In D-Lib Magazine Volume 5, Number 2 (February 1999). Martha Anderson (Senior Digital Conversion Specialist, National Digital Library Program, Library of Congress) provides a Journal Review article for Markup Languages: Theory & Practice, Volume 1, Number 1 (1998, MIT Press), edited by C.M. Sperberg-McQueen and B. Tommie Usdin. "With detailed technical descriptions and the assumption that readers already know and use markup, Markup Languages promises to be a source of solid practical experience and thought provoking ideas. The journal will be a welcome arrival in the mailboxes of those who have rolled up their shirtsleeves and are up to their elbows in the nitty-gritty work of making data accessible and usable."
[May 28, 1999] See also now the annotated Table of Contents for Markup Languages: Theory and Practice, Volume 1, Number 2 (Spring 1999).
Sperberg-McQueen, C. Michael; Usdin, B. Tommie. "Welcome to Markup Languages: Theory & Practice." Markup Languages: Theory & Practice 1/1 (Winter 1999) 1-6. ISSN: 1099-6621 [MIT Press]. Authors' affiliation: [Sperberg-McQueen:] Senior Research Programmer, University of Illinois at Chicago; Email: firstname.lastname@example.org; [Usdin:] President, Mulberry Technologies Inc.; Email: email@example.com; WWW: http://www.mulberrytech.com.
Abstract: "In this introductory 'Commentary and Opinion' essay, the editors of the journal describe why they and publisher decided to start the journal, and what they hope to accomplish."
'Markup Languages: Theory & Practice is a peer-reviewed technical journal publishing papers on research, development, and practical applications of text markup for computer processing, management, manipulation, and/or display. The scope of the journal includes: 1) design and refinement of systems for text markup and document processing; 2) specific text markup languages; 3) theory of markup design and use; 4) applications of text markup; 5) languages for the manipulation of marked up text.'
"The scope of the journal is wide enough to include current and future markup applications but is designed to limit the subject scope sufficiently to make the journal coherent. As may be seen, the journal is not limited to SGML and XML and their applications, though we believe them to be markup languages of considerable interest. SGML was not the first, and XML is unlikely to be the last, language of their kind; we hope this journal will prove a useful forum for discussions of design and implementation issues relating to markup languages present, past, and future. We hope Markup Languages: Theory & Practice will be equally hospitable to articles on theory and articles on practice. In the field of markup languages, theoretical questions may have immediate and obvious practical implications, and practical problems often raise profound and important theoretical issues. The best theorists continually learn from practical experience; the best implementers realize that there is 'nothing so practical as a good theory'."
"Markup Languages: Theory & Practice will include material of a variety of categories, including: 1) articles: especially on theoretical and practical aspects of markup or markup usage; 2) announcements: describing events or activities, especially future events likely to be of interest to our readers; 3) commentary and opinion: essays, such as this one, consisting primarily of the authors' opinions; 4) practice notes: discussions of common practice, suggestions for improved standard practice, or comparisons of methods for achieving similar goals; 5) project reports: descriptions of a project or application; 6) reviews: discussion and description of books, software, web sites, etc. that may take the form of essays, short narrative reviews, or annotated tables of contents; 7) squibs: short (from one to a few pages) statements of fact, descriptions of problems, or anecdotes; 8) standards reports: discussions of any of the ever growing set of standards relating to markup."
DeRose, Steven J; van Dam, Andries. "Document Structure and Markup in the FRESS Hypertext System [Alias: 'The Lost Books of Hypertext.']." Markup Languages: Theory & Practice 1/1 (Winter 1999) 7-32 (with 60 references). ISSN: 1099-6621 [MIT Press]. Authors' affiliation: [DeRose:] Inso Corporation and Brown University; Email: Steven_DeRose@Brown.edu; WWW: http://www.stg.brown.edu/~sjd; Tel +1 (401) 863-3690; FAX +1 (401) 863-9313; [van Dam:] Thomas J. Watson, Jr., University Professor of Technology and Education, and Professor of Computer Science, Brown University, Providence, RI; Email: firstname.lastname@example.org.
Abstract: "The earliest computer-aided hypertext systems were built in the 1960s, and (unlike some of the most popular later systems) fully integrated it with their hypertext functionality. Brown University's FRESS was the first hypertext system to run on commercial hardware and OS. It actually handled complex documents better than non-hypertext systems, and so was used as a publishing system as well as a collaborative hypertext environment for teaching, research, and development. FRESS had considerable support for document structuring and markup, affording separation of structure from formatting and hypertext semantics. It also provided a variety of coordinated views and a very powerful conditional-structure and view-specification mechanisms that suited it for many tasks still considered hard today: dynamic document assembly, structured information retrieval, and on-the-fly customization of even very large documents for the user, display device, and context. This paper gives an overview of FRESS's design approach especially with regard to its treatment of markup and structure; it discusses some ways that document structures differ from other familiar information structures; and argues that a sophisticated model of document structure is necessary to realize fully the potential of hypertext."
Summary: "Current hypertext (or synonymously 'hypermedia') systems have revolutionized our computing environment. Nevertheless some of the most widely used ones lack some effective capabilities provided in first generation systems such as Augment and FRESS, particularly with regard to document structure. In this article we have discussed the particulars of FRESS, and the markup, structure, and hyperlinking models it implemented, in hopes of showing how they can still benefit hypermedia systems today. . . . These examples, we think, show that an effective hypertext system need not sacrifice display sophistication or support for document structure in exchange for linking, but can and should exceed the capabilities of non-hypertextual word processors. Systems that do not support large structured document nodes, and integrate that support fully with their linking models, cannot do this effectively."
"Current hypermedia system designers would do well to re-examine the insights of first-generation systems and take advantage of features that have proven useful (of course improving on them as well). Some of the innovations of the earliest hypertext systems remain available or are even standard now; among these are undo and explainers, both introduced by FRESS. But others equally useful are now rare (at least in commercial as opposed to research systems). These may include bidirectionality, typed links, keyword-based content and link filtering, alternate views, links that control their destination context and formatting specifications, and virtual and structured documents and links. Only when hypertext systems address the full range of complexities of real-life documents and their structures, will it be practical to bring pre-existing literature into the hypertextual world, or to build fully effective hypertext systems even for information newly crafted for that world."
Matzen, Richard Walter. "A New Generation of Tools for SGML." Markup Languages: Theory & Practice 1/1 (Winter 1999) 47-74 (with 21 references). ISSN: 1099-6621 [MIT Press]. Author's affiliation: Visiting Assistant Professor, Department of Computer Science, Oklahoma State University; Email: email@example.com; WWW: http://b.cs.okstate.edu/ or http://www.cs.okstate.edu/~matzen/.
Abstract: "Exceptions are used in many standard DTDs, including HTML, because they add expressive power for DTD authors. However, there is a tradeoff: although they are useful, exceptions add significantly to the complexity of DTDs. Authoring DTDs is a difficult task, and existing tools are of limited use because of the lack of a suitable formal model for exceptions. This paper describes methods for constructing a static model that completely and precisely describes DTDs with exceptions. A software tool has been written to implement the methods and to demonstrate some practical applications. Examples are shown of how the tool is used for DTD authoring, and some useful extensions of the tool are described. For one example DTD, the output of the tool is converted into a regular expression grammar. Preliminary studies indicate that general case algorithms can be developed for this conversion. This would allow existing theory for the context free languages to be used in developing SGML applications. Statistical results are shown from running the software tool on a number of industry and government DTDs and for three successive versions of HTML. The results illustrate that the complexity of DTDs in practice is approaching, or has exceeded, manageable limits with existing tools. The formal model and its applications are needed for SGML and continued development of these methods may impact the evolution of HTML, XML, and related web publishing standards. Some specific projects are proposed, where continued development of the model can result in more powerful tools and new kinds of applications for SGML."
[Conclusion: The paper provides evidence to illustrate] "the complexity of DTDs with exceptions, which in turn implies high costs for DTD design and corresponding problems with quality. These results also show that the complexity of some DTDs is approaching (or has exceeded) manageable limits given existing tools for designing and understanding them. There is clearly a need for more powerful tools for DTD design and analysis and for subsequent SGML processing. The software tool described in this paper is useful for understanding (viewing) DTDs with exceptions and for detecting errors caused by the incorrect use of exceptions. Several practical extensions of the tool are described that provide other new capabilities for DTD analysis. Because exceptions are an integral part of SGML, any generalized SGML tool must support them. There are previous theoretical results for formal language models of DTDs with exceptions ([Matzen, "Model"]; [Kilpeläinen and Wood, "SGML and Exceptions"]). However, this is the first description of an implementation, and thus it provides a foundation for a new generation of applications and tools."
"The expanded DTDs output by the software tool are a powerful extension of the model; these can be used to construct DTDs without exceptions that are pseudo-equivalent to the original DTDs with exceptions. This allows authors to design DTDs using the expressive power of exceptions while managing their side-effects. Also, the methods shown for converting DTDs with exceptions to regular expression grammars provide a powerful formal foundation, the existing theory for the context free languages, to be used in developing new kinds of SGML applications. The continued development of the methods and tools described in this paper can be a significant factor in the future success of SGML, and they would affect the evolution of HTML, XML, and other standards for the World Wide Web."
The document is available online in PDF format - "A new generation of tools for SGML." [local archive copy] See also: "SGML exceptions analysis" (results from running the prototype software tool described in "A New Tool for SGML with Applications for the World Wide Web," Proceedings of the 1998 ACM Symposium on Applied Computing, February, 1998).
Revision: Received 22 June 1998, Revised 31 July 1998.
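For readers unfamiliar with SGML exceptions, the following Python sketch may help show why they complicate DTD analysis. It is not Matzen's model or tool: the element names, the `ALLOWED` and `EXCLUDED` tables, and the `violations` function are hypothetical, and content models are reduced to unordered sets of permitted children. The point it illustrates is that an exclusion exception declared on one element applies to all of its descendants, so a tree can satisfy every local content model and still be invalid.

```python
# Hypothetical, heavily simplified DTD: element -> names permitted as children.
ALLOWED = {
    "p": {"a", "em"},
    "a": {"a", "em"},
    "em": {"a", "em"},
}

# Hypothetical exclusion exceptions: element -> names banned anywhere inside it.
EXCLUDED = {"a": frozenset({"a"})}


def violations(node, banned=frozenset(), path=""):
    """Return paths of elements that break a content model or an exclusion.

    A node is a (name, children) pair. `banned` accumulates the exclusions
    declared by every ancestor, which is what makes exceptions non-local.
    """
    name, children = node
    here = f"{path}/{name}"
    found = []
    if name in banned:
        found.append(here)
    # Exclusions declared on this element apply to all of its descendants.
    banned = banned | EXCLUDED.get(name, frozenset())
    for child in children:
        if child[0] not in ALLOWED.get(name, set()):
            found.append(f"{here}/{child[0]} (not in content model)")
        found.extend(violations(child, banned, here))
    return found


ok_tree = ("p", [("a", [("em", [])])])
nested_tree = ("p", [("a", [("em", [("a", [])])])])

print(violations(ok_tree))
print(violations(nested_tree))
```

In `nested_tree` every parent and child pair is locally permitted, yet the inner `a` is rejected because an ancestor `a` excludes it. A checker that looks only at each element's own content model would miss this.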
Ramalho, José Carlos; Rocha, Jorge Gustavo; Almeida, José João; Henriques, Pedro. "SGML Documents. Where Does Quality Go?" Markup Languages: Theory & Practice 1/1 (Winter 1999) 75-90 (with 9 references). ISSN: 1099-6621 [MIT Press]. Authors' affiliation: [Ramalho:] University of Minho, Portugal; Email: firstname.lastname@example.org; WWW: www.di.uminho.pt/~jcr; [Rocha:] Email: email@example.com; WWW: www.di.uminho.pt/~jgr; [Almeida:] Email: firstname.lastname@example.org; WWW: www.di.uminho.pt/~jj; [Henriques:] Email: email@example.com; WWW: www.di.uminho.pt/~prh.
Abstract: "Quality control in electronic publications should be one of the major concerns of everyone who is managing a big project, like a digital library. Collecting information from several different sources raises problems of quality assurance. With SGML we can solve part of the problem, structural/syntactic correctness. There are situations where pre-conditions over the information being introduced should be enforced in order to prevent the user from introducing erroneous data; we shall call this process data semantics validation. In this paper we present ways of associating a constraint language with the SGML model. We present the steps towards the implementation of that language. In the end, we present a new SGML authoring and processing model which has an extra validation task: semantic validation. We also describe some cases in which quality could be improved with this new working scheme."
[Conclusion:] "Our main concern in the work reported in this article was the improvement of quality control in SGML-based electronic processing. In this context we discussed a new SGML authoring and processing model to remedy the lack of semantic validation in the traditional SGML model. The main idea was to restrict the values that the user can enter, by associating constraints with the element definitions. This way we can minimize data incorrectness and improve document quality. Through the use of several examples we illustrated the main problems in the implementation of such semantic validation task: data normalization, type inference, and the definition of a constraint language. . ."
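The idea of attaching constraints to element definitions can be sketched roughly as follows. This is only an illustration in Python over XML, not the constraint language or processing model the authors describe: the `person`, `birth`, and `death` elements, the `CONSTRAINTS` table, and the `semantic_errors` function are all assumptions made for the example. It shows a document that is structurally correct but semantically wrong.

```python
import xml.etree.ElementTree as ET

# Hypothetical constraints keyed by element name. Each takes an element and
# returns True when its content is semantically acceptable.
CONSTRAINTS = {
    "birth": lambda e: (e.text or "").strip().isdigit(),
    "death": lambda e: (e.text or "").strip().isdigit(),
    # Relational constraint: a person cannot die before being born.
    "person": lambda e: int(e.findtext("birth")) <= int(e.findtext("death")),
}


def semantic_errors(xml_text):
    """Return the names of elements whose constraint is violated."""
    root = ET.fromstring(xml_text)  # structural (well-formedness) check
    errors = []
    for elem in root.iter():        # extra pass: semantic validation
        check = CONSTRAINTS.get(elem.tag)
        if check is not None and not check(elem):
            errors.append(elem.tag)
    return errors


good = "<person><birth>1469</birth><death>1527</death></person>"
bad = "<person><birth>1527</birth><death>1469</death></person>"

print(semantic_errors(good))
print(semantic_errors(bad))
```

Both documents have the same structure, so a purely structural check accepts both. Only the added constraint pass distinguishes them. Converting the text content to integers before comparing is a small instance of the normalization and type questions the conclusion mentions.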
Karben, Alan. "News You Can Reuse. Content Repurposing at The Wall Street Journal Interactive Edition [Project Report]." Markup Languages: Theory & Practice 1/1 (Winter 1999) 33-45. ISSN: 1099-6621 [MIT Press]. Author's affiliation: Associate Director, Interactive Development, The Wall Street Journal Interactive Edition; Email: firstname.lastname@example.org; WWW: http://wsj.com; Tel: +1 (212) 416-2975 FAX: +1 (212) 416 3291.
Abstract: "The content-reuse system of The Wall Street Journal Interactive Edition makes extensive use of SGML and XML to reorganize and reformat the content presented in the main wsj.com website. This paper discusses how the structures that define an Interactive Journal edition and its component articles are queried, processed, and converted by automatically triggered content-processors, allowing us to quickly fill requests by potential publishing partners to feature our branded content in their contexts."
[Conclusion:] '. . . All of our content-reuse processes owe their flexibility and ease of implementation to our use of SGML and XML. Articles created in SGML have been translated and served out in all sorts of flavors of HTML and other plain text formats. Edition structures and configuration files specified in XML are processed and tailored by custom software that allows our editors to specify what constitutes a mini-edition. And when our automatically generated content falls short of serving their audiences completely, an editor can step in and finish the job. . . . Our editors and designers are charged with constantly improving how our news can be accessed, navigated through, presented, and used. And our business-development staff is constantly seeking new ways to raise the visibility of our brand, which often means spreading excerpts from our trove of content out to places and platforms that our primary web site would not otherwise reach. Having our news, and the processes that direct where that news belongs, in an extensible format has proved to be the key to fulfilling their requirements.'
The document is available online in PDF format - "News you can reuse." [local archive copy]
Revision: Received 7 July 1998, Revised 12 August 1998.
Graham, Tony. "Character Set Refresher [Squib]." Markup Languages: Theory & Practice 1/1 (Winter 1999) 46. ISSN: 1099-6621 [MIT Press]. Author's affiliation: Mulberry Technologies Inc.; Email: email@example.com; WWW: http://www.mulberrytech.com.
Abstract: "It is unfortunately easy to confuse the terms that SGML uses when discussing characters and character sets. This graphic illustrates the relationships of characters, character sets, and related concepts." [Illustration and definitions for character, character repertoire, code set, character set, code set position, coded representation, character number].
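The distinction between a character, its number, and its coded representation can be seen in a few lines of Python. This sketch uses Unicode, UTF-8, and Latin-1 as stand-ins, which is an assumption of the example. The squib itself concerns SGML's terminology, and the mapping of terms in the comments is only approximate.

```python
import unicodedata

ch = "é"                          # a character: an abstract unit of text
name = unicodedata.name(ch)       # its name within the Unicode repertoire
number = ord(ch)                  # its character number (code position)
utf8 = ch.encode("utf-8")         # one coded representation: two bytes
latin1 = ch.encode("latin-1")     # another coded representation: one byte

print(name, number, utf8, latin1)
```

The same character has one number here but two different byte sequences, depending on the encoding chosen.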
Wood, Lauren. "Programming Marked-Up Documents." Markup Languages: Theory & Practice 1/1 (Winter 1999) 91-100. ISSN: 1099-6621 [MIT Press]. Author's affiliation: Technical Product Manager, SoftQuad, Inc.; Email: firstname.lastname@example.org; WWW: www.softquad.com.
Abstract: "The Document Object Model is a programming interface to HTML and XML documents. The level 1 DOM specification enables application writers to access, navigate, and manipulate the content and structure of HTML and XML documents. The paper describes the motivation behind the work on the DOM, as well as the rationale behind some of the design decisions. A precis of future work is given."
Revision: Received 29 June 1998, Revised 8 September 1998.
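The three activities named in the abstract (access, navigate, manipulate) can be illustrated with Python's standard `xml.dom.minidom` module, a minimal DOM implementation. The sample document and variable names are made up for this sketch. It is not taken from Wood's paper.

```python
from xml.dom.minidom import parseString

doc = parseString(
    "<article><title>Programming Marked-Up Documents</title></article>"
)

# Access: look up elements by tag name.
title = doc.getElementsByTagName("title")[0]

# Navigate: move from the element to its child text node and to its parent.
text = title.firstChild.data
parent_name = title.parentNode.tagName

# Manipulate: create a new element and attach it to the tree.
author = doc.createElement("author")
author.appendChild(doc.createTextNode("Lauren Wood"))
doc.documentElement.appendChild(author)

serialized = doc.documentElement.toxml()
print(serialized)
```

The calls used here (`getElementsByTagName`, `firstChild`, `parentNode`, `createElement`, `createTextNode`, `appendChild`) are the interface names defined by the DOM, so the same sequence reads much the same in other language bindings.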
Ensign, Chet. "Structure Rules! Why DTDs Matter After All." Markup Languages: Theory & Practice 1/1 (Winter 1999) 101-112. ISSN: 1099-6621 [MIT Press]. Author's affiliation: Manager, Data Architecture, Matthew Bender & Company, Inc.; Email: email@example.com; WWW: www.bender.com; Tel: +1 212-448-2466; FAX: 212-448-2469.
The first nine pages and the conclusion of this book review essay present the case for SGML/XML DTDs - despite the notion of (mere) "well-formedness" in XML. Ensign argues that the temptation to characterize XML as "SGML without the DTD" represents a misunderstanding of the intent in the XML specification.
Abstract: "Extensible Markup Language or XML, a simpler form of SGML, has introduced the concept of a well-formed document, one that doesn't need a DTD. This sounds wonderful - all the benefits of SGML without the expense or restrictions of that darned DTD. But to dismiss DTDs as arbitrary, restrictive constraints on creative freedom is to miss their role as documentation of the truly valuable parts/aspects of your content and as descriptions that enable users to build lights-out processing systems for leveraging that content. Assuming your goal is to make your content more generally useful and to make it valuable across your entire enterprise, you will quickly find that DTDs are not enemies but allies. And once you make that discovery, you'll want these two books in your library: (1) Eve Maler and Jeanne El Andaloussi, Developing SGML DTDs: From Text To Model To Markup, and (2) David Megginson, Structuring XML Documents."
The document is available online in PDF format - "Structure rules! Why DTDs matter after all." [local archive copy]
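The gap between well-formedness and DTD validity that the essay turns on can be shown with a small Python sketch. The `memo` documents and the `is_well_formed` helper are hypothetical. The standard-library parser used here checks well-formedness only and does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


sensible = "<memo><to>Editors</to><body>Hello</body></memo>"
# Well-formed, but the structure is arbitrary: two bodies, an empty <to/> last.
odd = "<memo><body>Hello</body><body>Again</body><to/></memo>"
# Not well-formed: <to> is closed by </body>.
broken = "<memo><to>Editors</body></memo>"

for sample in (sensible, odd, broken):
    print(is_well_formed(sample))
```

The parser accepts `odd` just as readily as `sensible`. Without a DTD (or some other schema) nothing records that a memo should have exactly one recipient followed by one body, which is the documentation role Ensign attributes to DTDs.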
Lapeyre, Deborah Aleyne. "Annotated Table of Contents. Eve Maler and Jeanne El Andaloussi, Developing SGML DTDs: From Text To Model To Markup." Markup Languages: Theory & Practice 1/1 (Winter 1999) 113-115. ISSN: 1099-6621 [MIT Press]. Author's affiliation: Vice President, Mulberry Technologies Inc.; Email: firstname.lastname@example.org; WWW: http://www.mulberrytech.com.
The annotated Table of Contents for Developing SGML DTDs complements the corresponding book review article by Chet Ensign, also published in this issue of Markup Languages: Theory & Practice.
Lapeyre, Deborah Aleyne. "Annotated Table of Contents. David Megginson, Structuring XML Documents." Markup Languages: Theory & Practice 1/1 (Winter 1999) 116-118. ISSN: 1099-6621 [MIT Press]. Author's affiliation: Vice President, Mulberry Technologies Inc.; Email: email@example.com; WWW: http://www.mulberrytech.com.
The annotated Table of Contents for Megginson's Structuring XML Documents complements the corresponding book review article by Chet Ensign, also published in this issue of Markup Languages: Theory & Practice.