Markup Languages: Theory and Practice. Volume 1, Number 3: Table of Contents
Overview
This document contains an annotated Table of Contents for Markup Languages: Theory and Practice, Volume 1, Number 3 (Summer 1999). Markup Languages: Theory and Practice (ISSN: 1099-6621) is published by MIT Press Journals. Editors in Chief for MLTP are B. Tommie Usdin (Mulberry Technologies, Inc.) and C. M. Sperberg-McQueen (University of Illinois/Chicago). A journal description with an overview of the Editorial Structure is provided in a separate document. See also the annotated Table of Contents for Volume 1, Number 1 (Winter 1999) and Volume 1, Number 2 (Spring 1999).
Annotated Table of Contents
[CR: 19991002]
Smith, Joan M.; Usdin, B. Tommie; Sperberg-McQueen, C. Michael. "Interview with Joan Smith." [COMMENTARY and OPINION] Markup Languages: Theory & Practice 1/3 (Summer 1999) 1-6. ISSN: 1099-6621 [MIT Press]. Authors' affiliations: [Smith:] Chairman, SGML Technologies Group; [Usdin:] Mulberry Technologies, WWW; [Sperberg-McQueen:] University of Illinois at Chicago, WWW.
The editors-in-chief of Markup Languages: Theory & Practice interview Joan Smith, who was instrumental in promoting SGML, especially in Europe.
"Joan Smith is Chairman of the SGML Technologies Group of pan-European companies, with subsidiaries in Brussels and Luxembourg; the largest group of companies specializing in SGML in Europe. She founded the International SGML Users' Group, has written numerous books and papers on SGML, and has organized conferences on SGML. She has recently been accepted as a Freeman of the Worshipful Company of Information Technologists of the City of London. She is a Fellow of the British Computer Society, a Member of the Institute of Directors, and was the first European to receive the GCA's Tekkie award and later the GCA's International SGML Award."
Note: Some of Joan Smith's publications are referenced in the main bibliography reference collection. A larger number of Smith's publications -- over ninety-nine (99)! -- are referenced in the larger print bibliography for SGML and related standards.
[CR: 19991002]
Lubell, Joshua. "Structured Markup on the Web: A Tale of Two Sites." [ARTICLE] Markup Languages: Theory & Practice 1/3 (Summer 1999) 7-22 (with 20 references). ISSN: 1099-6621 [MIT Press].
Author's affiliation: National Institute of Standards and Technology, 100 Bureau Drive, Stop 8260, Gaithersburg, MD 20899-8260, USA. Tel: +1 (301) 975-3563. Email: lubell@cme.nist.gov; WEB http://www.nist.gov/msidstaff/lubell.htm.
"Businesses and organizations are increasingly finding that HTML (Hyper-Text Markup Language)
offers no help whatsoever in managing the information on their web sites. SGML (Standard Generalized
Markup Language) provides the flexibility and reuse lacking in HTML. However, SGML alone does not
address the problems involved in maintaining online document repositories. Although traditional database
management systems are clumsy at managing hyperlinked documents, a system combining SGML, database
technology, and the protocols of the Web can provide a reasonably robust environment for developing and
maintaining a web site. Two possible site designs employing SGML are discussed and evaluated with respect to
a set of design objectives and choices. The likely impact of the emerging XML (Extensible Markup Language)
standard on web site design is also discussed."
"Sites 1 and 2 illustrate a dilemma facing today's web site developers who want to take advantage of the benefits of SGML. On the one hand, they can rely heavily on SGML's ability to represent data in an application-specific, structured manner and on CGI to dynamically generate browser-ready web output in response to SGML database queries. While such a site design enables users to quickly find information through application-specific queries and is easier to maintain than a collection of HTML documents, it requires extra effort on the part of content providers, additional server overhead, and the implementation of hyperlinking if links to off-site web pages are desired. On the other hand, web site developers may choose to minimize the burden on content providers and to maximize server performance, interoperability with web search engines, and linkage with other web sites. In this case, they must sacrifice application-specific structured query capability and implement tools for managing entities and maintaining hyperlinks. The emerging XML standards promise to provide web site developers with the best of both worlds, allowing them to enjoy most of the benefits of SGML while not sacrificing the convenience of HTML and interoperability with the rest of the Web. If XML is ultimately successful, not only will it be easier for web site developers to use SGML, but also they will be able to take advantage of newly available capabilities to make their content easier for users to read and easier for web clients and other desktop applications to interpret."
More information about the work discussed in this paper is available on the
Internet at http://www.nist.gov/apde.
[Received 23 June 1998. Revised 21 October 1998.]
A related version of this publication is available online.
Note: Lubell's work on XML includes the PSL (Process Specification Language) Project. Preliminary findings describing how the PSL semantic concepts may be mapped to the eXtensible Markup Language (XML) are now available. For references, see "Process Specification Language (PSL) and XML."
[CR: 19991002]
Cameron, Robert D. "REX: XML Shallow Parsing with Regular Expressions." [ARTICLE] Markup Languages: Theory & Practice 1/3 (Summer 1999) 61-88 (with 5 references, 3 appendices). ISSN: 1099-6621 [MIT Press].
Author's affiliation: Professor, School of Computing Science at Simon Fraser University; Associate Dean of the Faculty of Applied Sciences, SFU.
"The syntax of XML is simple enough that it is possible to parse an XML document into a list of its
markup and text items using a single regular expression. Such a shallow parse of an XML document can be very
useful for the construction of a variety of lightweight XML processing tools. However, complex regular
expressions can be difficult to construct and even more difficult to read. Using a form of literate programming
for regular expressions, this paper documents a set of XML shallow parsing expressions that can be used as a
basis for simple, correct, efficient, robust and language-independent XML shallow parsing. Complete shallow
parser implementations of less than 50 lines each in Perl, JavaScript and Lex/Flex are also given."
[From the conclusion:] "The simplicity of the shallow parsing model based on regular expressions suggests some interesting possible directions for development of XML. First of all, a shallow parsing representation such as that produced by REX could be a useful reference representation for a revised XML specification. Such a reference representation would have the advantage of providing a language-independent approach to shallow parsing encoded in the standard, with a language-independent implementation framework based on regular expressions. Furthermore, it may be possible to relax certain XML restrictions that can be easily accommodated by regular-expression processing, such as the restriction that attribute values must always be quoted. However, possibilities such as these must be carefully weighed by the overall XML development community."
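The shallow-parsing idea described in the abstract can be illustrated with a much-simplified sketch in Python. This is not Cameron's REX expression: the pattern below is an approximation written for this note, it assumes no DOCTYPE internal subset and performs no error recovery, and the names `SHALLOW` and `shallow_parse` are hypothetical.

```python
import re

# Simplified alternatives, tried in order. NOT the REX expression from the paper.
_ALTERNATIVES = [
    r"<!--.*?-->",                          # comment
    r"<\?.*?\?>",                           # processing instruction
    r"<!\[CDATA\[.*?\]\]>",                 # CDATA section
    r"""<(?:[^<>"']|"[^"]*"|'[^']*')*>""",  # tag or simple declaration; quoted values may contain '>'
    r"[^<]+",                               # character data
]
SHALLOW = re.compile("|".join(_ALTERNATIVES), re.DOTALL)


def shallow_parse(document):
    """Split an XML string into a flat list of markup and text items."""
    return SHALLOW.findall(document)


doc = '<a href="x>y">hi<!-- c > d --><b/></a>'
items = shallow_parse(doc)
print(items)
```

For input that the pattern covers completely, joining the items reproduces the original string, which is what makes a shallow parse convenient for lightweight filters that rewrite some items and pass the rest through unchanged.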
[CR: 19991002]
Mikheev, Andrei; Grover, Claire; Moens, Marc. "XML Tools And Architecture for Named Entity Recognition." [ARTICLE] Markup Languages: Theory & Practice 1/3 (Summer 1999) 89-113 (with 13 references). ISSN: 1099-6621 [MIT Press].
Authors' affiliation: University of Edinburgh, HCRC Language Technology Group. 2 Buccleuch Place, Edinburgh EH8 9LW, UK. Email: [Mikheev:] mikheev@harlequin.co.uk; [Grover:] C.Grover@ed.ac.uk; [Moens:] M.Moens@ed.ac.uk.
"Named Entity recognition involves identifying expressions which refer to (for example) people,
organizations, locations, or artifacts in texts. This paper reports on the development of a Named Entity
recognition system developed fully within the XML paradigm. In the section 'Named Entity recognition' we
describe the nature of the Named Entity recognition task and the complexities involved. The system we
developed was entered as part of a DARPA-sponsored competition, and we will
briefly describe the nature of that competition.
We then give an overview of the design philosophy behind our Named Entity
recognition system and describe the various XML tools that were used both in the
development of the system and that make up the runtime system (section 'LTG
text handling tools'), and give a detailed description of how these tools were used
to recognize temporal and numerical expressions (section 'TIMEX, NUMEX')
and names of people, organizations and locations (section 'ENAMEX'). We conclude
with a description of the results we achieved in the competition, and
how these compare to other systems (section 'Conclusion'), and give details on
the availability of the system (section 'Availability')."
[System description:] "One of the design features of the system which sets it apart from other Named Entity recognition systems is that it is designed fully within the SGML paradigm: the system is composed of several tools which are connected via a pipeline with data encoded in SGML or XML. This allows the same tool to apply different strategies to different parts of the texts using different resources. The tools do not convert from SGML into an internal format and back, but operate at the SGML or XML level. Our system does not rely heavily on lists or gazetteers but instead treats information from such lists as 'likely' and concentrates on finding contexts in which such likely expressions are definite. In fact, the first phase of the enamex analysis uses virtually no lists but still achieves substantial recall. The system is document centered. This means that at each stage the system makes decisions according to a confidence level that is specific to that processing stage, and draws on information from other parts of the document. The system is hybrid, applying symbolic rules and statistical partial matching techniques in an interleaved fashion. A runtime version of the system described here is available for free at http://www.ltg.ed.ac.uk/software/ne/.
We also have a set of tools which can be used to develop a Named Entity recognition system. The tool suite is called LT TTT, and is available from http://www.ltg.ed.ac.uk/software/ttt/. LT TTT consists of lttok, ltstop and fsgmatch, a number of resource files for tokenization, for end-of-sentence disambiguation, and for the recognition of temporal expressions, and tools for extending these resource grammars or for creating new ones. It also has a visual interface which uses XSL style sheets to render the XML Named Entity annotation in a form that is easier to inspect. The part of speech tagger is available as a separate tool. See http://www.ltg.ed.ac.uk/software/pos/."
[Received 6 March 1999, Accepted 26 May 1999.]
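The pipeline design described above, in which each tool reads XML and writes XML without converting to an internal format, can be sketched in Python as follows. This is only an illustration and not the LTG software: the stage `mark_years`, the driver `run_pipeline`, the `DOC` and `S` element names, the `TYPE="DATE"` attribute, and the crude year pattern are all assumptions made for this sketch.

```python
import re
import xml.etree.ElementTree as ET

YEAR = re.compile(r"\b(?:19|20)\d{2}\b")  # crude stand-in for a temporal grammar


def mark_years(sentence):
    """Wrap year-like tokens in the leading text of `sentence` in TIMEX elements."""
    text = sentence.text or ""
    previous = None  # most recently inserted TIMEX element
    last_end = 0     # end offset of the previous match in `text`
    for index, match in enumerate(YEAR.finditer(text)):
        before = text[last_end:match.start()]
        if previous is None:
            sentence.text = before
        else:
            previous.tail = before
        timex = ET.Element("TIMEX", {"TYPE": "DATE"})
        timex.text = match.group()
        sentence.insert(index, timex)
        previous, last_end = timex, match.end()
    if previous is not None:
        previous.tail = text[last_end:]


def run_pipeline(xml_text, stages):
    """Apply each stage in turn to every S element; data stays XML throughout."""
    root = ET.fromstring(xml_text)
    for stage in stages:
        for sentence in list(root.iter("S")):
            stage(sentence)
    return ET.tostring(root, encoding="unicode")


result = run_pipeline(
    "<DOC><S>Revised in 1998 and published in 1999.</S></DOC>", [mark_years]
)
print(result)
```

Because every stage takes and returns annotated XML, further stages (for example one that marks names) could be appended to the list without changing the driver.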
[CR: 19991002]
Tidwell, Doug. "IBM's TaskGuide: An XML-Based System for Creating Wizard-Style Helps." [PROJECT REPORT] Markup Languages: Theory & Practice 1/3 (Summer 1999) 23-39. ISSN: 1099-6621 [MIT Press].
Author's affiliation: Advisory Programmer, IBM Corporation, Human Interface Group. E20D/500, P.O. Box 12195, Research Triangle Park, NC 27709. Tel: +1 (919) 254-5128; FAX +1 (919) 543-4118; Email: dtidwell@us.ibm.com.
"Wizards have been a part of workstation products since the early 1990s. A wizard
is a task-oriented dialog that guides the user through a given task, automating as
much of that task as possible. A typical wizard panel has a graphic area on the
left, a set of navigation buttons on the bottom, and an area on the right that
contains any text and controls needed for the task at hand."
"IBM's TaskGuide technology gives Technical Writers and Human Factors professionals the ability to create wizards. Based on the premise that task analysis is the most difficult part of creating an effective wizard, our tools let you focus on design, not writing code. This paper discusses the basics of wizard technology, followed by a discussion of the XML-based system we have created. We cover some of the key design decisions we had to make, and introduce some of the unique features of our product. We also discuss the changes we have made to our product as technology has changed around us. Finally, we demonstrate a recursive document, a wizard that creates another wizard."
"IBM's TaskGuide technology allows technical writers to create wizard panels
without programming. These panels are created dynamically based on the
information in wizard scripts. Our approach lets wizard writers focus on the truly
difficult tasks of task analysis and technical writing, rather than on the mundane
aspects of programming a graphical interface. As our technology has grown over
time, the basic skills learned to create wizards with our first driver are still useful
and effective today."
[Received 3 July 1998.]
[CR: 19991002]
Catteau, Tom. "An SGML System for the Budget of the European Union." [PROJECT REPORT] Markup Languages: Theory & Practice 1/3 (Summer 1999) 41-59 (with 3 references). ISSN: 1099-6621 [MIT Press].
Author's affiliation: Software Engineer, SGML Technologies Group. 29 Boulevard Général Wahis, B-1030 Brussels, Belgium. Email: tct@sgmltech.com; WEB http://www.sgmltech.com. Tel: +32 2 705 70 21; FAX +32 2 705 81 01.
"In this paper, the system used for the editorial process of the European Union's budget is described, both from a functional and a technical point of view. It will be shown how the choice of SGML as the key technology has had an impact on the overall architecture as well as on individual modules which constitute the system. The description is based on the current status of the system. Future developments are discussed briefly."
"The editorial process of the budget of the European Union is an annual, on-going process in which different players such as authors, translators, reviewers and a printer all operate in a common environment to enter, translate, and review data needed to produce the budget. The budget itself is published on paper and on the Web. The system, designed to fulfill requirements for the timely delivery of high-quality documents, together with short production times, and hence minimized costs, is entirely SGML-based. It has evolved to a complete and mature production environment. In this paper an overview of the architecture of the system is given as well as a description of the rationale behind the key technical choices that were made. It highlights certain aspects of SGML, such as concurrency and links, which are explained by illustrating their use in the budget application. The need for reliability and stability is shown to have led to a client/server system in which SGML acts as the backbone of the modules which govern the production workflow. These modules communicate with each other through SGML-formatted messages. This application has been made possible through the use of a full-featured SGML parser and an associated application language that combine to make a powerful SGML engine. In a final section, future developments, some of which are currently being developed, are briefly discussed."
[Received 23 June 1998. Revised 5 August 1998. Accepted 27 July 1998.]
See "The European Union's Budget: SGML Used to its Full Potential." by Tom Catteau. In Conference Proceedings of SGML '97, pages 645-653. Other research papers from the group are available.
[CR: 19991002]
Graham, Tony. "Whither &?" [SQUIB] Markup Languages: Theory & Practice 1/3 (Summer 1999) 40. ISSN: 1099-6621 [MIT Press].
Author's affiliation: Mulberry Technologies; Home Page.
"The declarations for predefined & and < entities provided in section 4.6, Predefined Entities, of the XML Recommendation may be confusing at first sight because the leading ampersand in each numeric character reference is itself escaped as a complete numeric character reference." [shows how <!ENTITY my-amp "&#38;"> will eventually yield strings like "AT&T" (internally) in an application after reparsing...]
[CR: 19991002]
Piez, Wendell. "Review of The XML Companion, by Neil Bradley." [BOOK REVIEW] Markup Languages: Theory & Practice 1/3 (Summer 1999) 114. ISSN: 1099-6621 [MIT Press].
Author's affiliation: Mulberry Technologies; WWW.
"Neil Bradley has been working with generic markup applications for over ten years; his offering, The XML Companion, benefits accordingly. His treatment covers the same range of issues as other overviews, but the text itself is refreshingly free of statements of unanchored principle (what XML 'should' be) and
prognostication, instead presenting the actual state of things and concentrating on
what is known by markup practitioners to work. Likewise, he is much more accurate
and forthright than many other general references in indicating which technologies
are stable (for example, the DTD syntax of XML 1.0 is not subject to
change and will not suddenly be replaced by 'XML-Data', even while a new
schema language is in the works) and which are soft or still under development
(like XSL). He is also more consistently successful in exposing core ideas, rather
than depending on examples (plucked from wherever) to be self-explanatory. . ."
References: The XML Companion. Harlow, Essex: Addison Wesley Longman, 1998. Extent: 464 pages. ISBN: 0-201-41999-8.
[CR: 19991002]
Piez, Wendell. "Review of XML: The Annotated Specification, by Bob DuCharme." [BOOK REVIEW] Markup Languages: Theory & Practice 1/3 (Summer 1999) 115. ISSN: 1099-6621 [MIT Press].
Author's affiliation: Mulberry Technologies; WWW.
"XML: The Annotated Specification is the shortest and most manageable of the books under review, and the quality of information in it is good; its scope is also narrower. Unlike the other books, it is not a general reference; Bob DuCharme concentrates exclusively on the syntax of XML languages (both instance and DTD syntaxes) as defined in the February 1998 W3C Recommendation (which appears in the book verbatim, intermixed with commentary). DuCharme, while not a member of the committee that wrote the specification itself, was party to discussions about its design when it was in progress, and is thus in a good position to present an interpretation without compromising the specification's 'actual meaning'. This book will be of greatest interest and most benefit, naturally, to the technical user who has a reason to be concerned with details of the standard itself, rather than with one or another implementation or application of it. . ."
References: Bob DuCharme. XML: The Annotated Specification. The Charles F. Goldfarb Series on Open Information Management. The Definitive XML Series from Charles F. Goldfarb. Upper Saddle River, NJ: Prentice Hall PTR, 1999. Extent: xx + 339 pages. ISBN: 0-13-082676-6.
[CR: 19991002]
Piez, Wendell. "Review of XML In Plain English, by Sandra E. Eddy." [BOOK REVIEW] Markup Languages: Theory & Practice 1/3 (Summer 1999) 116. ISSN: 1099-6621 [MIT Press].
Author's affiliation: Mulberry Technologies; WWW.
"XML In Plain English is a digest of information from available specifications presented in directory form, so that one could, for example, look up 'children' in the XML Syntax section and find out how the XML Specification uses the term. Included are sections on XML Syntax (information derived from the February 1998 XML Specification), XLink and XPointer (1998 Working Drafts), Cascading Style Sheets (CSS1 and CSS2), the DSSSL-O subset of DSSSL (August 1998), Appendixes on Unicode and XML Editors and Utilities, and a Glossary. . ."
[CR: 19991002]
Piez, Wendell. "Review of The XML Black Book, by Natanya Pitts-Moultis and Cheryl Kirk." [BOOK REVIEW] Markup Languages: Theory & Practice 1/3 (Summer 1999) 117. ISSN: 1099-6621 [MIT Press].
Author's affiliation: Mulberry Technologies; WWW.
"Pitts-Moultis and Kirk's XML Black Book, billed as a 'comprehensive reference', tries to cover the full range of XML-related issues. It contains six parts, variously approaching high- and low-level problems of document modeling, system design and implementation, style sheet technologies, application development and so on. Within these parts the chapters, with titles like 'Implementing XML in a Corporate Environment' or 'Creating Content in XML', each contain an 'In Depth' and an 'Immediate Solutions' section. . ."
Also in this issue of MLTP: