Structuring XML Documents
by David Megginson
[Volume Description and Table of Contents]
Megginson, David. Structuring XML Documents. Charles F. Goldfarb Series on Open Information Management. [Subseries:] The Definitive XML Series from Charles F. Goldfarb. Upper Saddle River, NJ: Prentice Hall PTR, [March] 1998. Extent: xxxviii + 425 pages, CDROM. ISBN: 0-13-642299-3. Price: US $39.95.
A volume description and provisional Table of Contents for David Megginson's book Structuring XML Documents are provided below. See the full bibliography entry for a publisher's description of the work and other details; see also the "Prentice-Hall SGML Series" web page. David Megginson is the senior architect with Microstar Software Ltd., a principal in Megginson Technologies Ltd., and the design lead for SAX, the Simple API for XML, a common event-based XML API now in use by many parsers and applications. Other published works by Megginson are listed on the author's Home Page. He may be contacted by email at firstname.lastname@example.org.
Structuring XML Documents is not a beginner's tutorial on XML, but a book written at an intermediate/advanced level, designed to help application designers build XML/SGML DTDs that work in real-world document systems. The author engages rigorously with five major industry-standard DTDs -- ISO 12083, DocBook, Text-Encoding Initiative (TEI), MIL-STD-38784 (CALS), and Hypertext Markup Language (HTML 4.0) -- to illustrate how the necessary customizations and extensions can be implemented to support enterprise document processing objectives.
Structuring XML Documents is designed to help users apply XML and SGML to solve their document structuring problems. Specifically, readers will learn to: "1) analyze DTDs and adapt them for their specific processing needs; 2) build DTDs that are easier for others to learn, use, and process; 3) ensure structural compatibility throughout their collection of enterprise DTDs; 4) use the new Architectural Forms standard to simplify complex DTD problems." [adapted from the back cover] The book's primary features, according to the front cover description: "1) Covers XML and Full SGML; 2) [Provides] the Expert's Guide to DTD Development; 3) [Helps] Leverage the Power of Architectural Forms; 4) Up to Date: Based on XML 1.0; 5) Companion CD-ROM Includes State-of-the-Art DTDs Plus XML Parsing Tools."
Structuring XML Documents is organized in four major parts: Part 1: Background; Part 2: Principles of DTD Analysis; Part 3: Advanced Issues in DTD Maintenance and Design; Part 4: DTD Design with Architectural Forms. Part 1 provides the reader with a review of XML/SGML DTD syntax sufficient to support an understanding of the advanced topics treated in the remainder of the book; it also introduces the five industry DTDs that are used as models elsewhere in the book. Part 2 develops general principles for design and analysis using XML/SGML DTDs, as applicable to the collaborative work of writers, editors, and engineers. Part 3 examines advanced topics in DTD design and maintenance, including compatibility between versions of a DTD, document disassembly and reassembly, and DTD customization. Part 4, "DTD Design with Architectural Forms," illustrates the use of Architectural Forms and architecture processing relevant to SGML/XML documents, as recently standardized in the SGML Extended Facilities. The three chapters of Part 4 introduce Architectural Form processing, explain the most important features of the syntax, and address some advanced architectural-form constructs for difficult situations. In addition to a General Index (Appendix B), the book provides in Appendix A a detailed means of locating the elements and attributes discussed in the industry DTDs: "Model DTDs: Index of Element Types and Attributes."
The companion CDROM for Structuring XML Documents provides several resources which enhance the value and usefulness of the book: 1) Free XML/SGML software and a live parsing demo; 2) HTML (live) links for the latest information on XML and SGML at the time of release; 3) Index of URLs mentioned in the book, organized by chapter; 4) Information on the five model DTDs used in the book, with links to local copies of four of them; 5) The standard ISO character entity sets for SGML.
The sub-series title of Structuring XML Documents -- "The Definitive XML Series from Charles F. Goldfarb" -- reflects the recent bifurcation of the primary Goldfarb series ("The Charles F. Goldfarb Series on Open Information Management"), at least for categorization, into "XML Titles" and "SGML Titles." In this schema, Megginson's book and The SGML Buyer's Guide are members of the former set, along with other XML titles "coming soon" from well-recognized authors: XML by Example, by Sean McGrath; The XML Handbook, by Charles Goldfarb and Paul Prescod; Designing XML Internet Applications, by Michael Leventhal, David Lewis and Matthew Fuchs; The XML and SGML Cookbook: Recipes for Structured Information, by Rick Jelliffe. The subseries description, as printed, reads: "As XML is a subset of SGML, the Series List is categorized to show the degree to which a title applies to XML. 'XML Titles' are those that discuss XML explicitly and may also cover full SGML. 'SGML Titles' do not mention XML per se, but the principles covered may apply to XML."
Foreword, by Charles F. Goldfarb
0. Introduction
0.1. XML and SGML
0.2. The Book's Structure
0.3. Notations and Conventions
0.3.1. Presentation of Examples
0.3.2. Typographical Conventions
Part 1: Background
Chapter 1. Review of DTD Syntax
1.1. Document type declaration
1.2. Elements
1.2.1. Element Type
1.2.2. Content Specification
1.2.2.1. Content Model
1.2.2.1.1. Mixed Content
1.2.2.1.2. Element Content
1.2.2.1.3. Content Particles
1.2.2.2. The ANY Keyword
1.2.2.3. The EMPTY Keyword
1.2.3. SGML: Elements
1.2.3.1. Multiple Element Types
1.2.3.2. Omitted Tag Minimization
1.2.3.3. Exceptions
1.2.3.4. Declared Content
1.2.3.5. Mixed Content
1.2.3.6. Unordered Content
1.3. Attributes
1.3.1. Attribute Type
1.3.1.1. String Type
1.3.1.2. Tokenized Types
1.3.1.3. Enumerated Types
1.3.1.3.1. NOTATION Attributes
1.3.2. Default Value
1.3.2.1. Literal Values
1.3.2.2. Keywords
1.3.3. Multiple Declarations
1.3.4. SGML: Attributes
1.3.4.1. Attribute Types
1.3.4.2. Attribute Default Values
1.3.4.3. Multiple Attribute Definition Lists
1.3.4.4. Global Attributes
1.4. Entities
1.4.1. Entity Location
1.4.2. Entity Definitions
1.4.3. Entity Boundaries
1.4.4. SGML: Entities
1.4.4.1. Default Entity
1.4.4.2. External Identifiers
1.4.4.3. Data Text
1.4.4.4. External Entity Types
1.5. Notations
1.5.1. Notation Declarations
1.5.2. SGML: Notations
1.5.2.1. Data Attributes
1.6. Conditional Sections
1.7. Processing Instructions
1.7.1. Why bother with Processing Instructions?
1.7.2. SGML: Processing Instructions
1.7.2.1. PI Entities
Chapter 2. Model DTDs
2.1. Reading about the Model DTDs
2.1.1. Sample Documents
2.2. A Note on Using Industry-Standard DTDs
2.3. The Five Model DTDs
2.3.1. ISO 12083
2.3.1.1. Background
2.3.1.2. Quick Tour
2.3.1.2.1. What's on Top?
2.3.1.2.2. What's in the Middle?
2.3.1.2.3. What's on the Bottom?
2.3.1.3. Sample Document
2.3.1.4. Availability
2.3.2. DocBook
2.3.2.1. Background
2.3.2.2. Quick Tour
2.3.2.2.1. What's on Top?
2.3.2.2.2. What's in the Middle?
2.3.2.2.3. What's on the Bottom?
2.3.2.3. Sample Document
2.3.2.4. Availability
2.3.3. Text-Encoding Initiative (TEI)
2.3.3.1. Background
2.3.3.1.1. Full TEI
2.3.3.2. Quick Tour
2.3.3.2.1. What's on Top?
2.3.3.2.2. What's in the Middle?
2.3.3.2.3. What's on the Bottom?
2.3.3.3. Sample Document
2.3.3.4. Availability
2.3.4. MIL-STD-38784 (CALS)
2.3.4.1. Background
2.3.4.2. Quick Tour
2.3.4.2.1. What's on Top?
2.3.4.2.2. What's in the Middle?
2.3.4.2.3. What's on the Bottom?
2.3.4.3. Sample Document
2.3.4.4. Availability
2.3.5. Hypertext Markup Language (HTML 4.0)
2.3.5.1. Background
2.3.5.2. Quick Tour
2.3.5.2.1. What's on Top?
2.3.5.2.2. What's in the Middle?
2.3.5.2.3. What's on the Bottom?
2.3.5.3. Sample Document
2.3.5.4. Availability
Part 2: Principles of DTD Analysis
Chapter 3. Ease of Learning
3.1. DTD Size
3.1.1. Logical Units
3.1.1.1. Examples from the Model DTDs
3.1.2. Learning Requirements
3.1.2.1. Examples from the Model DTDs
3.2. DTD Consistency
3.2.1. Naming
3.2.1.1. Examples from the Model DTDs
3.2.2. Parallel Design
3.2.2.1. Examples from the Model DTDs
3.2.3. Element-Type Classes
3.2.3.1. Examples from the Model DTDs
3.2.4. Global Attributes
3.2.4.1. Examples from the Model DTDs
3.3. DTD Intuitiveness
3.3.1. Naming
3.3.1.1. Examples from the Model DTDs
3.3.2. Structure
3.3.2.1. Examples from the Model DTDs
Chapter 4. Ease of Use
4.1. Physical Effort
4.1.1. Content Models
4.1.1.1. Examples from the Model DTDs
4.1.2. Attribute Definitions
4.1.2.1. Examples from the Model DTDs
4.2. Choice
4.2.1. Limiting Choices
4.2.1.1. Examples from the Model DTDs
4.3. Flexibility
4.3.1. Descriptive and Prescriptive DTDs
4.3.1.1. Examples from the Model DTDs
4.3.2. Inline Element Types
4.3.2.1. Examples from the Model DTDs
4.3.3. Role Attributes
4.3.3.1. Examples from the Model DTDs
4.3.4. Generic Element Types
4.3.4.1. Examples from the Model DTDs
Chapter 5. Ease of Processing
5.1. Predictability
5.1.1. Constraint
5.1.1.1. Examples from the Model DTDs
5.1.2. Recursion
5.1.2.1. Examples from the Model DTDs
5.1.3. Generic Element Types and Role Attributes
5.1.3.1. Examples from the Model DTDs
5.1.4. Authors' Modifications
5.1.4.1. Examples from the Model DTDs
5.1.5. SGML: Placement of Data and Subdocument Entities
5.1.5.1. Examples from the Model DTDs
5.2. Context
5.2.1. Containers
5.2.1.1. Examples from the Model DTDs
5.2.2. Implied Attribute Values
5.2.2.1. Examples from the Model DTDs
5.3. DTD Analysis: Final Considerations
Part 3: Advanced Issues in DTD Maintenance and Design
Chapter 6. DTD Compatibility
6.1. Structural Compatibility
6.1.1. Repetition
6.1.2. Omissibility
6.1.3. Alternation
6.1.4. Changes in Combination
6.1.4.1. Changes to the Same Content Token
6.1.4.2. New Element Types
6.1.5. ANY and EMPTY
6.1.6. Attribute Compatibility
6.1.6.1. Repetition
6.1.6.2. Omissibility
6.1.6.2.1. Changes to Default Value
6.1.6.3. Alternation
6.1.6.4. Typing
6.1.7. SGML: Structural Compatibility
6.1.7.1. Ordering
6.1.7.1.1. Ordering of Data
6.1.7.2. Repetition of Data
6.1.7.3. CDATA and RCDATA declared content
6.1.7.4. Inclusion and Exclusion Exceptions
6.1.7.5. Additional SGML Attribute Types
6.2. Lexical Compatibility
6.2.1. Entities
6.2.2. Whitespace
6.2.3. SGML: Lexical Compatibility
6.2.3.1. Markup Minimisation
6.2.3.1.1. Start-Tag Omission
6.2.3.1.2. End-Tag Omission
6.2.3.2. Record Ends
Chapter 7. Exchanging Document Fragments
7.1. Editing Fragments as Stand-Alone Documents
7.1.1. Ancestors and Siblings
7.1.2. Cross-References
7.1.2.1. Changing IDREFs
7.1.2.2. Creating Placeholders
7.1.3. Entities
7.1.4. Summary
7.1.5. SGML: Stand-Alone Fragments
7.1.5.1. #CURRENT Attributes
7.1.5.2. Inclusion and Exclusion Exceptions
7.1.5.2.1. Inclusion Exceptions
7.1.5.2.2. Exclusion Exceptions
7.2. Reparenting in a Dummy Document
7.2.1. Ancestors and Siblings
7.2.2. Cross-References
7.2.3. Entities
7.2.4. Summary
7.2.5. SGML: Reparenting
7.2.5.1. Inclusion and Exclusion Exceptions
7.3. Using Subdocuments
7.3.1. Ancestors and Siblings
7.3.2. Cross-References
7.3.2.1. Simple External Reference: HyTime Scheme
7.3.2.1.1. HyTime Value Reference
7.3.2.2. Simple External Reference: XLL Scheme
7.3.3. Entities
7.3.4. Summary
7.3.5. SGML: Subdocuments
7.3.5.1. SUBDOC Entities
7.3.5.2. Inclusion and Exclusion Exceptions
Chapter 8. DTD Customisation
8.1. Types of Customisation
8.1.1. Simplifying a DTD for Authoring
8.1.1.1. Eliminating Unnecessary Choice
8.1.1.2. Avoiding Markup Errors
8.1.2. Adding Element Types to a DTD
8.1.3. Restructuring a DTD's Components
8.2. Extension Mechanisms in the Model DTDs
8.2.1. Customising the DocBook DTD
8.2.2. Customising the TEI DTDs
8.2.2.1. Base and Auxiliary Tagsets
8.2.3. Customising the HTML DTD
8.2.4. Customising the MIL-STD-38784 DTD
8.2.5. Customising the ISO 12083 DTDs
Part 4: DTD Design with Architectural Forms
Chapter 9. Architectural-Forms Concepts
9.1. Meta-DTDs
9.2. Documents
9.2.1. Types of Architectural Forms
9.2.2. The Architectural Document
9.3. Practical Uses of Architectural Forms
9.3.1. DTD Extension
9.3.2. Software Reusability
9.3.2.1. A Common Book Architecture?
9.3.3. Multi-Use Documents
9.3.4. Extended Validation
9.4. Summary of Terminology
Chapter 10. Basic Architectural-Forms Syntax
10.1. Setup and Configuration
10.1.1. Architecture Use Declaration Attributes
10.1.2. SGML: Original Syntax
10.1.2.1. Architecture Base Declaration
10.1.2.2. Architecture Notation Declaration
10.1.2.3. Architecture Entity Declaration
10.1.2.4. Architecture Support Attributes
10.2. Basic Forms
10.2.1. Deriving Elements
10.2.1.1. Element Form Strategies
10.2.2. Deriving Attributes
10.2.3. Deriving Notations
10.2.4. SGML: Basic Forms
10.2.4.1. Notation Forms
Chapter 11. Advanced Architectural-Forms Syntax
11.1. Automatic Derivation
11.1.1. SGML: Automatic Derivation
11.2. Suppressing Architectural Processing
11.2.1. Suppressing Elements
11.2.2. Suppressing Data
11.2.3. SGML: Suppressing Architectural Processing
11.3. Architectural Attribute Values
11.3.1. Attribute Defaulting
11.3.2. Tokens
11.3.3. Deriving Content from Attribute Values
11.3.4. Deriving Attribute Values from Content
11.3.5. SGML: Architectural Attributes
11.4. Default Architectural Information
11.4.1. Creating a Default Notation
11.4.2. Resolving IDREFs
11.4.3. SGML: Default Architectural Information
11.5. Meta-DTDs
11.5.1. Meta-DTD Configuration
11.5.1.1. SGML: Meta-DTD Configuration
11.5.2. SGML: Meta-DTDs
11.5.2.1. Meta-DTD Quantities
11.5.2.2. General NAMECASE Substitution
Back Matter
Appendix A. Model DTDs: Index of Element Types and Attributes
Appendix B. General Index
[Prepared by Robin Cover as part of the SGML/XML Web Page.]