The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: July 15, 1998
The SGML FAQ Book: Table of Contents


by Steven J. DeRose

Detailed Table of Contents

Table of Contents





Organization and Conventions

Typographic and other conventions

1. For Authors and Document Editors Using SGML Tools

1.1. What characters or sequences of characters can't I just type in content?
Details on when you need to avoid delimiter strings
Basic contexts where "<" needs to have a substitute
Other contexts for "<"
Other characters that must be substituted for
When to use substitutes
1.2. When must I quote attributes?
1.3. What entities can I use in attributes?
1.4. Can I use "/" in attributes?
1.5. Why did adding a "/" in content change my document structure?
1.6. My SGML comments somehow turn into errors. What's happening?
1.7. Why can't I type "--" inside a comment?
1.8. When can I use whitespace?
Whitespace inside markup
Whitespace inside quoted attributes
Whitespace in content
1.9. Can I prevent entity recognition in attributes?
1.10. Why aren't my cross-references working since I started using special characters?
1.11. Why don't I always put "&" in front of an entity name in an attribute?
1.12. What kind of entities can I mention on ENTITY attributes?
Interaction of ENTITY and NOTATION attributes
1.13. Why don't my special character ("SDATA") entities work?
1.14. I moved a section of my document and a lot of attributes changed values. Does this have something to do with "#CURRENT"?
1.15. I changed an attribute on one type of element and affected another one. What's happening?
1.16. Can I name a file on an attribute to link to it?
1.17. Can I include ">" in a processing instruction?
1.18. Why don't my processing instructions work in all SGML programs?

2. For Authors and Document Editors Who Commonly Deal With Raw SGML

2.1. Is "<>" really a tag?
2.2. When can I omit a start-tag?
Simple cases
Declared content elements
Empty (but not EMPTY) elements
Contextually required and optional elements
One last case: inferring containers
2.3. When may an end-tag be omitted?
2.4. What do "CDATA", "NDATA", "SDATA", and "#PCDATA" mean in SGML?
2.5. When are line breaks ignored and what does "ignored" mean?
RS and RE ignoring
RE ignoring
Proper and Improper Subelements
Summary for authors
Summary for technophiles
2.6. Why are my entities interpreted differently depending on where they are referenced?
2.7. What happens if I refer directly to an NDATA entity from within content?
2.8. What can I put inside elements with declared content CDATA or RCDATA?
2.9. What does a marked section end ("]]>") mean if it's outside of any marked section?
2.10. Can I put marked sections inside other marked sections?
2.11. Can marked sections cross elements?
2.12. Can entities cross or be crossed by element, tag, and other boundaries?
2.13. How do SUBDOC entities relate to documents that refer to them?
Setting up SUBDOC entities
SUBDOC and SGML fragments

3. For Data Conversion Specialists

3.1. How long does a USEMAP last?
USEMAP in the document instance
3.2. Does a NOTATION attribute let me use delimiters in content?
3.3. Can I have an end-tag for an EMPTY element?
Avoiding EMPTY problems
3.4. Why do I lose one more blank line each time I parse my document?
3.5. Why do some Japanese characters lead to parsing errors?
3.6. Is "&#13;" the same as "&#RE;"?
3.7. How many ways can I suppress markup recognition?
What must you escape?
3.8. Can I use global changes on my SGML?

4. For Authors And Editors Using External Data Or Modifying DTDs

4.1. Why can't I omit a tag? I looked in the DTD and the minimization flag is "O".
4.2. Can I have separate sets of ID attribute values for different element types?
Note on SGML's name spaces
4.3. Can I create an ID attribute that need only be unique within its chapter or other context?
4.4. Can I restrict an IDREF attribute to refer only to a certain element type(s)?
4.5. Can I have more than one ID attribute on the same element?
4.6. What if I need non-name characters in my ID attributes?
4.7. How can I organize large, complicated sets of ID values?
4.8. Can I add ID or other attributes to element types that are already in the DTD?
4.9. How do PUBLIC and SYSTEM identifiers for entities relate?
4.10. How can an SGML system find a file if all it has is a Formal Public Identifier?
4.11. What characters can I use in a Formal Public Identifier (FPI)?
Tables of character safety
Syntax of FPIs
4.12. How long can a Formal Public Identifier be?
4.13. What does the "owner identifier" part of a Formal Public Identifier actually identify?
4.14. What is the "language" for an FPI that points to data that isn't in a normal language?
4.15. What does the "version" part of an FPI mean?
4.16. How do I specify the FPI for a document in the document itself?

5. For Builders of SGML DTDs

On Elements

5.1. Can I set up a DTD so as to infer a container from only its title's start-tag?
Technophile's note on how tags are inferred
5.2. When was it I could use mixed content?
5.3. Which content models need parentheses?
5.4. When should I create an element type to group other elements?
5.5. When should I use inclusion exceptions?
5.6. How can I tell where spaces make a difference in elements?
5.7. I forgot to declare an element type, but named it in a content model. Why no error?
5.8. What's the difference between a title and a container?
5.9. Can I make an element that crosses element boundaries (in the rare case when I need to)?

On Attributes

5.10. When do I use an attribute versus an element?
Related HyTime issues
5.11. What do people mean by "element structure" and "document trees"?
5.12. I turned off minimization. Why are all my attributes wrong?
Impliable attributes
Unquoted attributes
5.13. Why can't I have duplicate enumerated ("name token group") attribute values?
5.14. Which attribute declared values mean something special, beyond restricting the character form of their values?
5.15. Do I need attribute default value #FIXED?
5.16. What about attribute default value #CURRENT?
5.17. What does default value #CONREF mean?
5.18. Why did setting a #CURRENT attribute on one kind of element affect an attribute on a different kind of element far later?
5.19. How can I let SGML know that parts of my content are in certain (human) languages?

On Entities

5.20. Why don't my special character entities work?
Numeric characters
Standardizing SDATA entities
5.21. What does the external identifier for a NOTATION identify?
5.22. How do the various kinds of entities relate, such as internal/external, NDATA, etc?
5.23. The parser says my parameter entity names are too long, but they're no longer than my other names. What's wrong?

6. For Builders of SGML DTDs Who Must Constrain Data In Special Ways

6.1. Can I create inclusion exceptions that don't affect the content model context when used?
6.2. How can I format run-in titles appropriately?
6.3. How can my software know which SGML element types cause word boundaries for formatting and indexing?
6.4. How can elements represent conditional text? Or do I need marked sections?
Using elements for conditional content
6.5. Can I choose between marked sections based on combinations of values?
Conditional expressions
Mutually exclusive marked sections
6.6. How can I require the author to have at least some content in an element?
6.7. How can I include or exclude #PCDATA?
6.8. How can I get attribute values to be inherited by subelements?
6.9. Can I require an attribute to be an integer, real number, date, truth value, or similar type?
6.10. Are there attribute declared values for constructs like FPI and inter-document link?
6.11. How do I link to an IDREF across books?
The two-attribute method
The complex SYSTEM/PUBLIC identifier method
The Text Encoding Initiative extended pointer method
The HyTime method
6.12. What does it mean if I refer to an entity but omit the semicolon?

7. For Builders of SGML DTDs and SGML Declarations

7.1. Why can't I change some delimiters without affecting others?
7.2. Why can't I set STAGO = TAGC?
7.3. Why don't LIT, LITA, NET, and COM distinguish Open/Close like other delimiters?
7.4. Why can't I change some close delimiters when I can change related open delimiters?
7.5. What do the delimiters between GIs mean when I declare several elements at once?
7.6. Why do some delimiters not have names?
7.7. Why can I have only two forms of quote delimiters, and no alternatives for others?
7.8. Can name characters occur in delimiters?
7.9. What are "separators" in SGML?
7.10. Why can't SHORTREF delimiters contain the letter 'B'?
7.11. How can I tell whether my delimiter change creates a conflict?
7.12. What if I need to add a lot of name characters (like Kanji characters)?
7.13. Why must all my names (and all SGML's built-in ones) be limited to 8 characters?
7.14. Why does extending my DTD by changing inclusions or content models cause line breaks to appear or disappear?
7.15. What is RANK?
7.16. Can I manipulate parts of elements, or only entire elements using the LINK features?

8. XML: A Simple, Standard Subset

8.1. Related work
Earlier efforts
SGML: It's 3, 3, 3 languages in one
The SGML Review
The W3C XML effort
8.2. Excerpts from the XML Working Draft.
Design goals for XML
2.2 Well-Formed XML Documents
2.8 White Space Handling
2.9 Prolog and Document Type Declaration
4.1 Logical and Physical Structures
4.3 Entity Declarations
8.3. XML Grammar Productions
8.4. Trivial text grammar for XML

Appendix A: Introduction to SGML

What is SGML, really?
Formatting and structure
Parts of an SGML document
Parts of an SGML document instance
The GI as major class identifier
Element declarations
Attribute declarations
Attribute types
Attribute defaults
Entities and notations
Parameter entities
External entities
Entities with special parsing constraints
The SGML declaration
A sample SGML document
The document type declaration
The document instance
A non-SGML equivalent
Optional SGML features
Minimization features
Other SGML features
Formal public identifiers

Appendix B: SGML Delimiters

Delimiter names, strings, and contexts
Delimiters recognized in each recognition mode

Appendix C: SGML Productions (Grammar Rules)

Technophile's note on grammar rules




DeRose, Steven J. The SGML FAQ Book: Understanding the Foundation of HTML and XML. Electronic Publishing Series, Number 7. Dordrecht/Boston/London: Kluwer Academic Publishers, 1997. Extent: xxiv + 250 pages, appendices. ISBN: 0-7923-9943-9. See the bibliographic entry for other details.
