PRACTICAL SGML, 2nd edition: Foreword, Preface, and Table of Contents

The following file with Foreword, Preface, and Table of Contents is available via FTP from world.std.com



PRACTICAL SGML, 2nd edition

by Eric van Herwijnen



FOREWORD

In the three years since the initial publication of
Practical SGML by Eric van Herwijnen, the computer industry has seen
a dramatic increase in the use and acceptance of SGML and many of the concepts
derived from it. Much of this growth can be attributed to the work of Eric
and a small army of other experts and professionals who have educated end-users,
application programmers, managers, and others on SGML and its inherent benefits
for those who work with it.

This latest version of Practical SGML is another
step in this process, with greater emphasis and focus on helping the novice
work his way through the vast amounts of information required to become proficient in SGML:

* the tools currently on the market that enable the easy creation of
SGML data and the use and distribution of that data in a variety of forms;


* the minimum amount of information needed by people who wish to understand
and use ISO 8879;

* aids and information on how to stay current with the volumes of material
written on SGML in publications throughout the world;

* practical examples of the many SGML constructs and guidelines on
their appropriate uses;

* other helpful hints and insights based on years of working with the
standard and integrating it into a complex and challenging computer environment.


This book is both practical and vital for anyone who needs an introduction
to the many facets of SGML and how it fits into an organization, whether in
government, corporate enterprises, or industry groups. Organizations
throughout the world are recognizing the need for international standards and
open systems as they build computer systems and networks employing
applications, hardware, network protocols, and operating systems from a
multitude of computer software and hardware manufacturers. In addition, the
requirement to develop, access, and reuse corporate information as a key
corporate asset has become a predominant motivating factor in the industry.

SGML has played a central role in this development in the past several years
and will continue to play an even more central role in the years to come. The
emergence of on-line information and information by-products (including
multi-media applications) requires the diversity and exchange of content
identification that SGML enables.

With the second edition of Practical SGML, Eric will be training the new
generation of SGML experts who are needed to help their organizations improve
their productivity and competitiveness. These days no one can ignore the SGML
standard, for who knows what your competitors are doing!

Sharon Adler and Anders Berglund
Boulder, Colorado
October, 1993

PREFACE TO THE SECOND EDITION

During the past 30-40 years we have seen an enormous growth in all areas
of computer applications. Initially, computers were mainly used by scientists
to do numerically intensive calculations ("number crunching"). Now they have
found their way into homes and offices. Companies equip all staff, managers,
and secretaries with powerful personal computers instead of typewriters.

Computers are applied more and more in areas of human communications,
particularly those concerned with text processing. This is a natural
evolution, encouraged by the availability of cheap and user-friendly
microcomputers. Despite the obvious benefits, there are some frustrating
problems associated with the use of text processing systems. Partly for
competitive reasons, partly for functional reasons, the formats used by
computer manufacturers are often incompatible.

Data which are processed by one system cannot be used on another. Storing
text in a machine-readable form raises expectations that cannot always be
met. It is hard to explain to an author that text, which exists on a computer,
needs to be retyped in a different format. Another problem with electronically
stored information is that it is difficult to understand and retrieve.

In areas of professional computer use - for example, programming
languages - the emphasis on portability through standardization has
existed for a relatively long time. Only quite recently, in October 1986,
the International Organization for Standardization (ISO)
issued a standard for document representation: SGML,
the Standard Generalized Markup Language (International Standard 8879), which
immediately attracted much attention. This ISO standard explains how documents
may be split into a part containing the text and a part describing its
structure, without reference to a particular word- or text-processing
system. SGML-conforming documents can be interchanged and processed on many
different systems in many different ways. Programs can analyze SGML texts
because their structure is clearly indicated. Hence computers can manage large
amounts of complex data and provide easier access to these data.
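
As a rough illustration of how a program can exploit clearly indicated
structure, the sketch below walks a small, fully tagged memo and reports
where each piece of text sits in the element tree. Everything in it is an
assumption made for illustration: the memo and its element names are
invented, and Python's standard html.parser module is used only as a
convenient tag tokenizer. It is not an SGML parser; it reads no DTD and
understands neither the SGML declaration nor markup minimization.

```python
from html.parser import HTMLParser

# Hypothetical memo; the element names are invented for this example.
MEMO = """<memo>
<to>All staff</to>
<from>Documentation group</from>
<body><p>Files are now kept in SGML.</p><p>Retyping is no longer needed.</p></body>
</memo>"""


class OutlineBuilder(HTMLParser):
    """Record the element path and text of every non-empty text run."""

    def __init__(self):
        super().__init__()
        self.stack = []   # names of the elements that are currently open
        self.found = []   # (path, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Only fully tagged input is expected, so a simple pop is enough.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.found.append(("/".join(self.stack), text))


def outline(document):
    """Return (path, text) pairs for a fully tagged document string."""
    parser = OutlineBuilder()
    parser.feed(document)
    parser.close()
    return parser.found


for path, text in outline(MEMO):
    print(f"{path}: {text}")
```

Because each text run carries its path, a program could, for example, select
only the recipients or only the body paragraphs without knowing anything
about how the memo will eventually be formatted.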

Traditionally, the only dimension of text is the paper it is printed on.
Perhaps the most important property of SGML is its ability to add a new
dimension to information, since the latter becomes independent of the medium.
This permits new kinds of processing. Storage is no longer restricted to
paper, but could be in different forms such as in a database or on optical
media. Retrieval facilities may be used that cannot be applied to
unstructured text. If there ever are "paperless offices," SGML will play an
important role in them.

During the lifetime of the first edition of this book, SGML has become
widely accepted and is becoming more widely used. Since it is becoming so
important, I felt it was worthwhile to do a major revision. A complete re-write
should enable Practical SGML to withstand the test of
time and make it the definitive introductory book about SGML.

Two major points of criticism of the first edition were that the book was
not yet simple enough to be given to complete novices, and that for detailed
points it was not precise enough. I have therefore tried to present the minimum
information about SGML as directly as possible. The book was also criticized
as being a "book for programmers," and although it has been simplified
in many places, I should point out that document analysis and writing DTDs
are closely akin to programming.

There are four parts to this book. Part I, Getting Started with SGML, explains
what SGML is, how to use it, and what kind of software is needed. It is written
for beginners and does not touch on any programming aspects of SGML.

Part II, Writing DTDs, covers document analysis, DTD design, and markup
declarations, and gives an overview of available DTDs and tips for writing
DTDs. I have introduced structure diagrams as an intermediate step which
should make writing DTDs easier for non-programmers. Parts I and II contain
the minimum information that is required for using standard SGML.

Part III, Customizing SGML, explains advanced concepts such as the SGML
declaration, minimization, notation, short references, marked sections and
ambiguities. It is intended for anyone who is interested in the more subtle
features of SGML, or who needs to customize SGML because its default functionality is not adequate. 

Part IV, Special Applications, contains some examples of the application
of SGML to EDI (Electronic Data Interchange), mathematics, and graphics.
SGML is part of a suite of ISO standards called "Information Processing -
Text and Office Systems." This suite includes related standards such as the
Hypermedia/Time-based Structuring Language (HyTime), the Document Style
Semantics and Specification Language (DSSSL), the SGML Document Interchange
Format (SDIF), the Standard Page Description Language (SPDL), and the Fonts
standard. The final chapter contains an introduction to these standards.

Exercises throughout the text allow you to test your understanding. The
answers are given in Appendix A. In Appendix B I explain how to interpret
the output of the public domain sgmls parser.
 
I do not address the LINK, CONCUR, and SUBDOCument features. The first
edition contained descriptions of a number of SGML products, which I have
removed to keep the book from dating too quickly. Wherever appropriate,
I have included the output of the public domain sgmls parser. I chose this
parser to remain independent of any commercial bias. It should not be seen as
a value judgment on this or other parsers.

TABLE OF CONTENTS

Foreword
Preface to the second edition
Acknowledgments
Conventions and definitions

PART I. GETTING STARTED

1. INTRODUCTION 
1.1 The problem with today's word processors
1.2 The solution: SGML
1.3 When should you use SGML?
1.4 Some myths about SGML
1.5 CALS
1.6 Exercises
1.7 Bibliography for Chapter 1

2. A BRIEF HISTORY OF SGML
2.1 Traditional markup
2.2 Electronic markup
2.3 Specific markup
2.4 Generic markup
2.5 Exercises
2.6 Bibliography for Chapter 2

3. COMPONENTS OF AN SGML SYSTEM
3.1 The three parts of an SGML document
3.2 The parts of an SGML installation
3.3 Bibliography for Chapter 3

4. DOCUMENT TYPE COMPONENTS
4.1 Exercises
4.2 The document type definition
4.3 Exercises
4.4 Markup defined in the DTD: elements
4.5 Exercise
4.6 Markup defined in the DTD: attributes
4.7 Exercise
4.8 Markup defined in the DTD: entities
4.9 How to refer to a DTD?
4.10 Processing instructions
4.11 Bibliography for Chapter 4

5. CREATING SGML DOCUMENTS
5.1 Why use an SGML editor?
5.2 SGML editor checklist
5.3 Bibliography for Chapter 5

6. HOW TO KEEP UP TO DATE WITH SGML
6.1 The SGML User's Group
6.2 The GCA
6.3 Books, magazines
6.4 The network
6.5 Bibliography for Chapter 6

PART II. WRITING A DTD

7. DOCUMENT ANALYSIS
7.1 The area of applicability of the DTD
7.2 A strategy for the DTD
7.3 The name
7.4 The logical elements in the document class
7.5 Elements or attributes?
7.6 The tree structure
7.7 Exercise
7.8 Bibliography for Chapter 7

8. STRUCTURE DIAGRAMS
8.1 Seven types of structure diagrams
8.2 Example of structure diagrams
8.3 Exercise

9. MARKUP DECLARATIONS
9.1 Names
9.2 Name tokens
9.3 Numbers
9.4 Number tokens
9.5 Groups
9.6 Model groups, occurrence indicators, and connectors
9.7 Connectors
9.8 Occurrence indicators
9.9 Model and name groups
9.10 Name token groups
9.11 Exercises

10. ELEMENT DECLARATIONS
10.1 Content models
10.2 Included elements
10.3 Excluded elements
10.4 ANY content
10.5 Declared content
10.6 Other markup
10.7 A mixture
10.8 Exercises

11. ATTRIBUTE DECLARATIONS
11.1 Unique identifiers and cross-referencing
11.2 Exercises

12. ENTITY DECLARATIONS
12.1 Parameter literals
12.2 Default entity
12.3 External entities
12.4 Data text
12.5 Bracketed text
12.6 Parameter entities
12.7 Bibliography for Chapter 12

13. PUTTING THE DTD TOGETHER
13.1 Public text owner identifiers
13.2 The DOCTYPE declaration
13.3 The DOCTYPE declaration subset
13.4 Comments

14. SOME ADVICE ON DTDS
14.1 Choosing a DTD
14.2 Tips for writing DTDs
14.3 Pitfalls
14.4 Exercises
14.5 Bibliography for Chapter 14

PART III. CUSTOMIZING SGML

15. THE SGML DECLARATION
15.1 The document character set
15.2 Capacity
15.3 Exercises
15.4 Scope
15.5 Syntax
15.6 Application specific information
15.7 The system declaration
15.8 Bibliography for Chapter 15

16. SGML FEATURES
16.1 Minimization
16.2 Exercise
16.3 Formal

17. NOTATION
17.1 Data content notation
17.2 Using NOTATION to describe mathematics
17.3 Using NOTATION to describe graphics
17.4 Exercise
17.5 Data attributes

18. MARKED SECTIONS
18.1 Marking sections as IGNORE and INCLUDE
18.2 Marking sections as CDATA or RCDATA
18.3 General advice on using marked sections

19. SHORT REFERENCES
19.1 Use of short references
19.2 Definition of short references
19.3 Example of the use of short references
19.4 Limitations of short references
19.5 Exercise

20. RECORD BOUNDARIES AND AMBIGUITIES
20.1 Treatment of record boundaries
20.2 Ambiguity type 1
20.3 Content models with OR and * connectors
20.4 Mixed content models
20.5 Bibliography for Chapter 20

PART IV. SPECIAL APPLICATIONS

21. SGML AND EDI
21.1 What is EDI?
21.2 EDIFACT
21.3 The standard commercial invoice
21.4 Bibliography for Chapter 21

22. SGML AND MATHEMATICS
22.1 Why describe mathematics with SGML?
22.2 Characteristics of mathematical notation
22.3 Who performs the markup of math?
22.4 Feasibility of S-type notation
22.5 Some problems with existing mathematics DTDs
22.6 Re-using mathematical formulas
22.7 The harmonized math effort
22.8 Conclusions
22.9 Bibliography for Chapter 22

23. GRAPHICS AND SGML
23.1 Bibliography for Chapter 23

24. OTHER ISO TEXT PROCESSING STANDARDS
24.1 SDIF
24.2 DSSSL
24.3 SPDL
24.4 FONTS
24.5 HYTIME
24.6 Conclusions
24.7 Bibliography for Chapter 24

A. SOLUTIONS TO THE EXERCISES

B. THE SGMLS PARSER
B.1 Bibliography for Appendix B

C. THE ISO 646:1983 CHARACTER SET

D. HOW TO READ ISO 8879:1986

GLOSSARY

INDEX







-------------------------------------ORDER FORM------------------------------



Ref:  ftpser

Please send me: 
Practical SGML, Second Edition, by Eric van Herwijnen
_____copy(ies) HB, ISBN: 0-7923-9434-8 $58.00, Dfl 125.00, GBP 42.95

  Payment enclosed to the amount of ___________________________

* Please invoice me 

* Please charge my credit card 

  Name of Card Holder: ______________________________________  

  Card. no.: ________________________________________________

  Expiry Date:______________________________________________

     Am. Ex.*          Visa*           Diners Club*           Mastercard*

Delivery address: 

Name: ___________________________________________________________________

Address: ________________________________________________________________

         ________________________________________________________________

         ________________________________________________________________

         ________________________________________________________________


Date:________________     Signature:_______________________________

To be sent to:


Outside North America                         In USA and Canada

KLUWER ACADEMIC PUBLISHERS GROUP              KLUWER ACADEMIC PUBLISHERS 
Order Dept.                                   Order Dept, Attn: Eric Maki
P.O. Box 322                                  101 Philip Drive
3300 AH Dordrecht, The Netherlands            Norwell, MA   02061 
Tel: +31-78-524400                            Tel: 617-871-6600
Fax: +31-78-524474                            Fax: 617-871-6528
email:  vanderlinden@wkap.nl                  email: prepub@world.std.com

Orders from individuals accompanied by payment or authorization to
charge a credit card account will ensure prompt delivery. Postage and
handling charges will be absorbed by the Publisher on all such orders.
Payment will be accepted in any convertible currency. Please check the
rate of exchange at your bank. For sales within the Netherlands please
add 6% VAT (BTW). Prices are subject to change without notice.

* Delete those that do not apply.