The following file with Foreword, Preface, and Table of Contents is available via FTP from world.std.com
PRACTICAL SGML, 2nd edition
by Eric van Herwijnen

FOREWORD

In the three years since the initial publication of Practical SGML by Eric van Herwijnen, the computer industry has seen a dramatic increase in the use and acceptance of SGML and many of the concepts derived from it. Much of this growth can be attributed to the work of Eric and a small army of other experts and professionals who have educated end-users, application programmers, managers, and others on SGML and its inherent benefits for those who work with it.

This latest version of Practical SGML is another step in this process, with greater emphasis and focus on helping the novice work his way through the vast amounts of information required to become proficient in SGML:

* the tools currently on the market that enable the easy creation of SGML data and the use and distribution of that data in a variety of forms;
* the minimum amount of information needed by people who wish to understand and use ISO 8879;
* aids and information on how to stay current with the volumes of material written on SGML in publications throughout the world;
* practical examples of the many SGML constructs and guidelines on their appropriate uses;
* other helpful hints and insights based on years of working with the standard and integrating it into a complex and challenging computer environment.

This book is both practical and vital for anyone who needs an introduction to the many facets of SGML and how it fits into an organization, whether in government, corporate enterprises, or industry groups.

Organizations throughout the world are recognizing the need for international standards and open systems as they build computer systems and networks employing applications, hardware, network protocols, and operating systems from a multitude of computer software and hardware manufacturers.
In addition, the requirement to develop, access, and reuse corporate information as a key corporate asset has become a predominant motivating factor in the industry. SGML has played a central role in this development in the past several years and will continue to play a more central role in the years to come. The emergence of on-line information and information by-products (including multi-media applications) requires the diversity and exchange of content identification that SGML enables.

With the second edition of Practical SGML Eric will be training the new generation of SGML experts who are needed to help their organizations improve their productivity and competitiveness. These days no one can ignore the SGML standard, for who knows what your competitors are doing!

Sharon Adler and Anders Berglund
Boulder, Colorado
October, 1993

PREFACE TO THE SECOND EDITION

During the past 30-40 years we have seen an enormous growth in all areas of computer applications. Initially, computers were mainly used by scientists to do numerically intensive calculations ("number crunching"). Now they have found their way into homes and offices. Companies equip all staff, managers, and secretaries with powerful personal computers instead of typewriters. Computers are applied more and more in areas of human communications, particularly those concerned with text processing. This is a natural evolution, encouraged by the availability of cheap and user-friendly microcomputers.

Despite the obvious benefits there are some frustrating problems associated with the use of text processing systems. Partly for competitive reasons, partly for functional reasons, the formats used by computer manufacturers are often incompatible. Data which are processed by one system cannot be used on another. Storing text in a machine-readable form raises expectations that cannot always be met. It is hard to explain to an author that text, which exists on a computer, needs to be retyped in a different format.
Another problem with electronically stored information is that it is difficult to understand and retrieve.

In areas of professional computer use - for example, programming languages - the emphasis on portability through standardization has existed for a relatively long time. Only quite recently, in October 1986, the International Organization for Standardization (ISO) issued a standard for document representation: SGML, the Standard Generalized Markup Language (International Standard 8879) which immediately attracted much attention.

This ISO standard explains how documents may be split into a part containing the text and a part describing its structure without reference to a particular word- or text-processing system. SGML conforming documents can be interchanged and processed on many different systems in many different ways. Programs can analyze SGML texts because their structure is clearly indicated. Hence computers can manage large amounts of complex data and provide easier access to these data.

Traditionally, the only dimension of text is the paper it is printed on. Perhaps the most important property of SGML is its ability to add a new dimension to information, since the latter becomes independent of the medium. This permits new kinds of processing. Storage is no longer restricted to paper, but could be in different forms such as in a database or on optical media. Retrieval facilities may be used that cannot be applied to unstructured text. If there ever will be "paperless offices," SGML will play an important role in them.

During the lifetime of the first edition of this book, SGML has become widely accepted and is becoming more widely used. Since it is becoming so important, I felt it was worthwhile to do a major revision. A complete re-write should enable Practical SGML to withstand the test of time and make it the definitive introductory book about SGML.
Two major points of criticism of the first edition were that the book was not yet simple enough to be given to complete novices, and that for detailed points it was not precise enough. I have therefore tried to present the minimum information about SGML as directly as possible. The book was also criticized as being a "book for programmers," and although it has been simplified in many places, I should point out that document analysis and writing DTDs are closely akin to programming.

There are four parts to this book.

Part I, Getting Started with SGML, explains what SGML is, how to use it, and what kind of software is needed. It is written for beginners and does not touch on any programming aspects of SGML.

Part II, Writing DTDs, explains document analysis, DTD design, and markup declarations, and gives an overview of available DTDs and tips for writing DTDs. I have introduced structure diagrams as an intermediate step which should make writing DTDs easier for non-programmers. Parts I and II contain the minimum information that is required for using standard SGML.

Part III, Customizing SGML, explains advanced concepts such as the SGML declaration, minimization, notation, short references, marked sections and ambiguities. It is intended for anyone who is interested in the more subtle features of SGML, or who needs to customize SGML because its default functionality is not adequate.

Part IV, Special Applications, contains some examples of the application of SGML to EDI (Electronic Data Interchange), mathematics, and graphics. SGML is part of a suite of ISO standards called "Information Processing - Text and Office Systems." This suite includes related standards such as the Hypermedia/Time-based Structuring Language (HyTime), the Document Style Semantics and Specification Language (DSSSL), the SGML Document Interchange Format (SDIF), the Standard Page Description Language (SPDL), and the Fonts standard. The final chapter contains an introduction to these standards.
Exercises throughout the text allow you to test your understanding. The answers are given in Appendix A. In Appendix B I explain how to interpret the output of the public domain sgmls parser. I do not address the LINK, CONCUR, and SUBDOCument features.

The first edition contained descriptions of a number of SGML products, which I have removed to keep the book from dating too quickly. Wherever appropriate, I included the output of the public domain sgmls parser. I chose this parser to remain independent of any commercial bias. It should not be seen as a value judgment on behalf of this or other parsers.

TABLE OF CONTENTS

Foreword
Preface to the second edition
Acknowledgments
Conventions and definitions

PART I. GETTING STARTED

1. INTRODUCTION
   1.1 The problem with today's word processors
   1.2 The solution: SGML
   1.3 When should you use SGML?
   1.4 Some myths about SGML
   1.5 CALS
   1.6 Exercises
   1.7 Bibliography for Chapter 1

2. A BRIEF HISTORY OF SGML
   2.1 Traditional markup
   2.2 Electronic markup
   2.3 Specific markup
   2.4 Generic markup
   2.5 Exercises
   2.6 Bibliography for Chapter 2

3. COMPONENTS OF AN SGML SYSTEM
   3.1 The three parts of an SGML document
   3.2 The parts of an SGML installation
   3.3 Bibliography for Chapter 3

4. DOCUMENT TYPE COMPONENTS
   4.1 Exercises
   4.2 The document type definition
   4.3 Exercises
   4.4 Markup defined in the DTD: elements
   4.5 Exercise
   4.6 Markup defined in the DTD: attributes
   4.7 Exercise
   4.8 Markup defined in the DTD: entities
   4.9 How to refer to a DTD?
   4.10 Processing instructions
   4.11 Bibliography for Chapter 4

5. CREATING SGML DOCUMENTS
   5.1 Why use an SGML editor?
   5.2 SGML editor checklist
   5.3 Bibliography for Chapter 5

6. HOW TO KEEP UP TO DATE WITH SGML
   6.1 The SGML User's Group
   6.2 The GCA
   6.3 Books, magazines
   6.4 The network
   6.5 Bibliography for Chapter 6

PART II. WRITING A DTD

7. DOCUMENT ANALYSIS
   7.1 The area of applicability of the DTD
   7.2 A strategy for the DTD
   7.3 The name
   7.4 The logical elements in the document class
   7.5 Elements or attributes?
   7.6 The tree structure
   7.7 Exercise
   7.8 Bibliography for Chapter 7

8. STRUCTURE DIAGRAMS
   8.1 Seven types of structure diagrams
   8.2 Example of structure diagrams
   8.3 Exercise

9. MARKUP DECLARATIONS
   9.1 Names
   9.2 Name tokens
   9.3 Numbers
   9.4 Number tokens
   9.5 Groups
   9.6 Model groups, occurrence indicators, and connectors
   9.7 Connectors
   9.8 Occurrence indicators
   9.9 Model and name groups
   9.10 Name token groups
   9.11 Exercises

10. ELEMENT DECLARATIONS
   10.1 Content models
   10.2 Included elements
   10.3 Excluded elements
   10.4 ANY content
   10.5 Declared content
   10.6 Other markup
   10.7 A mixture
   10.8 Exercises

11. ATTRIBUTE DECLARATIONS
   11.1 Unique identifiers and cross-referencing
   11.2 Exercises

12. ENTITY DECLARATIONS
   12.1 Parameter literals
   12.2 Default entity
   12.3 External entities
   12.4 Data text
   12.5 Bracketed text
   12.6 Parameter entities
   12.7 Bibliography for Chapter 12

13. PUTTING THE DTD TOGETHER
   13.1 Public text owner identifiers
   13.2 The DOCTYPE declaration
   13.3 The DOCTYPE declaration subset
   13.4 Comments

14. SOME ADVICE ON DTDS
   14.1 Choosing a DTD
   14.2 Tips for writing DTDs
   14.3 Pitfalls
   14.4 Exercises
   14.5 Bibliography for Chapter 14

PART III. CUSTOMIZING SGML

15. THE SGML DECLARATION
   15.1 The document character set
   15.2 Capacity
   15.3 Exercises
   15.4 Scope
   15.5 Syntax
   15.6 Application specific information
   15.7 The system declaration
   15.8 Bibliography for Chapter 15

16. SGML FEATURES
   16.1 Minimization
   16.2 Exercise
   16.3 Formal

17. NOTATION
   17.1 Data content notation
   17.2 Using NOTATION to describe mathematics
   17.3 Using NOTATION to describe graphics
   17.4 Exercise
   17.5 Data attributes

18. MARKED SECTIONS
   18.1 Marking sections as IGNORE and INCLUDE
   18.2 Marking sections as CDATA or RCDATA
   18.3 General advice on using marked sections

19. SHORT REFERENCES
   19.1 Use of short references
   19.2 Definition of short references
   19.3 Example of the use of short references
   19.4 Limitations of short references
   19.5 Exercise

20. RECORD BOUNDARIES AND AMBIGUITIES
   20.1 Treatment of record boundaries
   20.2 Ambiguity type 1
   20.3 Content models with OR and * connectors
   20.4 Mixed content models
   20.5 Bibliography for Chapter 20

PART IV. SPECIAL APPLICATIONS

21. SGML AND EDI
   21.1 What is EDI?
   21.2 EDIFACT
   21.3 The standard commercial invoice
   21.4 Bibliography for Chapter 21

22. SGML AND MATHEMATICS
   22.1 Why describe mathematics with SGML?
   22.2 Characteristics of mathematical notation
   22.3 Who performs the markup of math?
   22.4 Feasibility of S-type notation
   22.5 Some problems with existing mathematics DTDs
   22.6 Re-using mathematical formulas
   22.7 The harmonized math effort
   22.8 Conclusions
   22.9 Bibliography for Chapter 22

23. GRAPHICS AND SGML
   23.1 Bibliography for Chapter 23

24. OTHER ISO TEXT PROCESSING STANDARDS
   24.1 SDIF
   24.2 DSSSL
   24.3 SPDL
   24.4 FONTS
   24.5 HYTIME
   24.6 Conclusions
   24.7 Bibliography for Chapter 24

A. SOLUTIONS TO THE EXERCISES

B. THE SGMLS PARSER
   B.1 Bibliography for Appendix B

C. THE ISO 646:1983 CHARACTER SET

D. HOW TO READ ISO 8879:1986

GLOSSARY

INDEX

-------------------------------------ORDER FORM------------------------------

Ref: ftpser

Please send me:

Practical SGML, Second Edition, by Eric van Herwijnen

_____ copy(ies) HB, ISBN: 0-7923-9434-8
$58.00, Dfl 125.00, GBP 42.95

Payment enclosed to the amount of ___________________________
* Please invoice me
* Please charge my credit card

Name of Card Holder: ______________________________________
Card no.: _________________________________________________
Expiry Date: ______________________________________________
Am. Ex.*   Visa*   Diners Club*   Mastercard*

Delivery address:

Name: ___________________________________________________________________
Address: ________________________________________________________________
         ________________________________________________________________
         ________________________________________________________________
         ________________________________________________________________

Date: ________________   Signature: _______________________________

To be sent to:

Outside North America                   In USA and Canada
KLUWER ACADEMIC PUBLISHERS GROUP        KLUWER ACADEMIC PUBLISHERS
Order Dept.                             Order Dept, Attn: Eric Maki
P.O. Box 322                            101 Philip Drive
3300 AH Dordrecht, The Netherlands      Norwell, MA 02061
Tel: +31-78-524400                      Tel: 617-871-6600
Fax: +31-78-524474                      Fax: 617-871-6528
email: vanderlinden@wkap.nl             email: prepub@world.std.com

Orders from individuals accompanied by payment or authorization to charge a credit card account will ensure prompt delivery. Postage and handling charges will be absorbed by the Publisher on all such orders. Payment will be accepted in any convertible currency. Please check the rate of exchange at your bank. For sales within the Netherlands please add 6% VAT (BTW). Prices are subject to change without notice.

* Delete those that do not apply.