What's more, companies spend from six to ten percent of their gross revenues putting information down on paper and distributing it. And although networking has made data sharing a reality, there are still many barriers to sharing and revising documents across platforms.
Today's leading organizations are finding ways to successfully reengineer their information management systems to take full, competitive advantage of the rich resources found in documents.
They are turning to open systems that will allow them to exchange information across applications, across platforms, and in a neutral format that will outlast today's technology and still serve them well decades from now. Which explains why more and more companies around the world are turning to SGML.
We invite and encourage further inquiry, either about SGML in general, or about the Interleaf solution. For more information complete the form or call 1.800.955.5323. Or write Interleaf, Inc., Prospect Place, 9 Hillside Avenue, Waltham, Massachusetts, 02154, Attn: Marketing Programs.
So what exactly is SGML, why is it important, and what should you know before choosing an SGML system?
As the world leader in document management software solutions, and in SGML systems, Interleaf has developed this guide to help answer your questions.
We believe that SGML will do for documents what SQL did for relational databases. SGML provides a way to structure information for easy access - and frees that information from the constraints of particular formats, applications, and computing platforms - so that the information can be used by any system.
Just as it was difficult to grasp the full ramifications of the benefits of computing two decades ago, the true power of SGML is only just beginning to be understood. But it is only a matter of time before SGML revolutionizes the way we all view documents and information in the years ahead.
SGML stands for Standard Generalized Markup Language. It is an encoding system that makes it possible to share and reuse the information in documents across software applications and across computing platforms.
SGML was designed to allow various applications to share documents while:
While traditional document conversion programs focus on preserving the document's format, SGML's goal is to preserve the content (the information) and the structure (the relationships among the data). This allows the information to be used and reused by both publishing and non-publishing applications.
SGML offers an open, flexible standard that can accommodate the specific needs of any industry or organization. It can also accommodate any degree of document complexity.
In the 1960s, a team of researchers at IBM was looking for a way for integrated law office information systems to share documents while maintaining a document's structure and content. The team, led by Charles Goldfarb, came up with a method called Generalized Markup Language (GML). Over the course of many years and through the efforts of many people and groups, GML gave rise to SGML.
In 1986, SGML was adopted as a standard of the International Organization for Standardization (ISO 8879).
SGML has since become the leading international standard for data and document interchange in open systems environments. It has widespread acceptance in the automotive, defense, commercial aviation, electronics and telecommunications industries.
Most of the information that is critical for your business is stored in your organization's documents. This is the information that is essential for giving your organization a competitive advantage.
This information has four characteristics.
With SGML, companies can easily access information throughout the organization, and can exchange vital information with customers and suppliers.
What kind of information do we mean? It can vary from company to company, and from industry to industry. But the universal characteristic is that it is vital to your company's ability to compete:
In fact, most industry analysts predict that SGML will be in place in a majority of organizations by 1995.
The definition of a "document" has expanded in recent years, thanks to computer technology.
It used to apply to information printed on paper - for instance, a memo, letter, marketing plan, RFP, bill of materials or customer invoice.
Today, a document is seen as a vessel of information. It can be electronic, and can contain numerous types of information - including text, graphics, data, spreadsheets, CAD, images, video and voice. As a "vessel," the document gives the information shape and structure, and makes the information understandable.
A document is the composite of information coming from multiple sources, brought together for the purpose of communicating.
SGML provides the foundation for a document management system that addresses the need of businesses to access all of the critical information that is stored within the organization, in any kind of document, written in any kind of application, in any kind of operating environment.
There are two fundamental problems with using the information contained in documents:
There are three major benefits of SGML's open system approach. SGML enables:
Because you can access the contents of a document, getting the exact information you need and reusing it in the software of your choice, your organization can take advantage of a far more efficient system.
SGML:
Since its adoption as an ISO standard in 1986, SGML has rapidly become the worldwide standard for document modeling and interchange.
The decision to adopt SGML usually arises out of one of two trends. In a "top down" scenario, an industry or organization, such as OSF or COSE, achieves a consensus on implementing SGML and decides what kind of applications it will develop and use. In regulated industries, this decision is often driven by the fact that the government requires information to be delivered in SGML.
SGML is also gaining widespread popularity within industries from the "bottom up," as the result of a number of individual organizations deciding that SGML can be beneficial, which creates a ripple effect within their profession or industry.
Today, virtually every industry can reap tremendous benefits by better managing the information previously trapped in documents.
In that time, Boeing has had to replace its publishing system four times, and the cost of converting the data for the manual has been greater than the cost of all the hardware and software.
Now, with SGML, the data will remain intact and independent of the systems used to publish the many manuals under the ATA-100 initiative. And the airlines that receive this manual can take Boeing's information and republish it in their own formats.
In a nutshell: by providing a unique competitive advantage.
SGML ensures instant access to business-critical information, improves the speed and ease with which this information moves through an organization, and makes the information accessible to the people who need it.
Here are just a few examples of how organizations are using SGML to gain an advantage in their industry:
In addition, Hewlett-Packard inspired others in the electronics industry (COSE) to adopt SGML because the neutral interchange could save costs on filtering data and the SGML structure allows for easier access to information for users. In fact, the first SGML application chosen by COSE was online help, developed by Hewlett-Packard.
If you can answer "yes" to any of the following questions, you should consider SGML:
Absolutely. An advanced, electronic document management system is critical to many of the applications needed for ISO 9000 certification.
Part of complying with ISO 9000 calls for a structured approach to documentation. SGML documents are structured so that they are interchangeable across organizations, and across the applications used in these environments.
By choosing state-of-the-art SGML software, you can give your organization a solid documentation foundation for achieving ISO 9000 certification.
A good document management supplier for SGML can help you structure and build a set of quality document templates, manage the creation and maintenance of those documents, and electronically distribute the requested documents to your suppliers, customers and throughout your organization. Most of all, a good SGML solution will allow your company to ensure quality and prove it according to the ISO 9000 standards.
For example, all documents must have a "legend," with authorizer, revision level and dates. A good SGML solution can help you enforce the style requirements for your tier 3 documents, using tools such as Interleaf's active document authoring aids for forms fill-in, validation and pick lists.
SGML is the first standard that brings the benefits of database technology to documents. SGML:
A database will take you only so far.
The truth is, only ten percent of all corporate information resides in databases - and much of that is transaction-based data.
Yes. For document-based information to become useful, it needs a structure or organization.
Much like the rows and columns in tabular data such as spreadsheets and relational databases, the document has to have identifiable parts. Unlike tabular information, however, documents are usually far more free-form and complex.
The good news is that documents do, in fact, have an implicit structure. Letters, for example, consist of a date, an address, a salutation or greeting, a body, and a signature. Books contain chapters, sections, subsections, and so on. In this guide, for example, each question has only one answer, but the answer can have several paragraphs.
The tricky part is that computers can't easily understand what is implicit for human beings. So a little human intervention is needed to help the computer read the parts of the document. On the plus side, documents allow us great freedom in organizing the contents.
SGML handles the simplest to the most complex structures, even within the same document. Examples include:
The SGML encoding scheme enables you to identify the logical elements of documents that are of interest to users, which basically means two things:
For instance, in this guide, the structure is that every question has an answer. The format is that the questions are in boldface type, and flush left with the left margin. With SGML, it doesn't matter if this format is changed so that questions are in italics or underlined. What matters is that we provide an answer for every question.
By remaining neutral when it comes to formatting, SGML allows you to present the same information in many forms and formats, such as on paper or online.
SGML separates format from contents. When you separate the content from specific application-based formats, you end up with "open" documents that can be reused in hundreds of ways.
The presentation, or "format," is applied by the application that receives the information. This is where decisions are made to make headings bold, centered, etc. Some applications automatically apply formatting information to the SGML document, so you're always working in WYSIWYG (what you see is what you get). Not only does the document look better, but WYSIWYG helps the human eye spot irregularities in the structure of the document, e.g. subheads in tiny type or a paragraph that is in large type, bold and centered.
Users can share open documents among many different types of applications. Information can be published in a book with one format for paper output. The same information can be reformatted for use online, with perhaps larger type for screen display.
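The separation of content from format described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses XML (a later, simplified profile of SGML) as a stand-in because Python's standard library has no SGML parser, and the element names `guide`, `qa`, `question` and `answer`, as well as both rendering functions, are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured content: the tags say what each piece is,
# not how it should look.
SOURCE = """
<guide>
  <qa>
    <question>What is SGML?</question>
    <answer>An encoding system for sharing document information.</answer>
  </qa>
</guide>
"""

def render_print(root):
    """Paper-style format: questions in upper case, answers indented."""
    lines = []
    for qa in root.iter("qa"):
        lines.append(qa.findtext("question").upper())
        lines.append("    " + qa.findtext("answer"))
    return "\n".join(lines)

def render_online(root):
    """Online-style format: the same content with Q:/A: prefixes."""
    lines = []
    for qa in root.iter("qa"):
        lines.append("Q: " + qa.findtext("question"))
        lines.append("A: " + qa.findtext("answer"))
    return "\n".join(lines)

root = ET.fromstring(SOURCE.strip())
print(render_print(root))
print(render_online(root))
```

Both functions read the same tagged source. Changing how questions are displayed means changing a renderer, never the content.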
Users define an implementation of SGML to identify the important structural elements. That information can then be distributed throughout your organization, and to your customers and suppliers - regardless of the computing environments or applications they use. The structure also provides a useful way to browse the document and locate the data you need.
Now, the information can be easily pinpointed and reused in a number of ways - and in any number of formats.
For instance, the automotive parts data can be published in a service manual. It could be automatically turned into an online electronic database for dealers with hypertext links already embedded. Or it could be used to update a price book or company catalog.
By using SGML, you create modular parts of documents that can be easily shared over networks.
You can take very large, unrelated text and graphics and re-assemble them as you please.
Documents can be built dynamically out of networked databases of information. For example, you could pull together parts descriptions, CAD drawings and pricing information from three departments. Each is tagged and included in an SGML document.
And because the SGML structure permits database mapping, it provides a consistent way to locate the data you need from a document. Significant objects (such as part numbers) can be assigned a unique object location reference in order to retrieve them. Objects can reside anywhere in databases, and can be accessed by referring to their unique identifiers.
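As a rough sketch of this kind of dynamic assembly, the Python example below builds one tagged part entry from three separate sources that share a part number as their unique identifier. Everything in it is an assumption made for illustration: the three dictionaries stand in for departmental databases, the element names are hypothetical, and XML is used in place of SGML because Python's standard library has no SGML tools.

```python
import xml.etree.ElementTree as ET

# Three hypothetical departmental sources, all keyed by part number.
DESCRIPTIONS = {"B-100": "Front brake pad"}   # engineering
DRAWINGS = {"B-100": "b100.cgm"}              # CAD department
PRICES = {"B-100": "24.50"}                   # sales

def build_part_entry(part_number):
    """Assemble one tagged part entry from the three sources."""
    part = ET.Element("part", {"id": part_number})
    ET.SubElement(part, "description").text = DESCRIPTIONS[part_number]
    # The drawing stays in its own file; the document only refers to it.
    ET.SubElement(part, "drawing", {"file": DRAWINGS[part_number]})
    ET.SubElement(part, "price").text = PRICES[part_number]
    return part

entry = build_part_entry("B-100")
print(ET.tostring(entry, encoding="unicode"))
```

Because every piece is retrieved by the same identifier, the entry can be rebuilt whenever any one department updates its data.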
In addition, because SGML is an open system, it can be used with multimedia data, programming code, SGML queries and just about any data type you might encounter now or in the future.
SGML offers a way to identify the structure and organization of each document, so that the elements can be extracted and used on their own as you need them.
By bringing the power of databases to documents, SGML offers great advantages in quickly finding and reusing even complex information from documents:
When an organization decides to adopt SGML, they develop a specific use for it. This is called an SGML "application." While specific applications will vary, every true SGML application contains these basics:
When users tag parts of a document, they can annotate the tags with whatever information they think is useful.
These annotations are known as attribute information, and serve as reminders or fill in other users with background descriptions about the document's structure.
Attribute information adds intelligence to the data, making it easier to reuse that data. Attributes can be specified in the DTD. They can provide information about the tag, such as identifying that tag for cross-referencing, or indicating which information applies to which customers, or naming a graphic entity that should accompany this tag.
With this added intelligence about the document, users can create webs of interrelated information.
For instance, a block of text might be tagged as a step within a procedure. If the tag has associated attributes that define the skill level required to perform the procedure, then that procedure could be made available only to qualified personnel. Likewise, some document elements could be designated only for certain security levels within your organization.
Attributes can differentiate the roles that a particular tag plays, which is useful for searches and for presentation.
Attribute information is frequently used to further refine searches, and is unique to SGML. Attributes can be defined to explicitly state the criteria that will be used to retrieve information.
For example, a price book entry may have attribute information about the product's cost, manufacturer, availability, and the required security level of the reader.
Or information elements can be tagged for the skill level of the end user.
In printed documents, technicians would have to hunt through the pages for the information that is appropriate for their level - whether they're novice, intermediate, or advanced.
But for online documents, SGML attributes can tag the skill level so users can save time and go directly to the information that is appropriate.
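Attribute-based retrieval of this kind might look like the following Python sketch. It is illustrative only: the `procedure` and `step` elements and the `skill` attribute are hypothetical names, and XML stands in for SGML because Python's standard library has no SGML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical procedure in which each step carries a "skill" attribute.
PROCEDURE = """
<procedure>
  <step skill="novice">Check the fluid level.</step>
  <step skill="advanced">Recalibrate the sensor.</step>
  <step skill="novice">Close the access panel.</step>
</procedure>
"""

def steps_for(root, skill):
    """Return the text of every step tagged for the given skill level."""
    return [step.text for step in root.iter("step") if step.get("skill") == skill]

root = ET.fromstring(PROCEDURE.strip())
print(steps_for(root, "novice"))
```

The same filter could just as easily select on a security level or a customer name, since the attribute is part of the markup rather than of any one application.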
A Tag Set is simply a list, or dictionary, of all the permitted document elements (or objects) from chapters, paragraphs, subparagraphs, down to individual words, letters and numbers.
Document elements are the basic building blocks of a document that identify the important information "objects," such as paragraphs, titles, tables, etc. These elements are usually defined by whoever is developing the SGML application.
This information may be assembled in any order that is appropriate for your class of documents. The Tag Set is listed in a hidden portion of the SGML document.
SGML allows you to identify document elements (paragraphs, titles, tables, etc.) as individual objects.
It then allows you to create a set of rules and relationships for structuring the information. This is known as the Document Type Definition (DTD).
For example: If creating an automotive parts document, you might specify that the document must have a chapter title, and that every part number must be immediately followed by a part-description paragraph. This is your DTD.
Any true SGML application that you create will understand and enforce these requirements. If the user does not adhere to the defined structure of the document, the SGML-enabled software product should query or alert the user.
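To make the automotive parts example concrete, here is a minimal Python sketch that checks the two rules by hand. A real DTD would state these rules declaratively and an SGML parser would enforce them; the function below only imitates that behavior. The element names `manual`, `chapter`, `title`, `partnum` and `partdesc` are hypothetical, and XML stands in for SGML because Python's standard library has no SGML parser.

```python
import xml.etree.ElementTree as ET

def check_parts_document(root):
    """Return a list of rule violations; an empty list means the document conforms."""
    problems = []
    for chapter in root.iter("chapter"):
        # Rule 1: every chapter must have a title.
        if chapter.find("title") is None:
            problems.append("chapter is missing a title")
        # Rule 2: every part number must be immediately followed by a description.
        children = list(chapter)
        for i, child in enumerate(children):
            if child.tag == "partnum":
                nxt = children[i + 1] if i + 1 < len(children) else None
                if nxt is None or nxt.tag != "partdesc":
                    problems.append(f"partnum {child.text} has no description")
    return problems

GOOD = ("<manual><chapter><title>Brakes</title>"
        "<partnum>B-100</partnum><partdesc>Brake pad</partdesc></chapter></manual>")
BAD = ("<manual><chapter><partnum>B-100</partnum>"
       "<partnum>B-200</partnum><partdesc>Rotor</partdesc></chapter></manual>")

print(check_parts_document(ET.fromstring(GOOD)))
print(check_parts_document(ET.fromstring(BAD)))
```

The list of problems is the kind of feedback an SGML-enabled product could use to query or alert the user.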
The Document Type Definition (DTD) specifies the rules for the structure of the document.
Organizations have different guidelines for different types of documents, such as memos, proposals, user manuals, etc. Even though some documents may use some of the same element names (e.g. titles, paragraphs, tables), each type of document will have its own structure, and therefore, its own DTD.
For instance, your company's DTD might specify that every internal memo must include a date and the sender's email address. The DTD could also specify that a memo must contain the name of the person sending the memo, or the memo is not complete.
Reports, procedures, trouble-shooting manuals, and telephone books are all types of documents for which you could set up a DTD.
The way your company structures its documents will most likely differ from the way another company might do it. Because SGML is extremely flexible, every company can come up with its own DTDs - or use an industry structure that is similar to your organization's style.
You can specify which elements are required, which elements are optional, and which are repeatable. For example: a chapter must have a title, or a part number must be followed by a part description.
The DTD functions as a template which identifies the class of document - manual, memo, proposal - and the elements in that document, and the order in which they appear. The DTD can be general enough to apply to all documents in a class (e.g. all memos), and should specify the objects all memos have in common, such as the date, subject, addressee, author, and intro paragraph.
The rules in a DTD are written so that computers can read them, and so that a parser can confirm that the document instances tagged to the DTD follow those rules.
The DTD:
Not necessarily. You can think of the tag set as the total list of all the objects that can be used. It's like a dictionary - you may not use all the words, but you can see what's available and decide whether or not to use it.
One organization might have three different types of manuals which are all basically similar, but some elements that appear in one document may not appear in another. They could share common modules, such as tables or lists, but might differ in the content-specific tags.
The DTD specifies how each document gets assembled from its component parts.
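The dictionary comparison can be sketched in Python as follows. The tag set and the sample document are hypothetical, and XML is used as a stand-in for SGML because Python's standard library has no SGML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag set shared by several types of manuals.
TAG_SET = {"manual", "title", "para", "table", "list", "warning"}

def tag_usage(root, tag_set):
    """Report which tags a document uses, which it leaves unused,
    and which are not in the tag set at all."""
    used = {el.tag for el in root.iter()}
    return {
        "used": sorted(used & tag_set),
        "unused": sorted(tag_set - used),
        "unknown": sorted(used - tag_set),
    }

root = ET.fromstring(
    "<manual><title>Setup</title><para>Plug it in.</para><note>Hi</note></manual>"
)
report = tag_usage(root, TAG_SET)
print(report)
```

Unused tags are perfectly acceptable, just like unused words in a dictionary. Only the unknown tags would need attention.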
The SGML Declaration establishes the environment in which the DTD and the document instances that are tagged to it operate.
It lets you specify the base character set used, the data encoding method (ASCII, EBCDIC, and so on), the maximum length of tag names, symbols used for tag descriptions, and other fundamental parameters.
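As a loose illustration of how such parameters constrain everything that follows, the Python sketch below checks tag names against a maximum length. The `DECLARATION` dictionary is a hypothetical, simplified stand-in and not real SGML Declaration syntax; the limit of 8 characters is an illustrative value.

```python
# Hypothetical, simplified stand-in for two SGML Declaration settings.
DECLARATION = {"charset": "ASCII", "max_name_length": 8}

def names_too_long(tag_names, declaration):
    """Return the tag names that exceed the declared maximum length."""
    limit = declaration["max_name_length"]
    return [name for name in tag_names if len(name) > limit]

too_long = names_too_long(["para", "title", "answer-para"], DECLARATION)
print(too_long)
```

A DTD that used the longer name would need a declaration with a larger limit.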
SGML is a markup language for writing application DTDs.
SGML takes the contents of your text and "marks it up" with additional information or notes, such as that this paragraph is an "answer-para" with an attribute of "approved" and a value of "yes."
A DTD defines the structure. This is in contrast to the format markup that most text processing systems have embedded in their text files. In those cases, markup refers to formatting information used by the software to present information to the screen or printer. That markup can only be understood by the product that created it.
With SGML, the content markup allows the system to keep track of what the object is, not what it is used for. This means that all the information about each object is self-contained and accessible to the information management system.
A Document Instance is the actual marked-up text that has been encoded by SGML.
In some cases, the Document Instance, the DTD and the SGML Declaration reside in the same physical file.
In other cases, the DTD and the SGML Declaration may be contained in a separate file that is referred to by a number of Document Instances that are of the same type.
A parser is a piece of software that compares SGML documents to their DTD and determines whether each document meets the requirements of the DTD - whether it is syntactically correct, and whether all the required information is present and in the right order.
The parser also checks to see if the tags occur only where they are permitted, and that all internal references have their targets identified.
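The internal reference check can be approximated with the short Python sketch below. It is not an SGML parser: it uses XML as a stand-in because Python's standard library has no SGML tools, and the `xref` element with its `target` attribute, like the `id` attribute on sections, is a hypothetical naming choice.

```python
import xml.etree.ElementTree as ET

def unresolved_references(root):
    """Return the targets of cross-references that point at no element."""
    known_ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    return [
        ref.get("target")
        for ref in root.iter("xref")
        if ref.get("target") not in known_ids
    ]

DOC = """
<manual>
  <section id="brakes">
    <para>See <xref target="wheels"/> and <xref target="engine"/>.</para>
  </section>
  <section id="wheels"><para>Wheel data.</para></section>
</manual>
"""

root = ET.fromstring(DOC.strip())
print(unresolved_references(root))
```

Here the reference to "wheels" resolves, while the reference to "engine" has no target and would be reported.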
Yes. Because SGML is an open system, you can use SGML to model and create documents that consist of many different object types.
Illustrations, pictures, graphics, charts, multimedia information and even programming code can be brought into SGML as an external file or "entity reference" identified by notation.
SGML provides the organizational structures for referencing those non-textual data entities in their own notations, which may have their own formats. Any external notation is appropriate, including CCITT/4, TIFF, CGM, IGES and other files.
In fact, virtually any information you might encounter now or in the future can be incorporated.
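One way to picture entity references and notations is as a pair of lookup tables, as in the Python sketch below. The tables are hypothetical stand-ins rather than real SGML declarations, and the file names are invented for illustration; the check simply confirms that every external entity names a notation that has been declared.

```python
# Hypothetical declarations: the notations the document knows about, and
# external entities mapped to (file name, notation).
NOTATIONS = {"CGM", "TIFF"}
ENTITIES = {
    "brake-fig": ("brake.cgm", "CGM"),
    "shop-photo": ("shop.tif", "TIFF"),
    "wiring": ("wiring.igs", "IGES"),
}

def undeclared_notations(entities, notations):
    """Return the names of entities whose notation has not been declared."""
    return sorted(
        name for name, (_, notation) in entities.items()
        if notation not in notations
    )

missing = undeclared_notations(ENTITIES, NOTATIONS)
print(missing)
```

The document itself never needs to understand the contents of the external file. It only needs to know which notation the file is in, so the receiving application can hand it to the right tool.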
Most formatting is proprietary, which makes it restrictive. This type of formatting includes typesetting codes, specific font names, line endings and page breaks. SGML ignores these formats, and focuses on the content and structure of the information.
Since SGML is a neutral encoding language, it leaves most formatting issues up to the specific software application.
Some format information is useful in SGML, however - that which transcends any particular display system, like specifying the number of columns in a table. SGML permits, but does not encourage, tags to have specific format significance.
SGML sees a document as a series of objects that get assembled according to a set of rules.
This is a typical approach in the relational database world, but unique in the world of documents.
When you apply SGML to a document, you identify the names of the objects and agree on what the objects are. These elements usually have attributes associated with them.
For example, this paragraph might be seen as a paragraph object. You might name it "Answer-para" and indicate that this paragraph is contained within a "Q & A" structure of the document. One of its attributes might indicate that this is "approved."
Object-oriented technology makes it possible to manage and assemble information elements in extremely small pieces. The more you increase the object-orientation and content identification of your documents, the more benefits you will be able to gain from your system.
The CALS table model is emerging as the de facto standard for encoding tables in SGML. Defining tables in SGML is a complicated procedure, but the CALS model has been refined over a number of years and has been adopted by many companies and industries, including the automotive industry, the ATA, and the OSF. It handles many types of tables that most customers want to create.
You'll know it is time when:
The Gartner Group, "What Is SGML and Why Is It Important?" April 19, 1993
It depends on your objectives.
Again, the biggest return is the competitive advantage that using SGML will give your organization.
With the increased speed and easier access to business-critical information that SGML offers, your organization can improve its products' time-to-market, raise the quality of customer support, increase customer satisfaction, shorten sales cycles and acquire new business before your competitors do.
Organizations who invest in SGML are already enjoying big payoffs in the following ways:
A lot depends on which SGML solution supplier you choose. The more expertise the supplier has in SGML technology, the easier the transition for your organization.
On your part, making the move to SGML requires some analysis of how your organization uses document information, and how you'd like to be able to reuse it in the future for more efficient processes.
To some extent, your company's MIS team will be involved in:
To benefit from SGML, you should use true SGML-compliant tools. If your applications are already SGML-compliant, then your networking and computing investments are in good shape. If not, auto-tagging is often successful in transforming existing documents into an SGML application.
If your current systems are not SGML-compliant, you will have to reengineer your processes and rethink how you use information as you prepare for SGML.
The good news is that the investments you make in preparing for SGML will be amply rewarded by the usefulness of SGML throughout your organization.
Many organizations have found that, by having to examine how they use information and structure documents, they are able to bring a new consistency and logic to their documents, and to greatly extend the use of information that was once limited to a single application.
With SGML, your organization can finally transport document-based information onto an open, global highway that will take you into the future.
The benefits of SGML will be shared by most of the users in your organization:
Some authors do not want to see SGML at all. With the right SGML products, users should require very little training. WYSIWYG SGML editing provides a friendly, easy-to-use interface while hiding the SGML details.
Others prefer to see and use SGML tools for navigation, manipulation and checking the structure of the information. To meet the needs of all authors, you should have the full range of tools for SGML.
It depends on which SGML solution supplier you choose. Your IS group will be involved to some extent in:
Absolutely not. With Interleaf solutions, you can make the transition to a full SGML system over time, at a pace that works for you and your company. In fact, most organizations will always have some non-SGML data in their systems. Therefore, it's important to make sure your SGML system is also good at handling the wide range of non-SGML data that will continue to exist in your organization.
You can run it through a document analysis and conversion software program which will bring your information, regardless of the source, into Interleaf's SGML system.
Interleaf's conversion programs can read any documents, in any formats. Compound documents can be split into textual and non-textual parts, with references from the textual to the non-textual. And you can choose to convert, on an as-needed basis, only the documents that you want to reuse or that will have a long life.
The following tips will help make your transition to SGML easier:
Adapting to an SGML system requires discipline and a change in some habits. But you can succeed if you take it one step at a time.
The best approach is to start small - don't try to do everything all at once. Plan carefully and stay with it. And choose a good SGML supplier - one with the breadth and depth of expertise to help you transition easily to SGML.
Once your document value is increased, your system can evolve painlessly to take advantage of the new information available.
The benefits will increase along the way, and eventually you'll be able to operate the ultimate SGML system to fully leverage your information investment.
Choosing the right supplier is the most important decision you can make when it comes to implementing SGML technology in your organization.
Above all else, your SGML supplier should have a proven track record of implementing document management systems.
And this should include the ability to handle complex applications as your system grows, which you can be sure it will.
Many vendors entering the field of document management systems are limited in their experience - whether it's in distributing documents or in some kind of document publishing application system.
What you need is a supplier with experience in managing documents. This requires an expert understanding of documents, which are a unique information form, and involves assembling a system from myriad information sources, automatically updating and controlling their discrete information components, and permitting intelligent viewing in the distribution.
There are several additional criteria that can help you in selecting the right supplier:
SGML Open has three purposes:
1. Educate the marketplace about the advantages of SGML
2. Provide information to help companies implement SGML
3. Create a forum where vendors can resolve issues in applying the standard to real-world applications
Since the company was founded more than ten years ago, Interleaf has led the electronic publishing industry with innovative document technologies - including ground-breaking products for SGML.
Interleaf offers a complete solution - from conversion to authoring, publishing, and distribution - for companies who seek to implement SGML and take full advantage of their organization's information.
Interleaf products are real, available right now to help solve your information management and distribution needs today. And Interleaf products are open and extensible, allowing your company to be self-sufficient and to take your information into the future as far as you want to go.
Interleaf advantages include:
Since 1988, Interleaf has been at the forefront of SGML technology, leading the way with products that facilitate the use of SGML to leverage information for companies.
We offer a full range of expert services to help your organization design and implement all or part of your SGML solution.
And Interleaf's expertise is available throughout the world. Interleaf support services are located in over 50 offices in the U.S., Canada, Europe, Australia, South America and Japan.
Interleaf's Professional Services Group works directly with our customers to provide:
In addition, Interleaf's Account Management Team works closely with the service professionals to understand your strategic vision, business values and goals.
Together, these professionals work as partners to ensure that your company receives the greatest value and return on your investment.
We hope that this guide has been helpful in providing an overview of what SGML is and how it can benefit your organization.
Of course, it's only the beginning.
If you would like to know more about SGML and how it can work for you, please give us a call at Interleaf. Our telephone number is 1.800.955.5323 (U.S. and Canada).