[This local archive copy mirrored from the canonical site: http://www.cm.spyglass.com/doc/fpi.html; links may not have complete integrity, so use the canonical document at this URL if possible.]

Spyglass Cambridge - Technical Reference

Formal Public Identifier RoadTrip

    <!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0 Strict//EN">

The poor, maligned DOCTYPE declaration. Simply trying to do its job and people seem to have such a time with it:

>What's complicated is all the "-//xxx/" stuff inside the DOCTYPE tag?
>Mere mortals cannot guess what is valid in there, and thus the safest
>things is to leave it out completely. The suggested thing to put there
>has changed so many times over the years and months that it is impossible
>for anyone to know for sure what to put. Who assigns that stuff anyway?
>Does IANA decide? W3C/ERB? You? Me? Captain Kangaroo? Who knows? Not I.

Well, I never thought of myself as more than a mere mortal. Hmmm. Maybe I should think about this...

...on closer examination, I'm not any smarter than the average monkey (and no hair on my back). This is not exactly rocket science. It's the label for the Document Type Definition (DTD) to which (theoretically) the document conforms. It's a required part of every HTML document (see the HTML specification, IETF RFC 1866). It's not just "for SGML"; it's what makes your document HTML. If you leave it out you no longer have an HTML document, you have a text document with some HTML tags in it.

Each piece of public text (such as a DTD) has a unique label called a Formal Public Identifier (FPI). The label for a particular DTD doesn't change one iota once it is published. In fact, that's the whole point. If the text changes, there should be a new, unique FPI assigned to it.

Rev up your engines, we're going on an FPI road trip!

NOTE: The double solidi (pairs of forward slashes) in the FPI are field delimiters.

              1    2       3   4    5   6                7
    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Strict//EN" >
1. "HTML"
The SGML document type being declared: ...
2. "PUBLIC"
This means the following literal is a formal public identifier. Using SYSTEM here would mean the following literal is a system identifier, such as a pathname or, on systems that support it (SP, for example), a URL.
3. "-"
A minus sign means the owner is an unregistered organization. Three forms are possible here: an ISO owner identifier, a registered owner (+), or an unregistered owner (-). The IETF, for whatever reason, isn't registered.
4. "IETF"
This is the unique owner identifier, or ownerID. This is the party responsible for creating and maintaining the object, document, etc. If the DTD comes from IETF, W3C, etc., you'll see their ownerID here.
5. "DTD"
This describes the type of object, called a Public Text Class. It might be a DTD, a file full of entity declarations, a text document, etc. There are a bunch of possibilities; DTDs always get a "DTD" (what else?).
6. "HTML 2.0 Strict"
This is the Public Text Description, which describes the public text. Each piece of public text within the domain of its ownerID must have a unique public text description. Here you'll find the object's name, plus flavors such as version numbers, "strict", etc.
7. "EN"
This is called the Public Text Language, describing the natural language in which the public text is written. It is a two-letter, uppercase-only code from ISO 639. In this case it is "EN" (English).
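
If you'd rather see the road trip as code, here is a minimal Python sketch that pulls the FPI out of a DOCTYPE declaration and splits it on the double-solidus delimiters into the fields described above. The names (FPI, DOCTYPE_RE, fpi_from_doctype, parse_fpi) are made up for illustration and come from no standard or library. The regular expression is a deliberate simplification that only handles a PUBLIC identifier in double quotes, and the ISO branch assumes the owner identifier simply starts with "ISO". This is not a conforming SGML parser.

```python
import re
from typing import NamedTuple


class FPI(NamedTuple):
    """Fields of a formal public identifier (hypothetical container)."""
    registration: str  # "+" registered, "-" unregistered, "ISO" for ISO owners
    owner: str         # ownerID, e.g. "IETF"
    text_class: str    # Public Text Class, e.g. "DTD"
    description: str   # Public Text Description, e.g. "HTML 2.0 Strict"
    language: str      # Public Text Language, e.g. "EN"


# Simplified pattern: document type name, the PUBLIC keyword, one quoted literal.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<name>\S+)\s+PUBLIC\s+"(?P<fpi>[^"]*)"\s*>',
    re.IGNORECASE,
)


def fpi_from_doctype(declaration: str) -> str:
    """Return the quoted FPI literal from a DOCTYPE declaration."""
    match = DOCTYPE_RE.search(declaration)
    if match is None:
        raise ValueError(f"no PUBLIC DOCTYPE declaration found in {declaration!r}")
    return match.group("fpi")


def parse_fpi(fpi: str) -> FPI:
    """Split an FPI on its "//" field delimiters."""
    fields = fpi.split("//")
    if len(fields) == 4 and fields[0] in ("+", "-"):
        # Registered or unregistered owner: the sign is a field of its own.
        registration, owner, text, language = fields
    elif len(fields) == 3 and fields[0].startswith("ISO"):
        # Assumed ISO form: no sign field, the owner identifier comes first.
        registration = "ISO"
        owner, text, language = fields
    else:
        raise ValueError(f"not a recognizable FPI: {fpi!r}")
    # The first word is the Public Text Class; the rest is the description.
    text_class, _, description = text.partition(" ")
    return FPI(registration, owner, text_class, description, language)


if __name__ == "__main__":
    doctype = '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Strict//EN" >'
    print(parse_fpi(fpi_from_doctype(doctype)))
```

Run against the declaration at the top of this page, the sketch yields "-" for the registration, "IETF" for the owner, "DTD" for the class, "HTML 2.0 Strict" for the description, and "EN" for the language.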

Any HTML book worth reading (and *any* SGML book) will discuss this stuff in detail. Martin Bryan's "Author's Guide to SGML" [BRYAN] is good if you're not looking to spend too much money. If you want to fork out US$100, get Goldfarb's "SGML Handbook" [GOLD90] and begin your intense meditation on words from the prophet hisself.

--Murray


[GOLD90]
"10.3 Comment Declaration" (pp.390-1), Goldfarb, Charles F., "The SGML Handbook", Yuri Rubinsky, Ed., Oxford University Press, 1990.
[BRYAN]
"SGML: An Author's Guide to the Standard Generalized Markup Language", Bryan, Martin, Addison Wesley Publishing Company, 1988.

©1997 Spyglass, Inc. All Rights Reserved.