[Mirrored from: http://www.hightext.com/IHC96/srn23.htm]

It's All About Architectures

Steven R. Newcomb

Graphic Communications Association
Third International HyTime Conference
August 20, 1996
Seattle, Washington

SGML has always been about information architecture.

Architecture means planning structures.

A structural plan is a model.

SGML requires modeling; one can't express a document instance in the absence of a model for it.

In SGML, a model is called a "Document Type Definition" (DTD).

A DTD is an information architecture.

(This is not news.)

HyTime was designed to extend SGML into the realm of hypermedia.

Certain features are commonly found in hypermedia documents.

HyTime provides a standard syntactic and semantic model for these common features of hypermedia documents.

There were several problems encountered in expressing the design of HyTime in strictly SGML terms.

But HyTime had to be an application of SGML, and not violate SGML in any way.

The main problem:

HyTime could not optimally be expressed as a DTD.

Such a DTD would either be too constraining for many applications, or
it would be so loose that the modeling power of DTDs would be lost.
Even the cleverest use of parameter entities just wasn't going to do the trick. Besides, parameter entities frequently fool everyone but computers.

What to do?

In other words,

HyTime's constructs are useful in many very different architectures (DTDs). We had to allow HyTime constructs to appear in any DTD. (We can't constrain their syntax much, if at all.)

But:

HyTime's constructs are only useful for interchange if the critical aspects of their syntax are used consistently and correctly. (We must constrain their syntax precisely and absolutely.)

What to do?

Two distinct kinds of architectures were needed!

(...and SGML already provided for one of them.)

(1) "Encompassing" architectures -- DTDs --

for control of entire document structure
for using the SGML parser as the enforcer
to guarantee conformance of document instance with a single authoritative model.

Two distinct kinds of architectures were needed!

(...and SGML already provided for only one of them.)

(2) "Enabling" architectures -- "meta-DTDs" --

for control of certain aspects of document structure
using an architecture-specific "engine"
to guarantee conformance of only the architecture-relevant portions of the document instance to the architecture.

A DTD is the formal, parsable expression of an encompassing architecture.

A meta-DTD is the formal, parsable expression of an enabling architecture.

Meta-DTDs look and feel just like DTDs. (There are two minor enhancements to the meta-DTD syntax which Dr. Goldfarb will explain shortly.)

Question: So, what is the difference between a meta-DTD and a DTD?

Answer: There is NO significant inherent difference.

In fact, a DTD can be used as a meta-DTD without being changed. Every DTD is already potentially a meta-DTD. All you have to do is use it that way; you don't have to change a single character.

The only significant difference is in how they are used.

What makes a meta-DTD different is:

A meta-DTD serves as a partial meta-model for the document instance.
A DTD serves as a comprehensive direct model for the document instance.
A meta-DTD places certain semantic and syntactic constraints on document instances...
...IN ADDITION TO the constraints provided by the DTD.
A meta-DTD MUST be comprehensively explained in an auxiliary document -- the "architecture definition document" -- written in natural language.
For any document instance, there can be only one effective governing DTD, but, at the same time, there can be any number of meta-DTDs also governing the instance.
A meta-DTD may impose semantic and syntactic constraints which cannot be validated by SGML parsers.
Architecture engines normally should perform such validations on the output of the SGML parser.
HyTime's "reftype" constraints are an example.
The purpose of this talk is NOT to describe how enabling architectures work.
Several other speakers will do that at this conference.
- (Which is good, because there is no shortage of material to be covered.)
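
As a rough illustration of the points above (one governing DTD, any number of meta-DTDs, and engine-side checks run on the parser's output), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the Element record, the attribute names "HyTime" and "biblio", the form names, the "ref" attribute, and the check_reftype rule are hypothetical stand-ins, not the actual syntax of the standard or the API of any real engine.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One element as an SGML parser might report it (hypothetical shape)."""
    gi: str                                   # generic identifier from the DTD
    attrs: dict = field(default_factory=dict)
    id: str = ""


def architectural_view(elements, arc_attr):
    """Return (form, element) pairs for the elements relevant to one architecture.

    An element participates only if it carries the architecture's attribute;
    the attribute value names the architectural form it claims to conform to.
    """
    return [(e.attrs[arc_attr], e) for e in elements if arc_attr in e.attrs]


def check_reftype(elements, arc_attr, link_form, target_form):
    """Engine-side check that a parser cannot perform: every element of
    `link_form` must refer (via the hypothetical `ref` attribute) to an
    element whose form in the same architecture is `target_form`.
    Returns the generic identifiers of the offending elements."""
    by_id = {e.id: e for e in elements if e.id}
    errors = []
    for form, e in architectural_view(elements, arc_attr):
        if form != link_form:
            continue
        target = by_id.get(e.attrs.get("ref", ""))
        if target is None or target.attrs.get(arc_attr) != target_form:
            errors.append(e.gi)
    return errors


# One document instance, governed by one DTD (the GIs) and two architectures.
doc = [
    Element("section", {"HyTime": "anchor", "biblio": "work"}, id="s1"),
    Element("para", {}, id="p1"),
    Element("xref", {"HyTime": "clink", "ref": "s1"}),
    Element("seealso", {"HyTime": "clink", "ref": "p1"}),
]

print(check_reftype(doc, "HyTime", "clink", "anchor"))
```

Each architecture sees only the elements that declare a form in it, so the "biblio" view of this instance contains just the section element, while the "HyTime" view contains three elements and flags the seealso link whose target declares no form.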

The rest of this talk makes some bold claims for this new enhanced way of using SGML: enabling architectures.

These claims are like a problem statement for the Technical Corrigendum of HyTime.

The solution is the Technical Corrigendum.

HyTime (ISO/IEC 10744) pioneered the idea of enabling architectures, just by being one.

Now, ISO 10744's Technical Corrigendum extends to everyone the ability to create and use enabling architectures.

EXCITING NEW POSSIBILITIES offered by ENABLING ARCHITECTURES

Obviously, we now have a way to express and enforce syntactic and semantic similarities between documents.

Any two element types in any two different documents can explicitly share syntactic and semantic features,

even if their DTDs are different,
even if they don't have the same generic identifier,
even if they have incompatible SGML declarations, and
even if they don't use the same character set.
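
To make the claim above concrete, here is a small Python sketch of two documents whose DTDs and generic identifiers have nothing in common but whose element types declare the same architectural form. The tuple representation of parser output, the element names, and the form name "citation" are all hypothetical; "ArcForm" is used only as a placeholder for whatever attribute a given architecture designates.

```python
# Hypothetical parser output for two documents with unrelated DTDs:
# (generic identifier, attributes) pairs.
doc_a = [("memo-ref", {"ArcForm": "citation"}), ("para", {})]
doc_b = [("Verweis", {"ArcForm": "citation"}), ("Absatz", {})]


def forms_in(document, arc_attr="ArcForm"):
    """Map each architectural form to the local element types that declare it."""
    found = {}
    for gi, attrs in document:
        if arc_attr in attrs:
            found.setdefault(attrs[arc_attr], set()).add(gi)
    return found


def shared_forms(first, second):
    """Forms declared in both documents, with each side's local element names."""
    a, b = forms_in(first), forms_in(second)
    return {form: (a[form], b[form]) for form in a.keys() & b.keys()}


print(shared_forms(doc_a, doc_b))
```

The shared form is found by declaration, not by guessing from similar names, which is the sense in which the similarity is "not a coincidence."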

Now we can declare that syntactic similarities between constructs in different instances are not a coincidence.

(DOCTYPE doesn't really do that.) (!)
SGML information becomes much more (and much more truly) self-describing.
Enabling architectures must be invoked with NOTATION declarations.
Many of today's DTDs will be promoted to meta-DTDs.

With ISO 10744's "SGML Extended Facilities," SGML is now, even more than ever, object oriented. This is not a coincidence.

SGML now has multiple inheritance.
An element can conform to any number of base element types, each of which is in a different base architecture.
Elements and their semantics are now connected just as formally and rigorously as OO classes and their methods.
Enabling architectures (meta-DTDs) will become the primary arbiters of information interchange. (!)
Encompassing architectures (DTDs) will have vastly diminished architectural importance. Their primary purpose will be to allow the SGML parser to function.
Enabling architectures will define enterprise computing.
Every element that has enterprise-wide significance can have an attribute that expresses the meta-GI of that element for all purposes of the enterprise.
(Forward reference to following presentations: This is the "ArcForm" attribute. In the HyTime architecture, it's the "HyTime" attribute.)
Many enterprises will build and use enterprise-wide, application-neutral enterprise engines: the implementation of the enterprise's meta application.
"Plug compatible" enterprises.
Corporations can effectively merge their information resources, temporarily (for a project) or permanently.
Enabling architectures will facilitate and define "virtual corporations."
Human civilization is an enterprise, too.
HyTime is only the first SGML enabling architecture for civilization as a whole.
Many more SGML enabling architectures will be created and used.
Most will simply formalize, in SGML terms, the way things are currently done.
Some will be created more thoughtfully than that.
All will be subject to evolution.
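
The multiple-inheritance point made earlier in this list (one element conforming to base element types in several base architectures at once) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the registry, the "finance" architecture and its forms, and the convention of one attribute per architecture are hypothetical; "clink" and "ilink" are used merely as familiar HyTime form names.

```python
# Hypothetical registry: architecture attribute name -> forms it defines.
ARCHITECTURES = {
    "HyTime": {"clink", "ilink"},
    "finance": {"payment", "invoice"},
}


def base_types(attrs):
    """List every (architecture, form) pair an element claims to derive from."""
    return sorted((arch, attrs[arch]) for arch in ARCHITECTURES if arch in attrs)


def undefined_claims(attrs):
    """Claims that name a form the architecture does not define."""
    return [(arch, form) for arch, form in base_types(attrs)
            if form not in ARCHITECTURES[arch]]


# One element's attributes: it inherits from a form in each architecture.
receipt_link = {"HyTime": "clink", "finance": "payment", "lang": "en"}

print(base_types(receipt_link))
print(undefined_claims({"finance": "refund"}))
```

Attributes that belong to no registered architecture (here, "lang") are simply ignored by every engine, which is what lets the DTD author keep control of the non-architectural parts.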

Because of the rigor and accountability afforded by using the SGML enabling architecture formalism:

the evolution of architectures will be more orderly and less autocratic than some business interests might prefer.
That's good for all of us. The formalism can be used to protect the value of human effort, and, therefore, to enhance the productivity of everyone.
Some early candidates for enabling architectures (off the top of my head)
- locations on this planet
- law
- all the kinds of things EDI is used for now
- the flow of public money
- disclosures of various kinds
- multi-lingualism -- how to handle it
- professional certification requirements
- instructional materials
- IETMs
- treaties
- technical standards
- domain-specific knowledge in science & technology
- chemical formulae
- genetic sequences
industrial processes -- the SGMLification of STEP?
medicine
- personal health records
- public health information & data
- etc.
indexing and information discovery (Topic Maps!)
music (SMDL!)
- opera, football games, and battle management. (Abstractly, these are pretty similar things.)

SIDE EFFECTS of using ENABLING ARCHITECTURES

The productivity of information architects will be:

higher, in terms of how much they can enhance the productivity of others; and
measured in new terms:
not just the number of documents that participate in an architecture, but
the number of architectures a document can participate in;
how few syntaxes are redundantly used to express the same or similar semantics;
how smoothly and painlessly evolution occurs;
how many architectures successfully use their architecture as a base architecture.

SGML architects will be expected to declare the sources of their semantic notions, and to express them as base architectures, rather than just leaving them implicit.

People who use SGML documents will increasingly rely on formal architecture definition documents, and decreasingly on comments in DTDs.

Object-oriented software technology and SGML technology will be regarded as essential to one another.

Objects are useful and application specific, while
Elements are interchangeable and application neutral.
New applications will consist of aggregations of engines, of which SGML supporting software and HyTime engines are only two.
The amount of application-specific software will be minimized by the use of architecture engines.
Validation of a document instance for conformance with its DTD will be seen as a trivial, automatic process, relegated to the status of a packet integrity check performed by communications technology.
Much greater emphasis will be placed on the document instance's conformance to its base architecture(s).
Less reliance on SGML parsers for validation work.
More reliance on architecture engines for validation.
DTDs will not be used as the primary means of ensuring the compatibility of document structure with the expectations of applications.
Document authors will gain control over all non-architectural aspects of the DTDs they use.
Document authors will use applications that permit them to tweak their DTDs interactively, probably while documents are open for writing.
This will greatly diminish the frustration and distaste many authors now feel about using SGML.
The author-driven evolution of DTDs will provide many good architectural ideas.
Even so, architects will have finer control, and more control, over the output of document authors than ever before.
Many new kinds of document validation will appear.
HyTime already provides some very general ones.
Other architectures may require and/or provide any kind of validation at all.
Architects can use these validation features to ensure a quality product.
Authors can use architecture-provided validation features as writing aids. They will tweak DTDs (not meta-DTDs) to get more validation services from their architecture engines.
For example, using HyTime, by adding a lextype attribute.
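
As a loose illustration of an author "tweaking" a DTD to get more validation from an architecture engine, the Python sketch below checks attribute values against lexical types named in a lextype-style attribute. It is not HyTime's lexical-model notation: the regular expressions, the type names "date" and "partno", and the "attribute-name type-name" pair format of the lextype value are all assumptions made for this example.

```python
import re

# Stand-in lexical types written as Python regular expressions.
# HyTime defines its own lexical-model notation; this is not it.
LEXICAL_TYPES = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "partno": re.compile(r"[A-Z]{2}\d{4}"),
}


def parse_lextype(value):
    """Read a hypothetical 'attr type attr type ...' declaration into a dict."""
    tokens = value.split()
    return dict(zip(tokens[0::2], tokens[1::2]))


def lexical_errors(attrs):
    """Return the names of attributes whose values violate their declared type."""
    declared = parse_lextype(attrs.get("lextype", ""))
    return [name for name, type_name in declared.items()
            if name in attrs
            and not LEXICAL_TYPES[type_name].fullmatch(attrs[name])]


good = {"lextype": "issued date part partno",
        "issued": "1996-08-20", "part": "AB1234"}
bad = {"lextype": "issued date", "issued": "August 20"}

print(lexical_errors(good))
print(lexical_errors(bad))
```

An element with no lextype declaration produces no errors, so the extra checking is something the author opts into element type by element type.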

EDITORIAL OBSERVATIONS

Managers need this technology and methodology.
Managers don't have to be programmers to understand it.
Given the right tools, managers don't have to be programmers to USE it, either.
Tools for supporting the tweaking of DTDs in conformance with their base architectures are needed, so that architects can let authors follow their consciences, without worrying about the consequences.
"Adapt or die."
Adaptability is health. Evolution must be demanded and supported.
The adaptability of the structure of corporate records is a useful indicator of the adaptability and overall health of the organization.
By means of enabling architectures, control over the evolution of information architectures can be distributed, and, whenever necessary, redistributed.

NEWS ABOUT ENABLING ARCHITECTURES

CApH (Conventions for the Application of HyTime)
Topic Maps.
Common attributes for multilingualism.
Semantic Declarations
Property Sets.
Access Policy (quid, pro, and quo)
Modification history
SMDL (Standard Music Description Language, ISO/IEC 10743)
MID (Metafile for Interactive Documents)
HTML. Why not?
