Eliot Kimber and Others on "The Power of Groves"


The following document collects several postings from January and February 2000 on the topic of HyTime/DSSSL "groves" [possibly: "Graph Representation of Property Values"]. These posts were sent to XML-DEV by Len Bullard, Eliot Kimber, Ken North, and others under the subject lines "Seeking a Dao of Groves" and "The Power of Groves." The initial posting in this collection, by Eliot Kimber, concerns HyTime as well as groves; the following posts relate principally to groves. Note especially the posting by Eliot Kimber of February 06, 2000 on the historic/architectural trappings of "groves."

For a more complete reference collection, see the section "Groves, Grove Plans, and Property Sets in SGML/DSSSL/HyTime". This topical section contains references to a number of shorter contributions by Paul Prescod, Alex Milowski, Eliot Kimber, Steve Newcomb, and others.


Date: Sun, 30 Jan 2000 18:44:11 -0600
From: Eliot Kimber <drmacro@drmacro.com>
To: Len Bullard <cbullard@hiwaay.net>
Cc: Robin Cover <robin@isogen.com>, XML Dev <xml-dev@ic.ac.uk>
Subject: Re: Seeking a Dao of Groves

Len Bullard wrote:

> Why?  Politics and personalities are what they are, power, tactics
> all of that, but they get us no closer to understanding HyTime. It
> is true that I was around a bit after hyTime emerged, and worked
> with its inventors.  It is true that by the time we got to the
> Vancouver conference, and after some experience, I no longer had
> the foggiest idea what Eliot and Steven were saying in their
> presentations.

And us you, Len :-)
 
> HyTime is brilliant, but brilliance blinds as well as illuminates.
> Sometimes the best position for a light is behind, above and slightly
> to the left.  So, a statement for finding the position:  standards
> derail, in my experience, when the problem to be solved by them is
> not (adequately understood | clearly stated | closed).  I am asking
> Dr. Newcomb, the only one besides Dr. Goldfarb on this list who
> was there at the beginning to verify or refute the following, and
> fill in the rationale.  I would be delighted if Dr Goldfarb would
> help.

Steve hasn't responded yet, so I'll take the liberty of providing some
answers as I believe I can do so with some authority.
 
> 1.  True or False:  hyTime started (little H deliberate) as a music
> standard.

True.  

> The problem(s) to be solved were synchronization and an
> application language for a musical notation.  What requirements of music
> made the hyTime designers move into a larger scope of standardization
> (Integrated Open Hypermedia:  Bibliographic Model)?

The story that I have been told by Drs. Goldfarb and Newcomb is that it
was the U.S. DoD who said "this synchronization stuff is just what we
need for things like battlefield planning and management, but if we put
a music standard in our RFQs, we'll be laughed out of the Pentagon, so
can y'all do us a favor and pull the generic stuff out into a separate
standard? We'd be ever so grateful." Not being idiots, they did.

There was also, I think, the general realization by the committee that
the structures needed to do the synchronization required by music,
opera, ballet, and other forms of time-based stuff are in fact general
to a wide range of problems.  They could define a general facility and
then use it as the underpinnings of a more specialized music
representation standard (Standard Music Description Language, which
Steve and I are trying to finish up now, as a matter of fact).
 
> 2.  True or False:  There originally WAS a hyTime DTD.  Why was
> it abandoned?

True. It was abandoned because a "DTD for HyTime" didn't solve the
problem. You needed to be able to create your own documents, with your
own element types and semantics, that took advantage of the generic
linking and scheduling semantics that HyTime codified. The early drafts
used complex systems of parameter entities but it was quickly realized
that that wouldn't work, so they came up with the idea of architectures
as a way of mapping from specialized document types to the general types
defined by the HyTime standard.

This is, for example, exactly what XLink does, for the same reason.
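The architecture idea can be sketched in a few lines: a document's own element types declare which generic form they map to via a reserved attribute, and a generic processor keys off that form rather than the tag names. This is a hypothetical illustration only; the attribute and form names (`HyTime`, `clink`, `linkend`) are used here in the spirit of the standard, not as a faithful rendering of it.

```python
import xml.etree.ElementTree as ET

# A specialized document type whose <see-also> element maps itself to a
# generic "contextual link" form via a reserved attribute:
doc = ET.fromstring(
    '<manual>'
    '<see-also HyTime="clink" linkend="ch2">Chapter 2</see-also>'
    '</manual>'
)

def architectural_form(elem, arch_attr="HyTime"):
    """Return the generic form an element maps to, if any."""
    return elem.get(arch_attr)

# The generic link processor never needs to know about <see-also>:
links = [(e.tag, e.get("linkend"))
         for e in doc.iter()
         if architectural_form(e) == "clink"]
```

The payoff is exactly the one described above: any document type can opt its own elements into the generic linking semantics without conforming to a fixed DTD.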

HyTime is not, like HTML, simply about creating *a single way* to create
hypertext documents. It's about enabling *any document* to also be a
hypertext document. The second is inherently more complex than the
first, but also much much more powerful.  This is why XLink is cool but
also why it's more difficult to understand and implement than HTML.
 
> 3.  When at its most widely studied, HyTime included an exhaustive
> set of linking and location models.  At this point, the synchronization
> facility was expressed using these.  Why did linking and location become
> the dominant feature of HyTime?

I think linking and addressing became dominant for at least three
reasons:

1. Linking and addressing are of immediate benefit to almost all typical
SGML/XML applications (e.g., technical documentation in all its myriad
forms). Therefore the first applications of any part of HyTime were
going to be in linking applications. That was certainly what interested
me in HyTime initially (we needed to SGMLify the already sophisticated
linking in IBM's BookMaster application and HyTime seemed a very close
match to our existing requirements). Tools like SoftQuad Panorama/Synex
Viewport made it easy to use HyTime-based hyperlinking, at least in
simple ways.  Most SGML practitioners already understood hyperlinking (or
at least the requirement for it), so it was easier for them to see how
HyTime could be of some value there.

2. Scheduling is much more involved than linking and much more difficult
to implement from scratch. In 1992, people were still struggling to get
industrial-strength SGML page-model presentation systems implemented.
What work was being done on the hypermedia parts of HyTime was being
done in universities by people like Lloyd Rutledge. It was also very
abstract and difficult to understand, certainly from the standard alone.
The pool of people who might be interested in it was small at best and
many, if not most, of them were already engaged in more immediate
concerns, like implementing the linking parts of HyTime.  This is still
largely the case, although things like SMIL are helping us to better
understand the problem space.  This is really an area where you have to
have a concrete application to really understand it or even be motivated
to implement it. The folks who understand this part of HyTime best are
still largely engaged in putting bread on their table and getting some
basic infrastructure components implemented. But we are very close to
having the technology base we need to make implementing the scheduling
part of HyTime, if not easy, at least practical.

3. The people who have the most to gain from implementing the scheduling
stuff have the least ability to realize it: educators and archivists.
HyTime is ultimately about providing an interchange/archival
representation form for hypermedia information. This is of vital
importance to educators who need to be able to create rich information
presentations that are information system and presentation platform
independent (that is, apply to hypermedia the same benefits of generic
markup that technical information has enjoyed for years). It is of vital
importance to archivists (how many people realize that there exists
today *no standard way* to archive music except as print scores?). 
However, these are two groups of people who have little money to spend
on implementing standards like HyTime and, because they have little
money, little influence on companies that could implement it.

If you're Macromedia or Adobe, what financial motivation do you have to
implement HyTime? Your biggest customers make untold millions of dollars
selling the stuff they develop with your tools, so much profit that the
cost of authoring and reauthoring is noise, no matter what it costs,
because the authoring cost is a fraction of the total cost, such that
optimizing it further would provide little absolute benefit to the
business. Will you listen to the Disneys and Origins and Dreamworks of
the world or to the people at the Texas School for the Blind who want an
easy and sustainable way to make tomorrow's multimedia curriculum
usable by the visually impaired (how do you run a point-and-click
tutorial if you can't see the screen? How do you learn from it if you
can't hear the words? How do you run it if you are paralyzed from the
neck down?)?

Len and I know that Macromedia Director would be a much more useful and
interesting tool if it could save information in a HyTime-conforming
format, but what motivation does anyone at Macromedia have to even learn
that fact, much less put it into practice? Absolutely none. At least
until the same legislators who are requiring both more multimedia
content in schools and fully-accessible materials for all students
realize that these two requirements cannot be met by current technology
and make the use of HyTime (or its functional equivalent) required by
law.
 
> 4.  True or False:  Groves are a concept borrowed from DSSSL, a
> style language, itself, originally that was altered to include
> Semantics.  

False. Groves are a concept that both DSSSL and HyTime needed in order
to be both compatible with each other and fully defined as standards.
Both DSSSL and HyTime had notions of abstract trees, but neither had
defined them formally. When, in 1995, it was realized that these two
standards could not be completed unless they were based on the same
basic model of what an SGML document is, the two working groups came
together to develop a common technical solution to the problem. Groves
were the result. The technical details came more or less equally from
Charles Goldfarb, James Clark, and Peter Newcomb, with important
contributions by the usual cast of suspects, including myself. 

[As far as I know, the DSSSL standard always included the word
"semantics" in its title (whatever that means--I'll leave that to Sharon
Adler to explain).]

> What requirement in a linking and location standard
> resulted in a unification with the DSSSL groves concept?

The requirement for a common underlying abstract model of what an SGML
document is. Both DSSSL and HyTime depend on the ability to do
addressing. DSSSL so that you can associate style with things, HyTime so
that you can relate things together. To do addressing, you must formally
define the structure of the thing being addressed. To do this you must
have a formalism for talking about structures. That's what groves are.
HyTime had the additional requirement that it had to be applicable to
*all data types*, not just SGML/XML (with groves, DSSSL can also be
applied to any type of data, but the standard as written does not
explicitly recognize this fact). Therefore, it was not enough to define
a formal model for SGML documents, we had to have a formalism from which
the abstract model for any kind of data (including SGML) could be
defined.
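The dependency described here can be made concrete with a toy model: before you can address "the first child of the first child," something must formally define what "child" means. A minimal nodes-with-properties structure plays that role in this sketch; the class and property names are invented for illustration, not taken from the grove standard.

```python
class Node:
    """A node is nothing but a bag of named properties; even its
    children are just a list-valued property."""
    def __init__(self, **props):
        self.props = props

def resolve(node, path):
    """Follow a sequence of (property, index) steps from a start node."""
    for prop, index in path:
        node = node.props[prop][index]
    return node

# Any data that can be cast into this form becomes addressable:
doc = Node(children=[
    Node(gi="chapter", children=[
        Node(gi="title", content="Groves"),
    ]),
])

title = resolve(doc, [("children", 0), ("children", 0)])
```

Because the addressing function is written against the formalism rather than against SGML, the same mechanism works for anything expressed as nodes and properties.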

Another requirement was that both standards needed a *fundamentally
different view* of SGML documents that reflected the optimizations
required by their differing uses. In particular, DSSSL needed to see
processing instructions and individual characters while HyTime wanted to
ignore processing instructions (by default) and treat sequences of
characters as single objects for the purpose of addressing. 

This requirement is reflected in groves through the "grove plan", which
lets you say for a given grove type what things you do or don't want to
see at the moment. This provided a formal way in which DSSSL and HyTime
could both be based on the same SGML data model yet have different,
incompatible default views.  This is a complicating factor for groves
that could not be avoided (and turns out to be of tremendous utility in
practice).
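A grove plan can be pictured as a declared filter over a full property set: the same underlying data yields different default views depending on which properties the plan includes. The property names below are illustrative, not the official SGML property set.

```python
# The complete set of properties a node could expose:
full_node = {
    "gi": "para",
    "content": "Hello",
    "pis": ["<?page-break?>"],  # processing instructions
}

def apply_grove_plan(node, included):
    """Keep only the properties this grove plan wants to see."""
    return {name: value for name, value in node.items() if name in included}

# A DSSSL-style plan keeps PIs; a HyTime-style plan omits them by default,
# giving two incompatible views of the same data model:
dsssl_view = apply_grove_plan(full_node, {"gi", "content", "pis"})
hytime_view = apply_grove_plan(full_node, {"gi", "content"})
```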
 
> disaster.  So, yes, time for some simplifications.  Perhaps
> understanding the way another standard tried to solve the
> same problems is a clarifying experience.

I don't think it's necessarily time for simplifications. What it is
time for is stepping back and providing some basic definitional
framework for all of the W3C specs. Ultimately, defining good
abstractions simplifies the system by centralizing design and
implementation effort and knowledge that would otherwise have to be
replicated everywhere it was needed. Every XML-related specification
needs a formal abstract data model for what XML (and possibly other
stuff) is, and to date each specification has either defined its own
(XLink/XPointer) or left it implicit (DOM). This makes the total much
more complicated than it needs to be.

The reason that the HyTime standard is so big is that it defines all of
the infrastructure needed by the *relatively simple* HyTime
architecture. That is, the linking and addressing parts of HyTime are
relatively simple, taking no more than 80 pages to define (comparable to
XLink, although bigger because it offers more facilities and more syntax
choices). But to define the facilities of HyTime with something
approaching mathematical precision, you need the following:

1. A standardized, generic, abstract data representation facility
(property sets and groves). You need this so that you can define
addressing and other processing semantics without reference to any
particular data type or implementation.

2. A standardized, generic facility for mapping from any document to the
syntax and semantics of the HyTime architecture (the Architectural Forms
Definition Requirements (AFDR) annex). 

3. A standardized definition of the abstract data model for SGML
documents, defined in terms of item 1 (the SGML Property Set annex).

4. A general architecture that defines those things the HyTime
architecture needs that are not specifically related to linking and
scheduling and that are useful to almost any SGML or XML application
(the General Architecture).

These are four of the six annexes that make up the "SGML Extended Facilities".
The other two, Formal System Identifiers and Lexical Type Definition
Requirements, are not strictly required in order to define the HyTime
architecture, but are useful nevertheless and represent long-standing
requirements on SGML.

Obviously, these should all be separate standards, published under
separate cover, and referenced from HyTime, but for historical reasons,
we did it backward and this is what we have. SC34/WG3 (the working group
responsible for the HyTime standard) has discussed doing this breakup at
some point in the future, but it's not being actively pursued at this
time because the people who would do it (me) are too busy with other
stuff just now.

I observe that the W3C is making *exactly the same mistake* we made in
not building the underlying necessary prerequisites first before trying
to do things like define abstract models for XML documents and generic
linking mechanisms. We have the excuse that we didn't know what we were
doing because nobody had done it before. The W3C doesn't have that
excuse.

We are seeing all sorts of problems with the various specs that stem
entirely from the lack of well-defined and agreed upon definitions for
fundamentals. When we published the XML spec we said, as a working group,
"this spec is really not complete without a formal definition of the
abstract representation of XML documents" but we knew we couldn't afford
to delay the spec in order to do that. [Remember that two of the people
chiefly responsible for the development of groves, James Clark and
myself, were founding members of the XML Working Group.] Our expectation
was that doing that definition would be the next order of business, in
large part because we knew from experience that both XLink and XSL
required it. Obviously, history took a different route and we are now
left where we are, with the DOM defining an API over an unspecified data
model, both XLink/XPointer and XSL defining their own hand-waved
abstract models, info set still being worked on, and schemas only just
now waking up to the fact that they need an abstract data model as well.

EVERYONE INVOLVED SHOULD HAVE KNOWN BETTER. Many did. Lord knows I
brought it up at every opportunity.

I know that a lot of people got suckered into thinking that "XML is
easy" and that therefore doing things like XLink and XSL and XML Schema
will be easy too, in explicit contrast to HyTime, which is "hard". But
of course that's a Big Lie. To the degree XML was "easy" (in the sense
that it only took 18 months from the forming of the WG to publishing of
the Rec) it was because it required no invention, it only required
paring away those parts of SGML we really didn't need. That doesn't
require any grasping of complex abstractions or layers of mapping or
obtuse concepts like out-of-line links. But *everything after that
does*. Defining and understanding the abstractions needed to build a
complete system of standards is hard. It's hard intellectually, it's
hard politically, it's hard socially, it's hard from a business
perspective. A lot of people are simply not capable of working with or
understanding abstractions. I've known many crackerjack programmers who
could solve very difficult algorithmic and data structure problems who
could never grasp groves (and not for lack of trying). With HyTime, we
had the significant advantage that we were a fairly small group of
people who worked very well together. We had no commercial interests to
distract us. By contrast, the W3C is, by its nature, a set of very
large groups of very diverse personalities and many competing commercial
interests. It should be absolutely no surprise that progress has been as
slow as it has been. In fact it's a surprise to me that any progress has
been made at all. [Jon Bosak and the W3C leadership have done some
admirable work in refining the W3C processes to try to work around some
of these inherent problems.]

I have no illusion that the world will some day wake up and embrace
HyTime '97 as it exists today. It's big. It's complex, it's hard to
grasp in many ways.  But I do fully expect that what we learned from
doing HyTime will eventually influence whatever gets put into practice
over the next 20 years. I do expect that people will realize that what's
in HyTime *is there because it has to be there* and that if they want a
system of standards that will serve them well for a long time (i.e.,
more than 5 or 10 years) that they will need to build those sorts of
things too. If they choose to borrow directly from HyTime, I will be
very pleased, but if they invent their own stuff, that's ok too. We
learn by doing. I certainly did.

I doubt that much of the infrastructure defined by the HyTime standard
will be used as is--it's too tied to SGML-specific ways of doing things;
it reflects the best thinking of 1995, not 2005. But there is a lot of
good stuff to be learned from what's there, a lot of valuable knowledge
and mistakes that can be had for the low low price of actually reading
the spec (and maybe asking some questions of gurus)
<ftp://ftp.ornl.gov/pub/sgml/wg8/document/n1920/html/>.

Steve mentioned that I resigned from the W3C in protest. That's true. I
also resigned because I had run out of patience trying to get people to
understand the difficult technical issues that we had spent the last 10
years coming to understand as we developed the HyTime and DSSSL specs. I
decided to take my experience and put it to practical use solving
immediate problems for clients who would not only listen to me but pay
me too!

Since that time I've been involved in several HyTime-based projects
where we've put much of it into production and are continuing to do
so. I've helped Steve's company get their GroveMinder[tm] product into
production so we can use grove-based technology to solve people's
problems. I tried to build a demonstration HyTime engine and gave it
away (PHyLIS, www.phylis.com). Because I focus on large-scale systems,
most of what is being developed in the W3C is not even relevant to what
I'm doing at the moment, except to the degree that the information
produced is eventually emitted in XML. But because I already have HyTime
largely implemented, I don't have to wait for XLink to be finished or
for someone to implement XML Schema. But I also realize that I am, at
least today, fairly unique in this regard (although not as unique as you
might think, given that you can buy GroveMinder--I know that, for
example, ISOGEN's worthy competitor Calian has HyTime knowledge and
experience that is almost comparable to ours [they don't employ any
members of the HyTime editorial staff]). And there are people out there
who don't or can't talk about what they're doing with HyTime, but they
are out there.

I fully expect that the W3C's efforts to develop XML-based
specifications will generate important new insights into how to solve
basic problems in information management and, eventually, provide more
powerful tools than I have today and I look forward to that eagerly. But
I certainly don't lose any sleep because I don't have them today. In the
meantime, all I can see is a lot of very bright and hard-working people
working at an essentially futile task and spinning a lot of wheels, all
because the fundamentals are being ignored. It's a shame, but so is
world hunger and there's not much I can do about that either except to
make what small contributions I can.

Cheers,

E.



From eliot@isogen.com Thu Feb 10 11:27:44 2000
Date: Sun, 06 Feb 2000 15:03:03 -0600
From: "W. Eliot Kimber" <eliot@isogen.com>
To: xml-dev@xml.org
Cc: pandeng@telepath.com
Subject: Re: The Power of Groves

Steve Schafer wrote:

> I was rereading some old material on groves, and came across the
> following in a post by Eliot Kimber to comp.text.sgml (it was at the
> end of a paragraph discussing the definition of customized property
> sets for various kinds of data; the full context is available at
> http://xml.coverpages.org/grovesKimber1.html):
> 
> "However, there is no guarantee that the property set and grove
> mechanism is capable of expressing all aspects of any notation other
> than SGML."
> 
> (Notes 440 and 442 in section A.4 of the HyTime spec say much the same
> thing.)
> 
> On the face of it, this is a perfectly sensible thing to say. At the
> same time, however, it is rather disturbing, because it suggests that
> there might exist data sets for which the grove paradigm is wholly
> unsuited. I would certainly hate to expend a lot of effort building a
> grove-based data model for a data set, only to discover part way
> through that groves and property sets simply won't work for that data
> set.

The point of this statement is that we could not at the time *guarantee*
that groves could express all aspects of a given notation. In fact I'm
quite sure that, just like XML, there does not exist a form of data for
which a usable grove representation could not be defined. We did not
have the time or skills to mathematically prove that groves could be
used for everything. I for one did not want to make an absolute claim I
couldn't prove.

It is likely that a grove-based representation would not be *optimal*
for many kinds of data.

But that doesn't really matter because the purpose of groves is to
enable processing of data for specific purposes (addressing, linking,
transforming) and therefore does not necessarily need to express all
aspects of any particular notation, only those aspects that are needed
by the processing for which the grove has been constructed. Different
types of processing might even use different grove representations of
the same notation to suit their own specific needs.

It's important to remember that a grove is an abstraction of data (or
the result of processing data), not the data itself.

Also, whether or not a grove representation is useful or appropriate
depends as much on the implementation as it does on the details of
groves themselves. For example, it might not seem reasonable to
represent a movie as a grove where every frame is a node, but in fact a
clever grove implementation could make that representation about as
efficient as some more optimized format.  For example, you need not
preconstruct all the nodes, constructing them only when needed. Also, as
computers become faster, the cost of abstraction goes down for the same
volume of data. Ten years ago streaming media had to be superoptimized
just to be playable at all. Today we don't need that level of
optimization (what we have been doing is putting more and more
information into the same presentation time (MPEG movies) or doing more
and more compression (MP3)). 
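The movie example can be sketched directly: a grove in which every frame is addressable as a node need not build every node up front, since nodes can be materialized on first access. All names here are illustrative.

```python
class FrameNode:
    """One addressable frame in the movie grove."""
    def __init__(self, index):
        self.index = index

class MovieNode:
    """Root node whose frame children are constructed lazily."""
    def __init__(self, frame_count):
        self.frame_count = frame_count
        self._built = {}  # frame nodes materialized so far

    def frame(self, i):
        if not 0 <= i < self.frame_count:
            raise IndexError(i)
        if i not in self._built:
            # Built only when first addressed, then cached so repeated
            # addressing yields the same node:
            self._built[i] = FrameNode(i)
        return self._built[i]

movie = MovieNode(frame_count=200_000)
node = movie.frame(1234)  # only this one frame node exists so far
```

The abstraction ("every frame is a node") is untouched; only the implementation decides how much of the grove physically exists at any moment.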

It's also important to remember that any form of data representation,
standardized or not, will be optimized for some things and non-optimized
for others. Groves were explicitly optimized for representing data that
is like SGML and XML. It happens that SGML and XML data is more
complicated and demanding than most other kinds of data, so it's likely
that anything that satisfies those requirements will ably satisfy the
requirements of most types of data, certainly most types of structured
data.

But it's no guarantee, at least not without some mathematical proof that
I am not qualified or able to provide (not being a mathematician).

> So the first question is this:
> 
> 1) Does a Universal Data Abstraction exist?
 
> Note that, like a Universal Turing Machine, such an abstraction need
> not be particularly efficient or otherwise well suited to any specific
> task. The only requirement is that it be universal in the sense of
> being capable of representing any conceivable data set (or at least
> any "reasonable" data set). (And no, I don't have a formal definition
> of what "reasonable" would mean in this context; all I can say is that
> the definition itself should be reasonable....) The real importance of
> a Universal Data Abstraction is that it would provide a formal basis
> for the construction of one or more Practical Data Abstractions.

First, let me stress the importance of the last sentence: that is, I
think, the key motivator for things like groves. I want things like the
DOM, which are extremely practical, but I want them bound to a formal,
testable, standardized abstraction.

I know of two standardized universal, implementation-independent data
abstractions: groves and the EXPRESS entities (ISO 10303 Part 11).  Both
of these standards provide a simple but complete data abstraction that
is completely divorced from implementation details. For groves it's nodes
with properties. For EXPRESS it's entities with attributes. Both can be
used to represent any kind of data structure. These two representations
have different characteristics and were designed to meet different
purposes. There is currently an active preliminary work item within the
ISO 10303 committee (ISO TC184/SC4) to define a formal mapping between
groves and EXPRESS entities so that, for example, one can automatically
provide a grove view of EXPRESS data or an EXPRESS view of groves.
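Since both abstractions come down to named values hung off a typed unit, a mechanical translation between the two styles is easy to sketch. This is purely illustrative; the actual grove/EXPRESS mapping is the ISO work item described above, not this code.

```python
# A grove-style node: a class plus named properties.
grove_node = {
    "class": "element",
    "properties": {"gi": "title", "content": "Groves"},
}

def node_to_entity(node):
    """Grove node -> EXPRESS-style entity: the node class becomes the
    entity type, and node properties become entity attributes."""
    return {"entity": node["class"], "attributes": dict(node["properties"])}

entity = node_to_entity(grove_node)
```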

XML *appears* to be a universal data abstraction, but it's not quite,
because it is already specialized from nodes with properties to
elements, attributes, and data characters. This is why Len's recent
comment about an XML representation of VRML not working well with the
DOM is not at all surprising. Of course it doesn't. The DOM reflects the
data model of XML (elements, attributes, and data characters) not the
data model of VRML. This is always the case for XML.

I have observed that the world desperately needs a universal data
abstraction. I think that one of the reasons that XML has gotten so much
attention is that it *looks like* such an abstraction (even though it's
not).

I also don't think it really matters what the abstraction looks like in
detail--what's important is that we agree on what it is as a society.
Once we have that we can stop worrying about stupid details like how to
specify the abstract model for XML or RDF or Xlink or XSL or what have
you: you'll just do it.  

It doesn't matter whether we use groves as is or EXPRESS entities as is
or make something up that we can all agree on. What's important is that
we do it and stick to it.  I think that groves are a pretty good first
cut, but we could certainly improve on them. The advantage that groves
have at the moment is that they are standardized, they have been
implemented in a number of tools, including James Clark's Jade, HyBrick
from Fujitsu, the Python grove stuff from STEP Infotek, my PHyLIS tool,
TechnoTeacher's GroveMinder product, Alex Milowski's now-unavailable
code he wrote before he got bought by CommerceOne, and others I'm sure.
It satisfies immediate requirements well, it has at least two useful
standards built around it, and it's a reasonably good base for future
refinement (about to get under way with the DSSSL 2 project being led by
Didier Martin).

> Assuming that the answer is "yes" (and I have no real justification
> other than optimism to believe that it is), the second question
> follows immediately:
> 
> 2) Does the grove paradigm, or something similar to the grove
> paradigm, constitute a Universal Data Abstraction?

Yes, obviously.
 
> 3) Does there exist any "reasonable" data set for which the grove
> paradigm inherently cannot provide an adequate representation?

You'd have to define adequate, but I don't think so. Groves obviously do
hierarchical stuff quite well. Relational tables are just shallow
hierarchies. Streaming media is more of a problem, but even it can be
decomposed into groups of frames or data units (e.g., movie goes to
scenes, scene goes to frames, frames carry sound and image properties).
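The "shallow hierarchies" remark can be made concrete: a relational table becomes a two-level grove, a table node over row nodes, with column values as row properties. The data and names below are made up for the example.

```python
rows = [
    {"id": 1, "name": "HyTime"},
    {"id": 2, "name": "DSSSL"},
]

# Table -> rows -> cell properties: a hierarchy only two levels deep.
table_grove = {
    "class": "table",
    "children": [{"class": "row", "properties": row} for row in rows],
}

# Addressing then works the same way as for any other grove:
names = [r["properties"]["name"] for r in table_grove["children"]]
```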
 
> When attempting to answer this third question, it is important to
> avoid getting caught up in unwarranted topological arguments. The
> topology of groves may not map onto the topology of a particular data
> set, but that does not mean that that data set is unrepresentable as a
> grove. Consider XML: An XML document consists of a linear, ordered
> list of Unicode characters, yet the XML format is quite capable of
> representing any arbitrary directed acyclic graph.

This is a very important point and it's well worth stressing again. Any
"universal" data abstraction will be suboptimal for many types of data
or data structures. That's what implementations are for, getting the
optimization characteristics needed by specific applications or use
environments. 

The main purpose, in my mind, for a universal abstraction like groves is
to enable reliable addressing (because you have some common basis on
which to define and predict the structures of things) and to enable the
creation of data access APIs that may be individually optimized for
different use scenarios but that are all provably consistent because
they all reflect the same underlying data model.
 
> ========
> 
> On a somewhat related note, I've noticed that in discussions regarding
> the Power of Groves, the arguments by the proponents seem to fall into
> two distinct groups. On the one hand, some people see groves as being
> quite universal in their applicability. On the other, some people talk
> about groves almost exclusively within the context of SGML, DSSSL
> and/or HyTime. As an outsider and relative latecomer to the party, I
> find it difficult to determine whether this dichotomy of viewpoints is
> real, or merely reflects the differences in the contexts in which the
> discussions have taken place. If the schism _is_ real, it would be
> helpful if those sitting on either side of the fence could add their
> thoughts regarding why the schism is there, and why the people on the
> other side are wrong. :)

I think it's largely a function of context. But it's important to
remember that groves were defined as part of a larger standards
framework of which SGML, DSSSL, and HyTime are the chief parts. There is
a sense in which these three standards cover pretty much all of data
representation and access at the abstract level (as opposed to the
implementation level, where we rely on things like APIs, programming
languages, communications protocols, and other building blocks of
working systems).  But groves certainly have general application outside
the use of the DSSSL and HyTime standards. It's just that the ability to
implement those standards is what has motivated most of us who have
implemented groves.

Because groves can be applied to any kind of data (per the discussion
above) it follows that the DSSSL and HyTime standards can be applied to
any kind of data. That is, I can do generalized, consistent linking,
addressing, styling, and transforming of anything I can put into a
grove, which is anything. That covers almost all of what one needs to do
to data in an application. This provides tremendous leverage once you
have the layers of infrastructure built up.
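
As an illustrative sketch of that leverage (all names here are invented;
this is not the ISO 10744 machinery): once data is in node-and-property
form, one addressing routine serves every data type.

```python
# Hypothetical sketch of grove-style generic addressing. "Node",
# "resolve", and the property names are invented for illustration.
class Node:
    def __init__(self, klass, **props):
        self.klass = klass    # node class, e.g. "element" or "shape"
        self.props = props    # property name -> value or list of Nodes

def resolve(node, path):
    """Address a node by (property, index) steps, knowing nothing
    about the semantics of the data being addressed."""
    for prop, idx in path:
        node = node.props[prop][idx]
    return node

# The same resolver works on an XML-like grove...
doc = Node("document", content=[
    Node("element", gi="p", content=[Node("data-char", char="H")])])
# ...and on a VRML-like one, because both are just nodes with properties.
scene = Node("scene", children=[Node("shape", geometry="sphere")])

assert resolve(doc, [("content", 0)]).props["gi"] == "p"
assert resolve(scene, [("children", 0)]).props["geometry"] == "sphere"
```

The point is only that the addressing code never mentions elements or
shapes; it needs just the common node-and-property contract.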

> An example of why I am concerned by this question is given by the
> property set definition requirements in section A.4 of HyTime. The
> definition of property sets is given explicitly in terms of SGML. That
> is, a property set definition _is_ an SGML document. But it seems to
> me that if property sets have any sort of widespread applicability
> outside of SGML, then a property set definition in UML or IDL or some
> other notation would serve just as well (assuming that those other
> notations are sufficiently expressive; I'm fairly confident that UML
> is, but I'm not so sure about IDL).

I agree completely. That is one reason we're working on rationalizing
EXPRESS and groves. As part of that effort, we have created EXPRESS
models for the SGML and HyTime property sets, providing an example of
using a more generalized formal modeling language to specify the data
models the groves reflect. You could, of course, do the same thing with
UML and define a generic algorithm for going from a UML model to a grove
representation of the data objects conforming to that model. One key
problem we ran into with EXPRESS (and would run into with UML) is that
groves have the explicit and essential notion of name spaces (for
addressing nodes by name, not disambiguating names). EXPRESS has no
formal notion of grove-style name spaces, nor does UML. You can define
the appropriate constraints using population constraints (OCL in UML),
but it's not nearly as convenient as in a property set definition
document.
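
A minimal sketch of the name-space notion Eliot mentions (class and
property names invented, not taken from any property set): a property
whose member nodes can be addressed by name as well as by position.

```python
# Hypothetical sketch of a grove-style named node list. Real property
# sets declare which property supplies the name; here it is a parameter.
class NamedNodeList:
    def __init__(self, name_prop, nodes):
        self.name_prop = name_prop
        self._nodes = list(nodes)
        self._by_name = {n[name_prop]: n for n in nodes}

    def by_name(self, name):
        """Address a member node by its name."""
        return self._by_name[name]

    def __getitem__(self, i):
        """...or by position, as with any ordered node list."""
        return self._nodes[i]

# E.g., entities addressed by entity name (data invented for the example):
entities = NamedNodeList("name", [
    {"name": "chap1", "system-id": "chap1.xml"},
    {"name": "chap2", "system-id": "chap2.xml"},
])
assert entities.by_name("chap2")["system-id"] == "chap2.xml"
assert entities[0]["name"] == "chap1"
```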

> Of course, it can be argued that _some_ notation had to be used, so
> why not SGML? My response to that is that I believe that the
> mathematical approach of starting with a few extremely basic axioms
> and building on those as required to develop a relevant "language" for
> expressing a model would be far superior, as it would allow people to
> fully visualize the construction of the property set data model (or
> "metamodel," if you prefer), without getting bogged down in arcane
> SGML jargon. After all, SGML can hardly be described as minimalist.

Again, I couldn't agree more. We have what we have largely because we
were in a hurry and it was expedient (and because it's what James Clark
did and, at the time, the rest of the editors didn't have anything
better to offer). It's too bad that we didn't appreciate the existence
or applicability of EXPRESS at the time, because if we had we very well
might have used it. 

But in any case, it would be easy enough to revise the spec to provide a
more complete and modern formalism. There's no particular magic to the
property set definition document except that, being in SGML/XML form, it
was easy for us to process and work with. 

> (An aside: I believe that a lot of the resistance to acceptance of
> SGML and HyTime has its basis in the limitation of identifiers to
> eight characters, leading to such incomprehensible abominations as
> "rflocspn" and "nmndlist." Learning a completely new body of ideas is
> hard enough without having to simultaneously learn a foreign--not to
> mention utterly unpronounceable--language.)

Almost certainly true. We felt that we had an obligation for backward
compatibility with legacy SGML, which meant that we had to have names
that could be used with the reference concrete syntax.  Not sure that we
could have done otherwise. It's a historical legacy just like 525 scan
lines for TV signals. In practice it probably wouldn't have caused
anyone harm if we had required support for longer names. 
 
Cheers,

Eliot


From cbullard@hiwaay.net Thu Feb 10 11:28:03 2000
Date: Mon, 07 Feb 2000 19:02:58 -0600
From: Len Bullard <cbullard@hiwaay.net>
Reply-To: "cbullard@hiwaay.net" <"Len Bullard"@mail.HiWAAY.net>
To: "W. Eliot Kimber" <eliot@isogen.com>
Cc: xml-dev@xml.org
Subject: Re: The Power of Groves

I need to trim out some of this excellent discussion because the point 
I've been looking for is the applicability of groves, what they are 
good for, and how that might be of use in situations such as we face 
in VRML where multiple spec parents force ugliness into the
implementations. 
In some respects, for VRML, that is spilt milk, but I've done this 
long enough to see standards come and go, so perhaps it is the 
tools, techniques, and practices of standardization that should be 
examined.  We know XML can't do the job in all cases for all kinds 
of application languages.  The *semantic web* and the *fractal 
web* are largely delusions, fun, but not very useful.

<aside>The first idiot that mutters "Internet Time", take this to the 
bank, you get us into the messes.</aside>

>W. Eliot Kimber wrote:
> 
> XML *appears* to be a universal data abstraction, but it's not quite,
> because it is already specialized from nodes with properties to
> elements, attributes, and data characters.

It has an abstract model:  roughly, the InfoSet.

> This is why Len's recent
> comment about an XML representation of VRML not working well with the
> DOM is not at all surprising. Of course it doesn't. The DOM reflects the
> data model of XML (elements, attributes, and data characters) not the
> data model of VRML. This is always the case for XML.

Yes.  But note that I did not say we could not create an XML encoding of 
VRML.  Trees and bushes map as long as the ground can be a root.  No 
mysticism in this:  the problem of this mapping is that there are 
multiple meta contracts governing a shared description of a semantic. 
We have X3D and it is XML; it will simply be ugly because we are 
asked to use wrapper tags to cover the mismatched abstractions. 
If you had replied, "groves could be used to make that abstraction 
more useful" I would have thought groves more interesting for the 
problem:  a meta contract for multiple encodings, perhaps more 
descriptive, perhaps as an adjunct to the abstract IDL.
 
> I have observed that the world desperately needs a universal data
> abstraction. I think that one of the reasons that XML has gotten so much
> attention is that it *looks like* such an abstraction (even though it's
> not).

Begging a bit.  XML is SGML.  It gets a lot of attention because it 
is a) a good solution for a set of knotty problems b) the solution 
espoused by a currently powerful consortium.  It is good news and maybe 
for the wrong reasons, but selah.

> I also don't think it really matters what the abstraction looks like in
> detail--what's important is that we agree on what it is as a society.

Societies do not implement; societies do not agree.  Societies can 
be said to be characterized by agreements, but agreements are first 
and foremost contracts among individuals.

> I think that groves are a pretty good first
> cut, but we could certainly improve on them. The advantage that groves
> have at the moment is that they are standardized, 

That is perceived.  Standards are just paper with names on them.  
If HyTime's history vs. the history of HTML and XML proves anything, 
it proves that the value of the names often outweighs the value of the 
contents to which the names are appended.  Sad but so.

> > 2) Does the grove paradigm, or something similar to the grove
> > paradigm, constitute a Universal Data Abstraction?
> 
> Yes, obviously.

Unproven and I think, unprovable.


> Because groves can be applied to any kind of data (per the discussion
> above) it follows that the DSSSL and HyTime standards can be applied to
> any kind of data. 

I think that is unproven but it is an interesting assertion.  A problem 
is that as long as DSSSL and HyTime remain obscure and impenetrable, 
they will not be adopted for such.  It is impractical to create 
yetAnotherStandardsPriesthood.

> I agree completely. That is one reason we're working on rationalizing
> EXPRESS and groves. As part of that effort, we have created EXPRESS
> models for the SGML and HyTime property sets, providing an example of
> using a more generalized formal modeling language to specify the data
> models the groves reflect. You could, of course, do the same thing with
> UML and define a generic algorithm for going from a UML model to a grove
> representation of the data objects conforming to that model.

Que bueno.  I hope that work makes its way into the process and practice 
of standardization.  IMO, we have many of our current problems because 
standardization as practice and technique is still a black art and 
still overdriven by documentation constraints, e.g., the need to 
stay within the constraints of the prior version once a standard 
is adopted as such.  This is quite an incentive and means to 
perpetuate consultancies and political power bases beyond reasonable 
returns.

> It's too bad that we didn't appreciate the existence
> or applicability of EXPRESS at the time, because if we had we very well
> might have used it.

That strikes me as odd.  We certainly did know about it.  The original 
work on EXPRESS and SGML took place before groves.  That is why the 
PDES model had the SGML string for documents.  Dr Goldfarb was certainly 
aware and in fact, some sought funding to make it possible for him and 
the inventor of EXPRESS to work together on a harmonization.  Like 
many good things CALS money could have done, I suspect that is another 
that got lost in the grime of consultancy.  Those who started that 
effort, as usual, got called back to their companies and told it wasn't 
important.  There is always time but not always support.  Que lastima.

> In practice it probably wouldn't have caused
> anyone harm if we had required support for longer names.

Obscurity bites in most prose, but in a standard, it is a fatal flaw.

If the work on HyTime is not to be lost, present what it is 
good for, how to apply it, and perhaps, show this in the 
framework of real problems.  

Is a grove a means to standardize?  Is it better and why?  For what?
In 50 words or less.

len



From eliot@isogen.com Thu Feb 10 11:28:28 2000
Date: Tue, 08 Feb 2000 09:44:34 -0600
From: "W. Eliot Kimber" <eliot@isogen.com>
To: "cbullard@hiwaay.net" <"Len Bullard"@mail.HiWAAY.net>
Cc: xml-dev@xml.org
Subject: Re: The Power of Groves

Len Bullard wrote:

> >W. Eliot Kimber wrote:
> >
> > XML *appears* to be a universal data abstraction, but it's not quite,
> > because it is already specialized from nodes with properties to
> > elements, attributes, and data characters.
> 
> It has an abstract model:  roughly, the InfoSet.

Yes, it has an abstract model, but what is the abstract model that
underlies the XML abstract model? Within the infoset (or the SGML
property set), "element" is a specialization of "node". It is "node"
that is the base underlying abstract data model from which the
specialized types "element", "attribute", "data character", etc. are
defined. Without this completely generic, universal, base, there is no
way to meaningfully compare different data models to define, for
example, how to map from one to other, because they are not defined in
terms of a common definitional framework.
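
The relationship Eliot describes can be sketched as code (the class
names are invented; this is neither the infoset nor the 10744
definitions, only an illustration of specialization from a common base):

```python
# Hypothetical sketch: "element" is meaningful only as a specialization
# of a fully generic "node".
class Node:
    """The universal base: a class name plus named properties."""
    def __init__(self, klass, **props):
        self.klass, self.props = klass, props

class Element(Node):
    def __init__(self, gi, content=()):
        super().__init__("element", gi=gi, content=list(content))

class DataChar(Node):
    def __init__(self, char):
        super().__init__("data-char", char=char)

def node_count(n):
    """Generic software: traverses any Node without knowing what an
    'element' means, because every subnode is just a Node."""
    return 1 + sum(node_count(c) for v in n.props.values()
                   if isinstance(v, list) for c in v)

tree = Element("p", [DataChar("h"), DataChar("i")])
assert node_count(tree) == 3
```

Two data models defined this way can be compared node for node, because
both bottom out in the same base.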
 
> If you had replied, "groves could be used to make that abstraction
> more useful" I would have thought groves more interesting for the
> problem:  a meta contract for multiple encodings, perhaps more
> descriptive, perhaps as an adjunct to the abstract IDL.

I thought that's what I said. Let me say it explicitly: groves could be
used to make that abstraction more useful.
  
> > It's too bad that we didn't appreciate the existence
> > or applicability of EXPRESS at the time, because if we had we very well
> > might have used it.
> 
> That strikes me as odd.  We certainly did know about it.  

*I* didn't know about it. James didn't know about it (or if he did,
didn't mention it). The original "STEP and SGML" work was about storing
SGML strings as values of EXPRESS entity attributes, not about using
EXPRESS to model SGML. With Yuri Rubinsky's untimely death, the original
driving force behind the effort died. It wasn't until 1998 that Daniel
Rivers-Moore resurrected the effort and convinced me to participate.  

> Is a grove a means to standardize?  Is it better and why?  For what?
> In 50 words or less.

<fifty-words>
Groves, by providing a generic, basic, universal abstract data model,
provide a formal basis for defining data models for specific data types,
e.g., XML, VRML, relational tables, etc. This provides a basis for
standardizing abstract data models and enables the application of
generic processing to data in a provable, testable, way.
</fifty-words>

UML, for example, provides a good data modeling language and a good
implementation specification language, but it doesn't provide a *data
instance representation* abstraction, which is what groves are. But, for
example, it would be possible to define a generic mapping between UML
models and the grove representation of data instances for the model.
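
A hedged sketch of that generic mapping (the model format, class names,
and data are invented stand-ins, not actual UML or a real grove plan):

```python
# Toy stand-in for an abstract class model: class name -> property names.
MODEL = {
    "Book": ["title", "chapters"],
    "Chapter": ["heading"],
}

def to_grove(klass, data):
    """Generically map a model-conforming instance to a node
    (represented here as a plain dict of properties)."""
    node = {"class": klass}
    for prop in MODEL[klass]:
        value = data[prop]
        if isinstance(value, list):   # subnode property
            node[prop] = [to_grove(v["class"], v) for v in value]
        else:                         # intrinsic property
            node[prop] = value
    return node

book = to_grove("Book", {
    "title": "Groves",
    "chapters": [{"class": "Chapter", "heading": "Nodes"}],
})
assert book["chapters"][0]["heading"] == "Nodes"
```

The one `to_grove` routine serves every class in the model, which is
the leverage of putting the node layer between model and implementation.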

Note that this approach is different from the typical approach of using
UML to go straight to implementations from the abstract model. Putting
the additional layer of groves between the abstract data model and
implementation gives you additional leverage and flexibility and enables
the development of generic data management layers like GroveMinder
(imagine a DOM-type application environment that is not limited to the
processing of XML-based data).

Cheers,

E.




From KenNorth@email.msn.com Thu Feb 10 11:28:47 2000
Date: Tue, 8 Feb 2000 12:47:18 -0800
From: KenNorth <KenNorth@email.msn.com>
To: "W. Eliot Kimber" <eliot@isogen.com>,
    "cbullard@hiwaay.net" <"Len Bullard"@mail.HiWAAY.net>
Cc: xml-dev@xml.org
Subject: Re: The Power of Groves

> Len Bullard wrote:
>
> > >W. Eliot Kimber wrote:
> Without this completely generic, universal, base, there is no
> way to meaningfully compare different data models to define, for
> example, how to map from one to other, because they are not defined in
> terms of a common definitional framework.

Object Role Modeling (ORM) is an effective solution for creating conceptual
models. Hopefully, the existing tools for ORM are being updated to support
XML (perhaps even groves).

See www.orm.net.

You said:
"UML, for example, provides a good data modeling language and a good
implementation specification language, but it doesn't provide a *data
instance representation* abstraction"

Terry Halpin said:
"Although the Unified Modeling Language (UML) facilitates software modeling,
its object-oriented approach is arguably less than ideal for developing and
validating conceptual data models with domain experts."

http://www.orm.net/uml_orm.html

"A comparison of UML and ORM for data modeling"
From the proceedings of EMMSAD'98: 3rd IFIP WG8.I International Workshop on
Evaluation of Modeling Methods in Systems Analysis and Design
http://www.orm.net/pdf/orm-emm98.pdf


================== Ken North =============================
See you at SIGS Java Developer Conference (London, March 13-15, 2000)
www.javadevcon.com
http://ourworld.compuserve.com/homepages/Ken_North
===========================================================





From cbullard@hiwaay.net Thu Feb 10 11:29:26 2000
Date: Tue, 08 Feb 2000 21:36:18 -0600
From: Len Bullard <cbullard@hiwaay.net>
Reply-To: "cbullard@hiwaay.net" <"Len Bullard"@mail.HiWAAY.net>
To: "W. Eliot Kimber" <eliot@isogen.com>
Cc: xml-dev@xml.org
Subject: Re: The Power of Groves

This will get tedious.  I apologize, but I think we 
have to tear this down to atoms to get back to the 
original queries about the applicability of HyTime and Groves, 
and before that, why the W3C specs don't seem to cohere.

> Len:  It has an abstract model:  roughly, the InfoSet.
> 
> Eliot:  Yes, it has an abstract model, but what is the abstract model that
> underlies the XML abstract model? Within the infoset (or the SGML
> property set), "element" is a specialization of "node". It is "node"
> that is the base underlying abstract data model from which the
> specialized types "element", "attribute", "data character", etc. are
> defined. 

Roughly

(ELEMENT | ATTRIBUTE | DATA CHARACTER) IS_A NODE

ok.  You have three names and you named them.  

So, then is the claim that the XML <!ELEMENT IS_A infoSet Element 
is not definitionally complete?  Is this yourNames vs theirNames 
or is there a deeper issue here?

> Without this completely generic, universal, base, there is no
> way to meaningfully compare different data models to define, for
> example, how to map from one to other, because they are not defined in
> terms of a common definitional framework.

My problem here is that we seem to be in an MMTT trap.  That is, 
I can point to at least four other languages that claim the 
*name* "node".  The trick is to prove that what each calls a node 
is the same.  

As you say, "not defined in terms of a common definitional framework."

What common definitions?  Are these common definitions or 
common semantics?

> > If you had replied, "groves could be used to make that abstraction
> > more useful" I would have thought groves more interesting for the
> > problem:  a meta contract for multiple encodings, perhaps more
> > descriptive, perhaps as an adjunct to the abstract IDL.
> 
> I thought that's what I said. Let me say it explicitly: groves could be
> used to make that abstraction more useful.

Thank you.  How? By providing "a common definitional framework"?  
We must establish the requirements for the "common definitions".  
OTW, we risk the descent into MMTTHell (reaching for heaven, 
we open the gates of perdition) or we specify a non-closing task.  
Referring back to an earlier email in this multiThread, projects need 
a definition of "done".  Posit:  if we can show that three existing 
metalanguages can be rigorously and completely specified with 
groves, we are done. (Want to add to that anyone?  Requirements 
are negotiable.)

> > > It's too bad that we didn't appreciate the existence
> > > or applicability of EXPRESS at the time, because if we had we very well
> > > might have used it.
> >
> > That strikes me as odd.  We certainly did know about it.
> 
> *I* didn't know about it. James didn't know about it (or if he did,
> didn't mention it). 

Then do your homework next time and tell James to do his.  OTW, how 
can we ask the W3C to keep reinventing the wheel when one can show 
by precedent we don't.  :-)  All of the CALS and most of the serious 
SGML community did know it, because they competed for funding 
when the PDES community proposed a generic document model.  Two 
representatives of the SGML community went to the first meeting to 
propose that SGML be used instead.  The PDES working group was quite 
far along at that juncture.  Fortunately, they were not too enamored of 
what they had and other competitors (notably, compound document 
architectures) were considered more credible, so they were listening.  

> The original "STEP and SGML" work was about storing
> SGML strings as values of EXPRESS entity attributes, not about using
> EXPRESS to model SGML. 

That was the first step taken at the meeting.  Harmonization of the 
models wasn't possible.  There was no "common definitional framework" 
and the jockeying in the room for "who owns the parse" was fairly 
serious.  Storing the SGML string as an entity attribute was a 
compromise.  Most attendees couldn't go further than that.  We 
spent most of that meeting making drawings of our respective 
declarative techniques and trying to understand each other.  

> With Yuri Rubinksi's untimely death, the original
> driving force behind the effort died. 

Actually, the funding went away.  Of the two members, Yuri 
had some control over his.  As the other SGML member there, 
I had to take the customer's decision (USAMICOM) that PDES 
was of no immediate interest with regards to documentation 
because in the near term, no viable technologies were 
emerging and local systems such as IADS had proven that 
markup using a combination of fixed tag sets and stylesheets 
were adequate to cause:  IETMs.

> It wasn't until 1998 that Daniel
> Rivers-Moore resurrected the effort and convinced me to participate.

Good.  It is worthy to do, IMO, and always was.
 
> > Is a grove a means to standardize?  Is it better and why?  For what?
> > In 50 words or less.
> 
> <fifty-words>
> Groves, by providing a generic, basic, universal abstract data model
> provide a formal basis for defining data models for specific data types,
> e.g., XML, VRML, relational tables, etc. This provides a basis for
> standardizing abstract data models and enables the application of
> generic processing to data in a provable, testable, way.
> </fifty-words>

Good opening definition of some requirements for what Groves 
must be proven to provide.  Keep it where we can revisit it. 

We have to show the usefulness, the code worthiness, of defining 
such standards.  This is the goal.  Otherwise, precisely as 
Steven asserts, we base our technology and create our information 
over the whimsy of consortia and powerful companies.  This is not 
to assert conspiracy.  Where two meet to agree, they conspire.  
However, where they express that agreement in terms that all 
understand, they create standards.  HyTime is not understood. 
Therefore, it is largely unused.

That can be changed.  Next, the VRML model for comparison 
to answer Didier.

len



From cbullard@hiwaay.net Thu Feb 10 11:30:04 2000
Date: Wed, 09 Feb 2000 19:25:17 -0600
From: Len Bullard <cbullard@hiwaay.net>
Reply-To: "cbullard@hiwaay.net" <"Len Bullard"@mail.HiWAAY.net>
To: "W. Eliot Kimber" <eliot@isogen.com>
Cc: xml-dev@xml.org
Subject: Re: The Power of Groves

W. Eliot Kimber wrote:
> 
>... given that "element" is_a
> "node", I know things about elements simply because they are nodes. I
> can write generic software that can, for example, do addressing of
> element nodes without knowing anything about the semantics of elements.
> 
> Groves and property sets are purely about using a common language to
> describe the characteristics of data instances so that generic software
> (e.g., a HyTime engine, a DSSSL engine) can process it and so that you
> can write processing standards without having to say anything about
> implementation-level data structures.

> Until you define what a node is, you have no formal basis for
> comparing two different "nodes". If you have such a definition then you
> can define a formal correspondence through a single common form.
> Otherwise you are faced with an N x N mapping.

> Common definitions: the Property Set Definition Requirements annex,
> which defines the basic rules for grove definition and (abstract)
> representation.

Then it seems to me this is a very good definitional tool for creating 
languages.  It would be a means to unify language definitions, proceed 
by common means, and have results which are clear, understandable, and 
implementable.  This would meet the needs as set forward by 
earlier posters about the lack of unity in W3C language standards.

Yet, David Megginson understands groves.  Paul Grosso understands 
groves.  Steve deRose understands groves.  Given such understanding 
and the influence of these and perhaps more of the W3C community, 
why are groves not applied to these problems?  I don't think a 
defense that starts "what problems?" is viable.  Too many reasonable 
and trained specialists on this list say the problems exist.

Are groves a suitable solution?  I need to 
answer Didier with the VRML metamodel so his excellent examples 
can be applied and Robin Cover can in his thorough way, harvest 
the results for study.  We have the basic idea as you have 
presented, but now we should step through some well known problems.
 
> Freakin' bite me. I didn't know about it, for whatever reason. 

Are you sure?  You have no idea where my teeth have been lately. :-)

Ok.

I am *abusing* Eliot a little here because we need to have some 
better understandings in our community and this is an example 
of how easy the misunderstandings perpetuate, and in the rapid 
feedback of lists to lists to lists, amplify.
 
It is VERY possible that even where the community is small 
and as tightly knit as SGML was at that time, for efforts to occur 
that other members are completely unaware of.  It was more 
possible then because we did not conduct anywhere near as 
much business on lists.  There are always the *conspiracies* 
of ego and money, but apart from these,  parallel 
duplicative efforts just happen.  That is a given.

What we should not allow, and only by some concerted effort 
can we stop, is to proceed in parallel efforts without 
common definitions, particularly when these exist, and the 
expertise to use them exists.  It is not the power of 
groves that holds my interest; it is the way of groves.

OASIS owns this list now.  OASIS emerged from the SGML infrastructure.  
It has new blood and the W3C has XML but some part of 
OASIS should honor its origins.   If OASIS wants to be 
a force for open standards, wants to own processes, and wants, 
even desperately needs the resources of XML DEV, then it 
should be very cognizant of the requests from this 
list to ensure open processes.  If the current polity 
cannot do this, then the tools that ISO created such as 
groves to ensure open, coherent standards should be 
used by a more resilient community, dedicated, willing 
and able to carry out the work unafraid of the press, 
the whispers of powers from MIT, the incursions of 
the powers from Silicon Valley and Redmond, unafraid 
of anything but exhaustion.  Megginson can't carry 
the load for SAX, but I too, like Simon, would think twice 
about surrendering it if it goes behind closed doors. 

And I, like Steven and Eliot, believe we should consider 
the tools made available, freely, openly, and by dint 
of years of hard work.  I do not, like others, think that 
we are making this up as we go.  SAX was done 
here.  Xschemas were done here before they were turned 
over to the W3C.  The existence proof refutes the 
position that we are making this up as we go.

You do know how.

Getting DSSSL and SGML to cohere took tremendous effort from 
a small group of people, and even though they had to inform 
each other, and sometimes be reminded of responsibilities, 
they did it.   We have incredible resources that did not 
exist then, yet it seems, we make mistakes which we can 
easily avoid given the resources left to us by that group.   
Using the analogy given by Gates in the Pirates 
of Silicon Valley (whatever..), it is as if XML burgled the house of 
SGML, but took the TV and left the diamond pendant on the 
kitchen counter.

Groves.  Let's keep going in this thread and see if it 
is the jewel.

Back to VRML.  I still need to respond to Didier and 
for me, this list is the after hours hobby I wish 
it were not. 

len

Date: Thu, 10 Feb 2000 13:27:50 -0600
From: "W. Eliot Kimber" <eliot@isogen.com>
To: xml-dev@xml.org
Subject: Re: The Power of Groves

[...]

[Len Bullard]
> It is VERY possible that even where the community is small
> and as tightly knit as SGML was at that time, for efforts to occur
> that other members are completely unaware of.  

History often comes down to the random events influencing a single
person. (Not that I'm suggesting that the development of HyTime was some
sort of major historical event, just that in group efforts, key results
often come down to one or two people. High dependence on initial
starting conditions.)

> Groves.  Let's keep going in this thread and see if it
> is the jewel.

Good. Let me stress that when I use the term "groves", I usually mean
"some technical solution that satisfies the requirements we tried to
satisfy with groves as defined in 10744". I have no long-term investment
in groves *as defined in 10744*. I would be perfectly happy if the W3C
developed from scratch some new way of doing what we did with groves. My
concern is with satisfying requirements, not perpetuating a particular
solution.

Cheers,

E.


From ldodds@ingenta.com Thu Feb 10 11:30:27 2000
Date: Thu, 10 Feb 2000 14:50:51 -0000
From: Leigh Dodds <ldodds@ingenta.com>
To: xml-dev@xml.org
Subject: XML-DEV on Groves (was RE: The Power of Groves)

> -----Original Message-----
> From: owner-xml-dev@xml.org [mailto:owner-xml-dev@xml.org]On Behalf Of
> Peter Murray-Rust
> Sent: 10 February 2000 11:25
> To: xml-dev@xml.org
> Subject: Re: The Power of Groves

<snip!>
> Perhaps JamesC's posting (I am offline so cannot pinpoint it) is
> a starting point.
>
> I do, however, recall an analysis by Henry Thompson of the complete grove
> diagram of a very simple XML file with 1-2 elements types and attributes
> and including a DTD and it was surprisingly complex. I don't think it was
> on this list - probably XML-SIG. Groves are not trivial.

I've just tried to collate a few XML-DEV resources relating to
this topic.

March 97. "A Simple API".
http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Mar-1997/0187.html

http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Mar-1997/0189.html

The relevant example being:
"SGML Groves: A Partial Illustrated Example"
http://www.cogsci.ed.ac.uk/~ht/grove.html

JamesC, posts a 'ReallySimple' API based on groves, which is what
Peter is referring to above.
http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Mar-1997/0192.html

Other random tidbits from a quick search:

http://www.lists.ic.ac.uk/hypermail-archive/xml-dev/xml-dev-Jun-1997/0117.html

http://www.lists.ic.ac.uk/hypermail-archive/xml-dev/xml-dev-Oct-1998/0084.html

http://www.lists.ic.ac.uk/hypermail-archive/xml-dev/xml-dev-Nov-1998/0469.html

http://www.lists.ic.ac.uk/hypermail-archive/xml-dev/xml-dev-Nov-1998/0480.html

Cheers,

L.



From robin@isogen.com Thu Feb 10 11:31:13 2000
Date: Fri, 28 Jan 2000 07:48:32 -0600 (CST)
From: Robin Cover <robin@isogen.com>
To: XML Dev <xml-dev@ic.ac.uk>
Cc: uche.ogbuji@fourthought.com,
    Lars Marius Garshol <larsga@garshol.priv.no>,
    XML Dev <xml-dev@ic.ac.uk>
Subject: Re: Seeking a Dao of Groves

On Fri, 28 Jan 2000 uche.ogbuji@fourthought.com wrote:

[groves]
> > However, it has not been presented in an effective manner. Almost the
> > only piece of information about it that is easy to understand for a
> > programmer is Paul Prescod's tutorial[1]. I have read it, worked with

I can't help you with the Dao of Groves, but I do try to maintain
a collection of references to Grove exposition.  See:

"Groves, Grove Plans, and Property Sets in 
SGML/DSSSL/HyTime"

http://xml.coverpages.org/topics.html#groves

This topical section contains references to a number of
shorter contributions -- which may or may not bring
enlightenment.  See, for example, some clarifying
comments by Steve Newcomb in September 1999:

http://xml.coverpages.org/newcombGroves19990908.html

It's my belief that software based upon the "grove
paradigm" is being used with good success to solve some
very thorny problems.  To see the demos (and applications)
is pretty cool.

It's also true that "groves" represents a high level of
abstraction, and one still needs to (a) invent the property
set for the corresponding representations, and (b) decide
whether the theoretical base of SGML/DSSSL/HyTime/XML
is an ideal modeling heuristic.  As I understand the
situation, the Grove constructs do not conform entirely
to the prevailing "object oriented" paradigm, which does not
privilege a (markup-based) distinction between "content"
and "not-content."  Whatever possible inconcinnities lurk
here (arguably a feature not a bug), I do witness some
interesting grove-based applications.  Most are large
scale, for solving large-scale problems.  Check out
GroveMinder.

- Robin Cover





xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ or CD-ROM/ISBN 981-02-3594-1
Unsubscribe by posting to majordom@ic.ac.uk the message
unsubscribe xml-dev  (or)
unsubscribe xml-dev your-subscribed-email@your-subscribed-address

Please note: New list subscriptions now closed in preparation for transfer to OASIS.


Prepared by Robin Cover for The SGML/XML Web Page archive.



Document URL: http://xml.coverpages.org/groves20000210.html