[This local archive copy is from the official and canonical URL, http://www.mozilla.org/xpfe/xulrdf.htm, 1999-02-04; please refer to the canonical source document if possible.]
XUL and RDF: The Implementation of the Application Object Model
Written by Dave Hyatt
Purpose - The purpose of this document is two-fold. The first section of the document describes the motivation and reasoning behind using RDF as the foundation of XUL. This section makes a technical argument both for having XUL in the first place and for using RDF as the underlying implementation of XUL's content model. The second section describes the XUL/RDF architecture itself and outlines enhancements to the XUL language in order to allow the markup language to reference local data and to indicate how and when it would like to be annotatable with local data.
The Case for the XUL/RDF Approach
What is XUL?
XUL stands for "extensible user interface language". It is an XML-based language for describing the contents of windows and dialogs. XUL has language constructs for all of the typical dialog controls, as well as for widgets like toolbars, trees, progress bars, and menus. Where HTML describes the contents of a single document, XUL describes the contents of an entire window (which could itself contain multiple HTML documents).
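To make this concrete, here is a hedged sketch of what a XUL window description might look like. The element names are illustrative only; this document guarantees nothing beyond the constructs named above, plus the <treeitem>, <treecell>, and <button> elements mentioned later on.

```xml
<!-- Hypothetical XUL window: element names are illustrative,
     not a normative vocabulary. -->
<window>
  <toolbar>
    <button value="Back"/>
    <button value="Forward"/>
  </toolbar>
  <tree>
    <treeitem>
      <treecell value="Bookmarks"/>
    </treeitem>
  </tree>
  <progressbar/>
</window>
```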
The HTML content tree structure for a single document is represented as a set of objects that can be accessed and manipulated. This is referred to as the DOM (document object model). In a similar fashion, XUL's content tree for a single window is represented as a set of objects that can be accessed and manipulated. This is referred to as the AOM (application object model).
How are XML DOM/AOM trees represented in NGLayout?
What is RDF?
RDF provides a very general mechanism for representing relationships between disparate types of data. The relationships between data are represented as a directed labeled graph structure. The data itself can be fed in from any number of different sources (e.g., from a local file system, from your bookmarks file, or from some remote downloadable RDF file), and can then be combined into a single graph.
Ok, so after that paragraph many of you probably still don't understand what RDF does, so here's the short and sweet version. RDF can suck up data from different places (like your bookmarks and history or another web site), and it can combine them. This gives you a feature called aggregation, the ability to put completely different kinds of data into the same place. For example, the traditional bookmarks tree view could contain anything from mail messages to local files to maps of other sites.
This same ability to aggregate data leads to another capability of RDF. Because RDF can suck up data from a remote location (e.g., from a downloadable RDF file), and also suck up data from a local source, it can take the remote data and combine it with local data. You'll often hear people refer to this as RDF's local/remote merging capability. This capability is desirable for two reasons.
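The aggregation idea can be sketched in a few lines. This is a toy model only (the actual implementation is C++ and XPCOM, not Python), and the URIs and predicate names are made up for the example: each data source is treated as a set of (subject, predicate, object) triples, and a composite source is simply their union.

```python
# Each data source contributes (subject, predicate, object) triples.
# URIs and predicates below are invented for illustration.
bookmarks = {
    ("urn:folder:personal", "child", "http://mozilla.org"),
    ("http://mozilla.org", "name", "Mozilla"),
}

history = {
    ("urn:folder:personal", "child", "http://w3.org"),
    ("http://w3.org", "name", "W3C"),
}

def aggregate(*sources):
    """Combine several triple sources into one graph."""
    graph = set()
    for source in sources:
        graph |= source
    return graph

def children(graph, node):
    """Follow 'child' arcs out of a node."""
    return {obj for (s, p, obj) in graph if s == node and p == "child"}

composite = aggregate(bookmarks, history)

# The personal folder now aggregates nodes from both sources.
sorted(children(composite, "urn:folder:personal"))
# -> ['http://mozilla.org', 'http://w3.org']
```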
What is local/remote merging good for?
The second reason to have local/remote merging is to save state and/or to record customizations that the user has made to his or her user interface. Imagine a remote file that describes the user's personal toolbar. The downloaded file contains an AOL Instant Messenger button, and the user really doesn't want that on the toolbar. Now since the AOL bookmark was specified by the remote file, the user has no way of going off to the site and changing that file just to get rid of the button (even though the user might wish that he or she could!).
What the user needs to be able to do is make a local annotation to the remote file, e.g., to save somewhere on the local hard drive the fact that the AOL Instant Messenger button shouldn't be on the toolbar. When the remote file is subsequently downloaded, these local annotations are sucked in and superimposed on top of the structure described by the file. The resulting structure is what the user actually ends up viewing.
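Continuing the toy triple-set model from above (illustrative Python, not the real implementation), local/remote merging can be sketched as follows: the remote source defines the toolbar, and a local annotation source records "negative assertions", i.e., arcs the user has deleted.

```python
# Remote file describes the toolbar; URIs are invented for illustration.
remote = {
    ("urn:toolbar:personal", "child", "urn:button:aim"),
    ("urn:toolbar:personal", "child", "urn:button:mail"),
}

# The user removed the AOL Instant Messenger button. That fact is saved
# locally as a deleted arc, since the remote file cannot be edited.
local_deletions = {
    ("urn:toolbar:personal", "child", "urn:button:aim"),
}

def merge(remote, deletions):
    """Superimpose local annotations on the freshly downloaded graph."""
    return remote - deletions

# What the user actually ends up viewing:
view = merge(remote, local_deletions)
# -> only the mail button remains on the toolbar
```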
Well, that's just plain cool! I say forget XUL! Let's just use serialized RDF to represent the UI!
Still I say forget XUL! Let's just extend HTML and use it to represent the UI!
Ok, so forget XUL period, and forget RDF too! We don't need any of this complicated nonsense! Just let us go off and make our widgets and build our applications the old-fashioned way! We're still cross-platform, since we'd only have to write widgets like trees and toolbars once! I say, forget this whole thing! Let's just ship a product!
Ok, ok. So you've convinced me that XUL is the way and the light. But why do you need to use RDF at all? nsIContent, nsIDOMElement, and all of those interfaces mentioned at the beginning of this document are just that: interfaces! Couldn't you just make mail content and bookmarks content etc. and have them implement those interfaces?
That answer is the wrong counter to the question. It implies a concession that some newly architected system, connected directly into the DOM APIs, would be preferable to RDF if only there were time to engineer it. That is simply not the case. To discover why, let's explore what this pluggable content architecture would have to look like in order to match the feature set and functionality we need.
First off, let's consider how we'd figure out what kind of node to instantiate, e.g., a bookmarks folder node vs. a mail folder node. In the XUL, you could use a syntax like <toolbar localData="mailbox:blah"/>, which would specify a URI that pointed to a specific mailbox node.
Our architecture must know how to examine this localData attribute to determine not only which kind of pluggable content to instantiate, but also which specific node to instantiate. To accomplish this, the architecture needs some sort of facility whereby different types of content, e.g., mail and bookmarks, can register themselves as the appropriate content to instantiate for a given URI. We could do this using a registry that maps strings to CIDs, which gives us a story for instantiating our different content nodes.
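A minimal sketch of such a registry, in illustrative Python (the real system would map URI schemes to XPCOM CIDs in C++; the class and scheme names here, apart from the document's own "mailbox:" example, are invented):

```python
# Registry mapping a URI scheme to a factory that can build
# content nodes for that kind of data (a stand-in for CIDs).
registry = {}

def register(scheme, factory):
    registry[scheme] = factory

def instantiate(uri):
    """Pick the right kind of content for a URI and build the node."""
    scheme = uri.split(":", 1)[0]
    return registry[scheme](uri)

class MailElement:
    def __init__(self, uri):
        self.uri = uri

class BookmarkElement:
    def __init__(self, uri):
        self.uri = uri

register("mailbox", MailElement)
register("bookmark", BookmarkElement)

# The localData URI from the markup selects both the content kind
# and the specific node.
node = instantiate("mailbox:blah")
```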
But now let's consider one of these nodes, e.g., nsMailElement. Let's look at the set of interfaces that nsMailElement needs to implement in order to exist in a XUL tree as fully scriptable content: nsIContent, nsIDOMNode, nsIDOMElement, and nsIXMLElement at a minimum. That is four interfaces with over 100 methods combined, a significant portion of which are redundant. Every new kind of content node would have to implement all of these interfaces, and several of the functions would even have identical implementations, i.e., the same code duplicated in every content type's function bodies.
Here's another problem. The various interfaces, nsIContent et al., are not yet solidified. They are likely to change following the first release of NGLayout, and when they do, anyone who implemented one of those interfaces will have to change as well in order to upgrade to the new world.
Statement #1: The two points raised in the previous two paragraphs, namely (1) redundancy of methods in the interfaces as well as a likely code redundancy in the implementation of some of those methods, and (2) the desire to be insulated from the layout DLL should those interfaces change, imply that a layer needs to exist between the pluggable content and the content tree interfaces.
This layer would serve several useful functions. First of all it could streamline the redundancy in the interface methods and present a new interface for pluggable content that was much smaller and easier to plug into than the 4-5 interfaces required if directly implementing the content tree interfaces. Furthermore, should the content tree interfaces change, only this layer would need to be updated. The pluggable content, safe behind this layer, wouldn't have to change at all.
One natural solution to try for implementing such a layer might be inheritance. However, in this XPCOM world, where each type of pluggable content is off in its own DLL, there's no clean way to inherit functionality from some base class implementation, when that implementation must necessarily reside in a different DLL, without introducing a code dependency between all pluggable content and the common base class.
This inability to provide a cleanly inherited system argues for a different approach, namely that all content node implementations be the same kind of object, and that those objects communicate with their pluggable content through this new streamlined interface we talked about earlier. We'll call this new interface a pluggable data source.
Just in case you still aren't convinced that all content nodes should be the same kind of element, consider another problem: how to implement aggregation. Suppose that a bookmarks folder contains a mailbox folder, a composer page template, and a bookmark. In order to achieve aggregation of data, a content node implementation cannot make any assumptions about what kind of children it holds. It can only refer to its child nodes through the various content tree interfaces. What we run into now is the problem of how one local content node knows how to instantiate other kinds of content nodes.
The only way that one content node would know how to instantiate a content node of a completely different type is if it had additional information stored for every child content node that it contained. It would have to consult the registry of data sources in order to instantiate each of its children.
If all content nodes are of the same type this problem can be solved in a cleaner fashion. A single content node could be initialized with its URI by its parent node, it could store its URI in a member variable, and it could use that as a basis for resolving the pluggable data source from which it would obtain its information.
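Here is a hedged sketch of that single-node-class approach, again in illustrative Python with invented URIs: every node is the same type, stores only its URI, and resolves its pluggable data source from the URI's scheme, so a bookmark node can yield a mail node without knowing anything about mail.

```python
# Toy pluggable data sources, keyed by URI scheme. Each maps a node's
# URI to the URIs of its children. All names are invented.
data_sources = {
    "mailbox": {"mailbox:inbox": ["mailbox:msg1", "mailbox:msg2"]},
    "bookmark": {"bookmark:root": ["mailbox:inbox"]},
}

class ContentNode:
    """The single class used for every node in the content tree."""

    def __init__(self, uri):
        self.uri = uri  # stored once, handed down by the parent node

    def data_source(self):
        scheme = self.uri.split(":", 1)[0]
        return data_sources[scheme]

    def children(self):
        # No assumptions about what kind of children this node holds:
        # every child is just another ContentNode.
        return [ContentNode(u) for u in self.data_source().get(self.uri, [])]

root = ContentNode("bookmark:root")
inbox = root.children()[0]   # a node backed by the mail data source
```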
Statement #2: All content nodes that reside in the AOM and that implement the content tree interfaces must be the same class of object.
So now we're on the right track, but there are still some flaws in our architecture. Let's consider another required feature that has heretofore gone unmentioned in this document: the need to take the same set of data and present it as completely different content models. The perfect example of this requirement is the Personal Toolbar. The Personal Toolbar must be able to show up in a tree widget (in which case it has to fake a tree content model, complete with <treeitem> and <treecell> nodes) or on a toolbar (complete with <button> nodes and popup trees attached to folder buttons).
In our current architecture, we have a set of content nodes obtained from any number of data sources. We've solved our aggregation of data problem, but we have no efficient way of taking the data and representing it in different ways, since we'd have to go back to the data sources to re-aggregate everything into a new tree.
"Aha!" some of you might be saying. "Couldn't you just perform a tree transformation on whatever representation you have in memory?" The answer is "Yes, provided there is one single common intermediate representation of the collected and aggregated data to use as the basis for the translation."
"Why?" you ask. Well, let's take this problem to the natural extreme, and assume that there are n total possible representations for the same group of data. Then without some common internal representation of the data, it would be necessary to implement n*n total translators in order to guarantee that for whatever content model you happen to have built that the transformation could be applied. If there is a common intermediate representation of the data in question, then we need only implement n translators, one for each content model representation that can hold our aggregate data.
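The translator-count argument can be put in concrete numbers (a trivial sketch; the n*n figure is the document's own, counting every ordered pair of representations):

```python
def translators_without_intermediate(n):
    # Every representation needs a translator to every other one.
    return n * n        # n*(n-1) if identity translations are skipped

def translators_with_intermediate(n):
    # One translator per representation, to/from the common form.
    return n

translators_without_intermediate(10)   # -> 100
translators_with_intermediate(10)      # -> 10
```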
Yet another example of this problem arises from the need to perform fast sorts on a potentially large number of content items. Using the DOM APIs to walk the content nodes and reorder them would be an act of madness. Furthermore, the original natural order of the items (e.g., in the case of bookmarks) would be lost. When performing the sort, it would be advantageous to be able to form the new sorted content model without losing or disrupting the natural order.
Statement #3: The fact that the same hunk of aggregate data can be represented as any number of different content models (e.g., sorted, or as a toolbar, a tree view, or a menu) implies a need for a common intermediate representation for aggregated pluggable content that exists on top of the pluggable data sources and that exists underneath the content tree nodes that implement the interfaces through which the data is actually exposed.
Let's go back to the sorting problem and consider a hypothetical situation. Suppose we decide we want to cache the sort relationship on our data, so that we don't have to continually resort it as the user hits the column headers in the tree. What we really need in this situation is the ability to take our intermediate representation and form an entirely different set of connections between our data objects. We need the ability to use arcs of a different type to connect our nodes, e.g., rather than chaining the nodes using a "natural order" relationship, it would be advantageous and desirable to be able to add an additional relationship to the nodes, e.g., a "sorted ascending on name" relationship. If we have something like this, then we can perform sorts without tearing down our intermediate representation AND without even losing our original information.
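The multiple-arc idea above can be sketched like so (illustrative Python; node IDs and relationship names are invented): the same nodes stay put, and a cached sort is just one more relationship chaining them, leaving the natural order intact.

```python
# Three bookmark nodes; IDs and names are invented for illustration.
nodes = {"b1": "Zebra", "b2": "Apple", "b3": "Mango"}

# Arcs of different types connect the same nodes.
arcs = {
    "natural": ["b1", "b2", "b3"],   # the as-authored order
}

def add_sort(arcs, relation, key):
    """Cache a sort as an additional relationship between the same nodes,
    without tearing down or reordering the natural chain."""
    arcs[relation] = sorted(arcs["natural"], key=key)

add_sort(arcs, "sorted-ascending-on-name", lambda n: nodes[n])

arcs["sorted-ascending-on-name"]   # Apple, Mango, Zebra
arcs["natural"]                    # unchanged
```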
Another problem that arises in the tree view is the need to reorder columns. If a conventional tree structure is used as the intermediate representation of our data, then a column reordering could result in a potentially enormous and time-consuming walk through the tree to reorder the children of each item. But suppose that instead we store additional information about the tree's columns, namely the order in which they occur; then the act of persistently saving a column reordering takes far less time (O(1) to swap two columns, as opposed to a worst case of O(n), where n is the number of cells in the tree).
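A quick sketch of the column-order-as-metadata idea (illustrative Python; the column names and row data are invented): the display order lives in one small list, so swapping two columns never touches the rows themselves.

```python
# Column order stored once, as tree-level metadata.
columns = ["Name", "URL", "Last Visited"]

# Rows keep their cells keyed by column, in no particular order.
rows = [
    {"Name": "Mozilla", "URL": "http://mozilla.org", "Last Visited": "today"},
]

def swap_columns(columns, i, j):
    columns[i], columns[j] = columns[j], columns[i]   # O(1), rows untouched

def render(row, columns):
    """Lay out a row's cells in the current column order."""
    return [row[c] for c in columns]

swap_columns(columns, 0, 1)
render(rows[0], columns)   # cells now come out URL-first
```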
Some of you might be saying, "Wait a minute. For sorting and column reordering, you have to rebuild the whole content model anyway! Why do you even need to change the intermediate representation of the data?" The answer is simple: persistence. Changes such as sorting and column ordering must be remembered across application sessions, and that means that these changes to the intermediate representation of data must be saved.
Statement #4: The intermediate representation of our data must be more flexible than a tree. It needs the ability to chain its nodes using a variety of distinct relationships in order to efficiently implement actions that permanently modify the data itself.
Last but not least, let's tackle local annotations and local/remote merging. Our architecture must be capable of applying local annotations to remote data, e.g., remembering that a button was removed from a toolbar or remembering the order of the columns in the bookmarks tree view.
This implies that changes that are made to our aggregated intermediate representation of data must be persistent. We must have the ability to add and remove nodes from the tree by saving the changes into the equivalent of a data source that can house the permanent changes (so that they can be sucked in and aggregated like everything else). This implies a need for the ability to make "negative" and "positive" assertions about connections in our tree, i.e., so that we can delete arcs and/or add arcs to the tree.
Statement #5: When a change is made to aggregated data that falls outside of the domain of an existing data source, it must be possible for that change to be persistently remembered by recording the change into a new data source that can then be read in when the data is re-aggregated in future sessions of the application.
The architecture that I have just described, the very architecture that I claim it is most desirable to have in order to implement our required feature set, is a combination of XUL and RDF.
The XUL/RDF Architecture
If you've read this far, you should now have a general idea of what the architecture is like, as well as the motivations for choosing such an architecture. Now it's time to fill in some of the details by mentioning XUL and RDF specifically. A picture (provided by Chris Waterson) of the XUL/RDF architecture is shown below.
Let's start over on the left side of the picture. A XUL document is read into Gecko's parser, and a specialized content sink, known as the XUL content sink, is responsible for constructing the in-memory RDF graph representation of the XUL.
The RDF graph represented by the XUL is then aggregated with the contents of other RDF stores (like bookmarks or history) to construct a composite data source, which is aggregated data still in an RDF graph form.
The resultant aggregated graph is fed into the XUL content model builder. This code is responsible for lazily presenting content nodes based on the RDF graph structure to the application that requests those nodes. Since this presentation happens "on demand", no content node is instantiated until the application specifically requests it, or demands an operation that requires the instantiation of the node to successfully complete, e.g., asking for the number of children of a content node.
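This lazy, on-demand presentation can be sketched as follows (illustrative Python with invented URIs; the real builder works against the RDF graph in C++): a node materializes its children only when something actually asks for them.

```python
# Toy graph: each node's URI maps to its children's URIs.
graph = {
    "urn:root": ["urn:a", "urn:b"],
    "urn:a": [],
    "urn:b": [],
}

built = []   # records which nodes have actually been instantiated

class LazyNode:
    def __init__(self, uri):
        self.uri = uri
        self._children = None
        built.append(uri)

    def children(self):
        if self._children is None:   # build only on first access
            self._children = [LazyNode(u) for u in graph[self.uri]]
        return self._children

root = LazyNode("urn:root")
# At this point only the root exists; asking for the child count
# is the operation that forces the children into existence.
len(root.children())
```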
When the application makes changes to the DOM, those changes percolate down into RDF, which can then decide what it should do with the changes in question. The most common options that RDF has to choose from are as follows:
Referencing local data: The LOCALDATA attribute
[TODO: Outline the naming scheme for local data. Give an example once I actually know what the naming scheme will be.]
Denoting persistence of local annotations to the AOM: The PERSISTENT attribute
One subtree nested inside another subtree can specify a different value for the persistent attribute, thus allowing the XUL writer to specify a default for the whole window, but to still selectively override it for certain subtrees.
Note that even if the persistent attribute is set to false, changes can still be made to a window's content tree. They simply won't be remembered across sessions.
Preventing the Persistence of Attributes: The DISCARDABLE attribute
The discardable attribute takes as its value the attribute name that should be non-persistent. Within the subtree at which the discardable attribute occurs, the attribute in question will be considered to be non-persistent.
The discardable attribute is ignored when used inside a non-persistent subtree.
Note that the persistent and discardable attributes only apply when a change has been made to the composite data source that was not handled by another data source (e.g., bookmarks). If you delete a bookmark, that's always going to be persistent, regardless of what you set these attributes to be. The attributes in question only apply to local annotations that are made outside of the domain of any particular data source.
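A hedged example of how these attributes might combine, based on the semantics described above (the element names and attribute spellings here are illustrative, not a finalized syntax):

```xml
<!-- Hypothetical markup; element names are illustrative. -->
<window persistent="true">
  <!-- Local annotations in this subtree are saved across sessions,
       except for the 'open' attribute, which is non-persistent here. -->
  <tree discardable="open">
    <treeitem open="true">
      <treecell value="Bookmarks"/>
    </treeitem>
  </tree>
  <!-- Overrides the window-wide default: annotations to this
       subtree are not remembered across sessions. -->
  <toolbar persistent="false"/>
</window>
```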
[TODO: More examples]
[TODO: Talk about data sources]
Copyright © 1998 The Mozilla Organization.