Semantic Extensions to DSSSL to Handle Trees, or, Why Isn't DSSSL a Tree? from: http://cs.nyu.edu/phd_students/fuchs/DSSSL.sgm Semantic Extensions to DSSSL to Handle Trees <author prime="1"> <fname>Matthew <surname>Fuchs <address> <affil>Walt Disney Imagineering <aline>1401 Flower Street, P. O. B. 2502 <city>Glendale <state>CA <postcode>91221 <email>matt@wdi.disney.com <Abstract> <para> We consider the syntax and semantics of the <ACRONYM><TERM>TL</TERM><DD><PARA>Transformation Language</PARA></DD></ACRONYM> in the <ACRONYM><TERM>DSSSL</TERM><DD><PARA>Document Style Semantics and Specification Language</PARA></DD></ACRONYM> specification<bibref>DSSSL96</bibref>. At present <ACRONYM><TERM>TEs</TERM><DD><PARA>Transformation Expressions</PARA></DD></ACRONYM> are less than first-class language objects - they must all reside at the top level, and cannot be manipulated like other DSSSL/Scheme objects. In particular, there is no means of passing information among TEs, so one TE cannot take advantage of information derived by another, such as passing data about parent nodes to direct the transformation of child nodes. We propose extending the DSSSL syntax to allow a DSSSL program to better exploit the tree-like nature of the source grove by providing a semantics for nesting query expressions, allowing information to be passed around while retaining DSSSL's functional nature. The TEs would also come closer to being first-class objects. We suggest these extensions will make DSSSL programs easier to write and probably easier to optimize. </para> <biography> <bio> <fname>Matthew <surname>Fuchs <para> Dr. Fuchs is currently a member of the Research and Development staff at Walt Disney Imagineering where he has worked on information interchange for high-end VR, among other projects. Before that he was at WVU's Concurrent Engineering Research Center examining the use of SGML for exchange of structured information among computer and human agents in the Internet. Dr. Fuchs received his PhD. from NYU's Courant Institute in September, 1995. His dissertation, detailing his efforts in mobile distributed objects, distributed garbage collection, multi-threaded multi-user applications and the use of SGML for describing platform-independent user interfaces, won the department's award for dissertation of highest scholarly merit and has been nominated for the ACM's Doctoral Dissertation Award (and will have either won or lost by the time you read this). <Section><title>Introduction <para> The DSSSL TL is intended as the future standard means by which SGML documents are transformed, possibly replacing proprietary languages currently used for this. Although still unimplemented, we can, nevertheless, consider the impact its syntax and semantics, such as the decision to use a functional language, will have on building applications. Crucial to the TL are the Transformation Expressions (TEs) which select and transform nodes in the source grove. The DSSSL committee has chosen a syntax which forces all these TEs to occur at the program's top level and further limits them from being first class objects - they cannot be addressed or stored, for example. This flat architecture contrasts with the tree architecture of the document, or grove, being transformed, making it difficult to share information within the program. In particular, this might lead to bloated code, where the same information needs to be calculated several times. <para> Our own previous experience with the Scheme-based transformation of SGML documents suggests extending DSSSL syntax and semantics to allow the nesting of TEs, so information can be shared among the TEs during the transformation of a subtree in a grove. This will eliminate some redundant code and better show the relationships among transformations. Furthermore, if one were to attempt to optimize a program in the current DSSSL language, transforming the flat program structure into a tree structure, as we propose, would be a reasonable approach. <para> The rest of the paper will first discuss the drawbacks we see in the current DSSSL specification regarding this issue (we do not consider DSSSL's query language, style language, or many grove operators), then discuss an extended, tree-oriented approach, and finally show how this other approach can be reconciled with some of the other considerations of DSSSL. <section><title>Analyzing the Current DSSSL Transformation Language <para> There are three elements of the current version of DSSSL which we will discuss: <seqlist> <li><para>The flat nature of the TL syntax. <li><para>The functional nature of the TL. <li><para>The second class status of TEs. </seqlist> <para> In the current DSSSL specification, all TEs occur at the syntactic top level, leaving a completely flat structure. The desired semantic effect is equivalent to the following sequence: <seqlist> <li><para>All query expressions are applied to all nodes in the input grove independently, resulting in a list of node/TE pairs. <li><para>The list is sorted by node and transformation expression importance. If a node appears more than once on the list, all but the most important copy is dropped. <li><para>For each pair, the TE is applied to the corresponding node, resulting in a list of sub-grove plans. <li><para>The sub-groves are reassembled according to the subgrove expressions, resulting in the transformed grove. </seqlist> <para> This is not necessarily the manner in which a DSSSL program would actually be executed, but any actual execution must render equivalent results. The DSSSL syntax itself provides little alternative to this default semantics: <seqlist> <li><para>All transformation expressions are at the top level, so they are globally visible. There is no syntactic way to limit a TE's range by hiding it through encapsulation. <li><para>Because the language is functional, it provides no way to change the value of identifiers, so there is no way for query expressions to "communicate" with each other through shared state. <li><para>Transformation expressions cannot be assigned a name which is visible throughout the program, so they cannot be manipulated by a program during execution. </seqlist> <para> As many DSSSL operators are linear or quadratic in the size of the document or of the DSSSL program indicates that a naive execution strategy could be very inefficient. Evaluating all queries on all nodes, when most nodes only fit a small subset, is already quite profligate, but a query may include operations might examine a set of nodes proportional to the log of the document size, such as number of parents of a node, or event proportional to document size, such as how many siblings a node has. Furthermore, as there is no syntactic way to preserve these answers across queries, a naive implementation will need to recalculate them repeatedly, resulting in even worse performance. It is fair to say that the efficiency of DSSSL will rest heavily on the optimizations which can be brought to bear on the query expressions. However it is not just the single query expressions which need to be optimized individually, it is their combinations as well. Otherwise performance will be only marginally improved. (In all fairness, it may be that pathological structures appear rarely in documents intended principally for human consumption, but we need to also contemplate more complex structures passed by computational agents.) <para> Because of the piecewise independence of each TE and node, DSSSL prescribes no particular path through the source grove from root to leaf, and it is not necessary to have the grove for an entire document on hand to parse some small section of it. The former property allows a DSSSL program to make headway while the database loads the entities of a document in any order, the latter allows the database to load only the portion necessary for the task at hand. In a language with mutable variables, such as C/C++ or regular Scheme, the TE for one node might alter some variable used by another, implying the order in which nodes are visited can affect the outcome, which would negate the first advantage, and an explicit traversal path would reduce concurrency and could require certain nodes to be visited even if they were not of interest. <para> The efficiency of this approach depends heavily on the locality of the query and transformation parts of the TEs. The greater the locality (i.e., the fewer, if any, other nodes referenced) the simpler each query will be; the worse the locality, the greater the number of other nodes referenced, and the more complex each expression will be. Where there is bad locality, the expression will not terminate until other nodes from the grove are loaded to be examined by the program. For example, if a node needs to know its nesting level, several ancestors may need to be examined to determine this single value, reducing concurrency. Because of DSSSL's functional nature, however, once this is done, the derived information is lost to other TEs (and even to the same TE examining another node), and must be regenerated again. <para> If one were to try to optimize this whole process, there would be two primary approaches: <seqlist> <li><para>Where there is a database, optimize the database to handle particular queries. This, however, can only be successful for particular DSSSL programs, and only for documents already present in the database. <li><para>Transform the flat program into a tree oriented one, where common expressions are pulled out of queries in a hierarchical fashion, so information about parent nodes is kept for the children. </seqlist> This second approach essentially means transforming the DSSSL program into a program similar to what we will describe next. </para> <section><title>Towards an Alternative Approach <para> We propose a different semantic architecture for DSSSL to avoid these difficulties while hopefully preserving the advantages. The hallmark of this architecture is to exploit the tree nature of the grove. <para> In our work with mobile objects<bibref>Fuchs1</bibref>, distributed multi-user graphical interfaces<bibref>Fuchs2,Fuchs3</bibref>, and agent communication languages<bibref>Fuchs4</bibref>, (all using Scheme extensively) we have needed to both manipulate concurrent nested hierarchical structures and parse SGML documents used for describing GUIs and passing information among distributed agents. These applications never aspired to have the exhaustive set of node operators in DSSSL but were analogous in the desire to generally handle parsed documents using a layer of Scheme. However their basic architecture is relevant. Beyond this basic architecture certain decisions on semantics are still open to a consensus of the SGML community. <para> For the sake of simplicity, we will assume, unlike DSSSL, that the entire document is present and it is examined in an in-order traversal. We will later show these assumptions can be dropped. We will also bend the terminology of our explanation to correspond to the new vocabulary introduced through the latest versions of the DSSSL and HyTime<bibref>HyTime</bibref> standards, rather than retaining the idiosyncratic jargon of old applications. </para> <para> The fundamental difficulty of DSSSL, as has been shown, is the inability to convey information from one TE applied to one node to another TE applied to another. This information could concern a node's siblings (what number section am I?), its ancestors (how far am I indented?), or even its descendants (how big will I be?), although the first two are by far the most common. This means most of the external information a TE needs for a node is derivable before the node is visited. Were we to adopt a simple <highlight style = ital> in order</highlight> traversal (recursively visiting each node, then its children subtrees in left to right order) with a traditional programming language, a model explicitly rejected by DSSSL, communicating this information would be simple. We could create a bunch of global variables to mutate during the transform; each expression could use these to transmit information around. Where information must percolate up (i.e., from child to parent) a "fixup" stage could be used after the subtree is parsed. However this approach suffers from over-reliance on global variables, which make programs brittle. We will overcome this problem by organizing the application in a tree similar to the grove to be processed. <para> In our previous applications, transformations were grouped hierarchically and lexically scoped. At the outermost level were default TEs, or TEs for nodes whose handling would be context insensitive (i.e., not dependent on history). The transformation expressions for these TEs had three parts: <seqlist> <li><para>The first part included node-specific processing and could create encapsulated state accessible for transforming the subgrove rooted there. Encapsulated state was created using the traditional Scheme mechanisms, such as <highlight style = bold>let</highlight>, <highlight style = bold>let*</highlight>, etc., but these would then enclose statements creating the next two parts. <li><para>The second (optional) part introduced a new set of TEs which would only apply to nodes in the subtree rooted at the current node. This set of new TEs were lexically scoped within the existing set of TEs. The set did not need to cover all possible child nodes - if no query expression at the innermost level applied to a particular node, then queries in outer scopes would be successively examined, until an applicable one could be found. <li><para>The third (optional) part included node-specific post-processing which would occur after the subgrove rooted at the current node was processed. This allowed information to trickle up to a node from its subtree. </seqlist> </para> <para> This three-part structure applied recursively; the TEs introduced in the second part were, themselves, structured in the same fashion and could include nested TEs, to any depth. A subgrove was traversed by, in order: <seqlist> <li><para>executing the startup code for the current node. <li><para>creating the nested set of new TEs within the environment created by the startup code. <li><para>traversing the subtree using the lexically nested TEs. <li><para>executing the cleanup code for the current node and returning any structures created. </seqlist> </para> <para> As examples, we will show two code fragments from earlier efforts. <sgml> --instruction: <prepend><path>^1, ["INSTRUCTION"<text initstr = "-height 1 -width 30" packstr = "-fill x" textstr = "INSTRUCTIONS:">]# <append><path>^1.INSTRUCTION,<ttext># --section: <append><path>TOPFR, [<attribute>NAME: <frame initstr = "-relief raised -borderwidth 3" packstr = "-fill x">]# {--heading:<append><path>^1,["HEADING"<label>]# <set><path>^1.HEADING,<ttext># --instruction: <append><path>^1, ["INSTRUCTION"<text initstr = "-height 1">]# <append><path>^1.INSTRUCTION,<ttext>#} </sgml> <para> This first example is actually part of an SGML document. <sgml>SHORTREF</sgml>s are used to give a <quotation>programming language</quotation> look and feel. The full document is a style sheet describing how to display documents conforming to an entry form DTD as a tree of <highlight style=bital>Tk</highlight> widgets. This style sheet would be pulled over the network. In particular, their are two separate ways to format the <sgml>instruction</sgml> element depending on whether it appears inside a <sgml>section</sgml> or not. Placing the second definition lexically inside the definition for a <sgml>section</sgml> automatically ensures the proper behavior. With DSSSL we would require a query expression to determine whether or not the element's parent is a <sgml>section</sgml>. Defining the treatment of the <sgml>heading</sgml> element within <sgml>section</sgml> allows the output of the <sgml>heading</sgml> transform to be immediately attached to the output of its parent, without having to search around the output tree. <verbatim> 1 (define date-parse (lambda (node path) (lambda (cpos taglist pobjs match-vec) (let ((date-pos (node 'get-child path)) 5 (my-objs (parent-objects pobjs))) (append-tree date-pos full-date-tree class-lookup my-objs) ((my-objs 'index 0) 'set-doc-loc cpos 'tree) (let ((tags 10 (list taglist def-proc (cons 'DATE rec-proc) (cons 'MONTH (tree-loc pcdata-leaf date-pos "full-date.month")) (cons 'DAY 15 (tree-loc pcdata-leaf date-pos "full-date.day")) (cons 'YEAR (tree-loc pcdata-leaf date-pos "full-date.year")) (cons 'HOUR (let* 20 ((place (tree-loc pcdata-leaf date-pos "full-date.hour"))) (lambda (cpos taglist pobjs match-vec) (make-tree date-pos clock-time-tree class-lookup) (place cpos taglist pobjs)))) 25 (cons 'MINUTE (tree-loc pcdata-leaf date-pos "full-date.minute"))))) (parser cpos tags my-objs match-vec)))))) </verbatim> <para> This second example describes parsing a data structure in an SGML document. Unlike the first example, where the style sheet was an SGML document interpreted by a Scheme program, this program is implemented directly in Scheme. <highlight style=bold>Date-parse</highlight> is a higher-order function that takes two parameters <highlight style=bold>node</highlight> and <highlight style=bold>path</highlight> and returns a <highlight style=ital>closure</highlight> taking four parameters. This closure is a TE in the system. The first three parameters are the most significant: <seqlist> <li><para><highlight style=bold>cpos</highlight> indicates the current node in the grove to be transformed. <li><para><highlight style=bold>taglist</highlight> contains a recursive set of lexically-scoped transformation expressions. The <highlight style=bold>tags</highlight> data structure created on line 9 creates a new set of transformation expressions to be passed to be used in parsing child nodes (line 27). <li><para><highlight style=bold>pobjs</highlight> is the list of objects created by the transformation of the parent node - essentially a list of pointers to certain parts of the output grove. </seqlist> <highlight style=bold>Date-parse</highlight> places a new subgrove structure somewhere in the output grove (not necessarily directly in its parent). The parameters to the closure indicate where this structure is to go. The exact location is placed in <highlight style=bold>date-pos</highlight> (line 4) and the structure (defined elsewhere) is created and inserted in the grove on line 6. <para> By placing a child TE in a closure, the parent can pass information about the current state of the transformation to the child (or potential child - the TE might not actually match any node). This is used extensively; the function <highlight style=bold>tree-loc</highlight> is also a closure which knows how to traverse a grove. If executed it will traverse the target grove and transfer the <sgml>PCDATA</sgml> from the source node to the target node. <para> A more sophisticated example is found in transforming the <sgml>HOUR</sgml> element, which is optional in the <sgml>DATE</sgml> content model. The TE's closure stores where the target subgrove will appear if it exists, and if an <sgml>HOUR</sgml> element appears, the TE will generate the subgrove. The <sgml>MINUTE</sgml> element is also optional, but will only be executed if there <sgml>HOUR</sgml> element. <para> The call to <highlight style=bold>parser</highlight> (line 27) actually traverses the subgrove. Any code after that line would constitute post-processing on the resulting subgrove. This particular example does not require that. <para> There are a number of important aspects to this code: <seqlist> <li><para>In previous efforts the queries (rather traditionally) were based on an element's GI, so inner TEs completely eclipsed outer TEs for the same element type. Given that DSSSL has rejected the limited tag-based model in favor of general queries this can no longer be generally true, but the nesting provides a more dynamic prioritization and ordering mechanism for the TEs than DSSSL currently provides <li><para>The model implies a top-down traversal of the grove (inner TEs are only available if outer nodes have been examined). If the entire grove is present, this is not a problem. Otherwise, we will need to determine how to pull in the required nodes. <li><para>TEs are not separated syntactically from the rest of the language. For example, a TE can be wrapped inside some enclosed environment, using a <highlight style = bold>let</highlight> or <highlight style = bold>letrec</highlight>. When applying queries or transformations to nodes, this environment can be shared among these functions. During the top-down descent, TEs can be parameterized with information about the document. Information such as nesting level can be passed recursively, rather than being derived statically and repeatedly. </seqlist> <para> The principle item missing from this model are recursive constructs for TEs to allow them to affect their own future behavior in a functional style. For example, if each node in a sequence needs to know something about its predecessors, a recursive construct would allow that information to be passed on without explicitly needing to set variables. A similar facility is provided by SGML's link facility <bibref>Goldfarb</bibref>. <para> We could further push this model to make TEs first-class class objects assignable to variables, allowing even greater modularity and sophistication. However doing so could make any eventual optimizations more difficult. </para> <section><title>Supporting Concurrency and Partial Groves <para> Extending DSSSL to use this recursive architecture will support more modular and compact programs and maps better to the tree-like structure of the groves being transformed. The next step is showing how it can be implemented to better support the other goals of DSSSL, concurrent transformations and partial document parsing. </para> <para> In considering concurrent transformations and partial document parsing, it is important to note: <seqlist> <li><para>Not all elements have semantic meaning on their own. Even in a DSSSL program, certain elements cannot be the root of a unit to be transformed, as they mean nothing outside of the surrounding context. For example, a list item is unlikely to be the only part of a document needing to be transformed at a particular point in time. From a DSSSL perspective, this means the subgrove created by transforming the node cannot be a grove root (or else it would be free-standing), but must always be a subgrove. <li><para>The ability to transform a partial document is strictly limited by the locality of the transformation. If the transformation requires information from nodes not in the subgrove under consideration, then additional parts of the document must be loaded. In this case, either the query or the transformation require the presence of additional nodes, which must be loaded at some point for processing to continue. Unlike the previous case, the TE handling such a node must be able to emit a grove root. </seqlist> <para> These points are closely related. In particular, an element without individual semantic meaning should not be transformed except within the context of that part of the document which provides that meaning, so that should be brought in as well. Likewise, an element whose transformation has little locality <highlight style = ital> requires</highlight> that additional information be brought in before the transformation can be attempted. In the first case, encapsulation might, in fact better describe the semantics of the document type being transformed. Encapsulating the query would show that it is dependent on its context and would prevent its being considered as a candidate for transforming the root of a grove. (From the point of view of the compiler, however, these can simply be ignored be inferring that they can never return a grove root.) <para>The second case is more complex, as there are two possible perspectives, depending on how we treat the relationship between the node under consideration and furthest ancestor required for the TE. I suspect the second approach will prove more useful, but I mention both as viable alternatives: <seqlist> <li><para>We make no consideration of the position of the TE in the program tree, but instead locate the appropriate TE (if any) recursively. Each candidate TE appears at some level of the tree. Given TEs of equivalent priority, we start with the outermost TEs and attempt to apply them to the node. If we succeed, we are done. Otherwise, we descend down the program tree to find the next TE which could apply, retrieving as many ancestor nodes as necessary, and then attempt to apply the nested query (of course the ancestor nodes must also fulfill the queries of the outer TEs). We do this recursively until we find an applicable TE. This approach creates an implied preference for TEs with greater locality. The difficulty would be in determining the fixed point, for each nested query, of how many additional nodes to retrieve. As we are interested in changes to the environment between the desired ancestor and the current node, this might include other descendants which affect the environment. Which nodes these are can be determined by program analysis. The positive element of this approach is it requires little or no additional code to handle document fragments, and the output is the same. <li><para>We require there be a top-level TE for any fragment which might be independently parsed. The query is free to examine any node in the grove containing the fragment, but the transformation expression must know which ancestor to start with. This could either be accomplished by directly passing an ancestor node to the transform (i.e., the transform takes, as an argument an ancestor node). This node could be found by evaluating a second query against the ancestors of the current node to determine which to use, along the lines of figure ? This approach has the advantage of better exposing the control structure, but would most likely lead to additional code, and the output would not necessarily be the same as when the node was embedded within the full grove. </seqlist> </para> <section><title>Conclusion <para> We have attempted to demonstrate the power and utility of an extension to DSSSL which takes advantage of the tree structure of the documents being transformed, based on previous experience with Scheme based parsing of SGML documents. These extensions do not add any basic power to DSSSL, as it is already Turing complete. We hope that these extensions will make it easier to write modular programs transforming one SGML document type to another. We believe there are also cases where this approach will be easier to optimize than the current syntax. </para> <rear> <bibliog> <bibitem> <bibref>Goldfarb</bibref> <pub> Goldfarb, Charles F. The SGML Handbook, Oxford: Oxford University Press, 1990 <bibitem> <bibref>DSSSL96</bibref> <pub>Information Technology -- Text and Office Systems -- Document Style Semantics and Specification Language (DSSSL), ISO/IED 10179:1996, http:// </bibitem> <bibitem> <bibref>Fuchs1</bibref> <pub>Fuchs, Matthew, "Dreme: for Life in the Net", PhD Dissertation, New York University, 9/1995, http://www.cs.nyu.edu/theses </bibitem> <bibitem> <bibref>Fuchs2</bibref> <pub>Fuchs, Matthew. "The User Interface as Document: SGML and Distributed Applications." Computer Standards & Interfaces 18/1 (January 1996) 79-92 </bibitem> <bibitem> <bibref>Fuchs3</bibref> <pub>"Escaping the Event Loop: an alternative control structure for multi-threaded GUIs" in Unger, Carl and L. J. Bass, "Engineering for HCI" Chapman & Hall, 1996 </bibitem> <bibitem> <bibref>Fuchs4</bibref> <pub>"Let's Talk: Extending the Web to Support Collaboration", Proceedings of the 5th Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises (WET-ICE '96), IEEE Press, 316-321 </bibitem> <bibitem> <bibref>HyTime</bibref> <pub> </bibitem> </bibliog> </gcapaper>