JUMBO and CML1.2 - Updates

Date: Thu, 06 Nov 1997 21:28:37
To: xml-dev@ic.ac.uk
From: Peter Murray-Rust <peter@ursus.demon.co.uk>
Subject: JUMBO and CML1.2


The latest snapshot of JUMBO (Java Universal (Molecular | Markup) Browser
for Objects) and CML1.2 (Chemical Markup Language) is available at:


JUMBO and CML are independent, but co-evolve. At present they don't have
versions, but have deadline-determined snapshots. The latest was prepared for
distribution on CDROM for the Royal Society of Chemistry as part of Henry
Rzepa's electronic conferences. (Henry and I are convinced that XML is the
right way for technical publishing to go - evangelism from members of this
list is taken for granted :-).

The distribution includes:
	- a large number of CML examples
	- copious (if occasionally diffuse) HTML commentaries, tutorials, etc
	- JUMBO, aligned to act as an applet under a Java-enabled browser. I
have not tested it with MSIE4, but it works under NS4 (much better than 3). It
also works standalone with a Java interpreter. The applets *can* be viewed
over the net but downloads may take time.  Downloading the *.zip/tar.gz is
recommended. I *think* there are two up-to-date mirrors. (Anyone prepared
to mirror this distribution? :-)

JUMBO is NOT just for molecules, but reads PLAY and has (with some effort)
eaten its way through Macbeth. There are also examples of taxonomies and a
catalog of engineering materials data. The showpiece is a complete
scientific paper marked up with HTML, images, molecules, spectra,
crystallography, bibliography and XML:links. 

JUMBO has been tracking the recent specs and now has proof of concept for:
	- interoperating with Lark and NXP (but not yet Xapi-J - has any parser
writer yet implemented this?)
	- recognising namespaces, and linking to schema files
	- using the schema files for adding machine-readable semantics (Java
classes) on a per-element basis (thus <MOL> loads MOL.class at run-time).
	- displaying and editing:
		tree hierarchy
	- displaying (but not editing) mixed content as HTML. 
	- implementing almost all XML:link="SIMPLE". ('EMBED' is difficult for
tree-based display at present.)
	- proof of concept (pre-XML:link) of EXTENDED links
	- saving files in standalone mode
	- Almost full implementation of TEI Xpointers (what does 'SPAN' mean in a
	- implementation of proof-of-concept for resolution of semantics by
linking to Virtual HyperGlossaries. (Soon to be drastically modified for
the better with XML:link and XSL.)
	- on-the-fly conversion of legacy files into trees (and hence to XML).
About 15 legacy types from molecular science are covered. Others can be
hacked  (if the legacy files are easy to read :-) 
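
To make the per-element loading above concrete, here is a minimal sketch in
present-day Java (not JDK1.02, and not JUMBO's actual code). It assumes the
class name is simply a prefix plus the element name; in JUMBO the mapping
comes from the schema files. ElementNode, ElementFactory and classPrefix are
hypothetical names invented for this illustration.

```java
/** Generic node, used when no element-specific class can be found. */
class ElementNode {
    protected final String tagName;

    public ElementNode(String tagName) {
        this.tagName = tagName;
    }

    public String describe() {
        return "generic element <" + tagName + ">";
    }
}

/** Hypothetical element-specific class, located at run time for <MOL>. */
class MOL extends ElementNode {
    public MOL() {
        super("MOL");
    }

    @Override
    public String describe() {
        return "molecule element";
    }
}

/** Maps an element name to a class by name and instantiates it reflectively. */
class ElementFactory {
    // Assumption: classes live under one prefix (e.g. a package name plus ".").
    private final String classPrefix;

    ElementFactory(String classPrefix) {
        this.classPrefix = classPrefix;
    }

    ElementNode create(String tagName) {
        try {
            Class<?> cls = Class.forName(classPrefix + tagName);
            if (ElementNode.class.isAssignableFrom(cls)) {
                // Requires a no-argument constructor on the element class.
                return (ElementNode) cls.getDeclaredConstructor().newInstance();
            }
        } catch (ReflectiveOperationException e) {
            // No usable class for this tag: fall back to the generic node.
        }
        return new ElementNode(tagName);
    }
}
```

With the classes in the default package, new ElementFactory("").create("MOL")
yields a MOL instance, while an element with no matching class (PLAY, say)
gets the generic node. The fallback is what lets a browser display documents
whose elements have no semantics attached.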

The later snapshot (not yet distributed) includes more use of schema files
and first steps in XSL. 

[Note: JUMBO is still JDK1.02 - I was waiting for full browser support.
Some windows do not always display gracefully and the scrolling is
horrible. I am not alone in these problems :-). JUMBO is slow for large
documents because (a) it creates fully subclassed objects for each node at
display time and (b) some of these objects have many data members. One
large todo is to devise a lazier model for processing and display of nodes.]
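
One possible shape for such a lazier model, sketched in present-day Java
(JDK1.02 has no generics or lambdas, so this is illustrative only and not
JUMBO code): each node keeps just its tag name and builds its expensive
display object the first time it is asked for. Lazy, DocNode and builtCount
are hypothetical names, and the string stands in for a real display object.

```java
import java.util.function.Supplier;

/** Holds a value that is built on first use and then cached. */
class Lazy<T> {
    private final Supplier<T> factory;
    private T value;
    private boolean built;

    Lazy(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        if (!built) {
            value = factory.get();
            built = true;
        }
        return value;
    }
}

/** A tree node whose display object is created only when requested. */
class DocNode {
    // Counts how many display objects were actually built (for illustration).
    static int builtCount = 0;

    final String tagName;
    private final Lazy<String> display;

    DocNode(String tagName) {
        this.tagName = tagName;
        this.display = new Lazy<>(() -> {
            builtCount++;
            return "[" + tagName + " display]";
        });
    }

    String display() {
        return display.get();
    }
}
```

Creating a DocNode costs almost nothing; the display object is built on the
first call to display() and reused afterwards, so nodes that are never shown
never pay for it. This sketch is not thread-safe.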

CML is a fully XML-compliant application with a minimal tagset designed for
maximum flexibility in prototyping molecular applications. It includes
generic support for technical data (not just chemistry) especially numeric
quantities with SI units ('metric') and others. Those in technical
disciplines may find it useful.

Several people have asked about the future of JUMBO and some have offered
to contribute :-). Ideally I would like JUMBO to evolve along the lines of
TeX/LaTeX or tcl/tk. The basis of these is a tightly controlled core with
extensions supplied by volunteers.  The results are freely available, but
not public domain (i.e. a GNU-like or slightly more restrictive license). I
was very impressed with the way that the tcl/tk project ran - equalled only
in my experience by the XML decision-making process.  

Because JUMBO tries to track the XML standards and because it is critical
not to have mutant implementations I am minded not to release source code
except to those actively involved in the development of the core.  With
Java this is an attractive option, as the compiled classes run anywhere (JUMBO
is 100% pure). Moreover the discipline of writing for extensibility (i.e.
through interfaces and subclasses) is an extremely good one for both the
developers and the extenders. 

Note that the key aims of JUMBO do NOT compete with what other members of
this list are doing. JUMBO currently has goals like:
	- provide a sound core for building prototypes
	- be developed for pedagogy rather than performance
	- act as a demonstrator for the XML effort
	- help to explore problems in drafts of standards at an early stage.

It will not compete either with commercial browser/editor/transformers
(which are optimised for additional criteria such as performance,
interoperability with legacy systems, etc.) or with Amaya (the W3C's
reference browser).  Hopefully JUMBO will interoperate with all of these so
that extensions developed for JUMBO are transportable to more efficient

I'd be grateful for feedback on these ideas. If there is interest, please
let me know - it may take a little while to pull things together (Warning:
some of the code reflects the evolution of the specs, some my 'learning
curve' :-). If not, JUMBO will plod ahead when I have a few midnights free
and my laptop works. 

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms
Virtual Hyperglossary http://www.venus.co.uk/vhg

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)