[This local archive copy mirrored from the canonical site: http://www.bitstream.com/nudoc/nudoctb.html, text only; links may not have complete integrity, so use the canonical document at this URL if possible.]

NuDoc Technology Brief


Bitstream® is actively developing a technology called NuDoc(TM), an advanced document composition engine. NuDoc treats a document as an object, in the object-oriented sense of the word. Built on object-oriented technology, NuDoc is a reusable building block for document processing applications. The NuDoc SDK object classes provide an application programming interface (API) that supports the import, editing, display, and printing of electronic documents. One of NuDoc's strengths is its ability to dynamically create layout-intensive pages from separately imported content and style files.

Overview

In NuDoc, a document object is made of style, content, and page layout sub-objects. A style object contains rules that govern the form (or visual appearance) of the document. Content elements such as words, images, and movies are organized into a tagged tree structure that represents the logical organization of the information (sections, sub-sections, etc.). The W3C's Extensible Markup Language (XML) is the default content data representation.
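
To make the tagged tree idea concrete, here is a minimal sketch in Python that parses a small XML content file and walks its logical structure. It is only an illustration: the tag names (article, section, para, image) and the outline helper are invented for this example and are not taken from NuDoc or its SDK.

```python
import xml.etree.ElementTree as ET

# Hypothetical content file; the tag names are invented for illustration.
CONTENT_XML = """
<article>
  <section title="Overview">
    <para>NuDoc treats a document as an object.</para>
    <image src="logo.gif"/>
  </section>
  <section title="Layout">
    <para>Containers flex with their content.</para>
  </section>
</article>
"""

def outline(element, depth=0):
    """Return (depth, tag) pairs for the element and all of its descendants."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows

root = ET.fromstring(CONTENT_XML)
tree_outline = outline(root)

# Print the logical structure as an indented outline.
for depth, tag in tree_outline:
    print("  " * depth + tag)
```

Note that the content carries no fonts, colors, or positions; in the architecture described here, those would come from the separate style object.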

Styles are represented by a set of model objects. NuDoc uses a new style file format called Template Style Language (TSL) to represent the model objects. TSL styles describe the colors, fonts, and geometric rules that govern how structured content is formatted into its visual appearance. TSL uses a flexible container metaphor to describe how to adjust the sizes and positions of text, images, and other containers to produce a well-designed page. The result of composing the structured content against these models is called a page layout.


Dynamic Page Generation

NuDoc is well suited to the dynamic generation of pages from personalized and/or time-varying content. Examples include on-the-fly generation of PostScript or the dynamic generation of personalized Web pages. NuDoc can be used as the basis for publishing server applications that format information into a variety of output formats on demand.


Shared Content Capability

NuDoc can support multiple documents, potentially displayed simultaneously in different windows, that share some or all of their content. This sharing is achieved by allowing the content structure of each displayed document to contain references to content elements in an external content structure. Content from this shared external source is ingested into the structure of the documents being composed. Edits made to the shared content can then be extracted back out into the shared structure. This mechanism allows not just the style, but even the structure of a document to be applied dynamically.
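
The ingest and extract cycle described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not NuDoc's mechanism: the ref tag, the target and id attributes, and the ingest and extract functions are all hypothetical names chosen for the example.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical shared content store and two documents that reference it.
# The <ref> tag and the "target"/"id" attributes are invented for this sketch.
shared = ET.fromstring(
    '<shared><price id="p1">19.95</price><blurb id="b1">New this fall</blurb></shared>'
)
doc_a = ET.fromstring('<flyer><ref target="p1"/><ref target="b1"/></flyer>')
doc_b = ET.fromstring('<catalog><item><ref target="p1"/></item></catalog>')

def ingest(doc, shared_root):
    """Return a copy of doc with each <ref> replaced by a copy of its target."""
    by_id = {el.get("id"): el for el in shared_root}
    composed = copy.deepcopy(doc)
    for parent in list(composed.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == "ref":
                parent[index] = copy.deepcopy(by_id[child.get("target")])
    return composed

def extract(composed, shared_root):
    """Write the text of ingested elements back into the shared structure."""
    by_id = {el.get("id"): el for el in shared_root}
    for el in composed.iter():
        if el.get("id") in by_id:
            by_id[el.get("id")].text = el.text

view_a = ingest(doc_a, shared)
view_a.find("price").text = "17.95"   # an edit made while working in document A
extract(view_a, shared)               # push the edit back to the shared structure
view_b = ingest(doc_b, shared)        # document B picks it up when composed again
print(view_b.find("item/price").text)
```

In this sketch a second document sees the edit only when it is composed again; how and when a real implementation propagates changes to other open documents is not specified in this brief.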


Layout Model

NuDoc evolved from the composition needs of short documents with highly designed pages. These kinds of documents require arbitrary two-dimensional layout, fractional-degree rotation, run-arounds, and so on. Version 1.0 of NuDoc does not contain support for footnotes, automatic section numbers, cross-references, and other long-document features.

In NuDoc, page layouts are composed of a tree structure of layout objects. In contrast to traditional page composition systems, where containers are of a fixed width and height, NuDoc containers can have dimensions specified as "minimize" or "maximize."
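
The following Python sketch shows one plausible reading of "minimize" and "maximize" for a single dimension. The semantics are an assumption made for illustration (this brief does not define them precisely), and the Container class and resolve_width function are hypothetical, not part of the NuDoc API.

```python
from dataclasses import dataclass, field
from typing import List, Union

Size = Union[float, str]  # a fixed number, or "minimize" / "maximize"

@dataclass
class Container:
    name: str
    width: Size
    children: List["Container"] = field(default_factory=list)

def resolve_width(box, available):
    """Resolve a container's width given the width its parent offers.

    Assumed semantics (not taken from the NuDoc documentation):
      fixed number -> used as-is
      "maximize"   -> take all of the available width
      "minimize"   -> shrink to the widest resolved child
    """
    if box.width == "maximize":
        return float(available)
    if box.width == "minimize":
        return max((resolve_width(c, available) for c in box.children), default=0.0)
    return float(box.width)

# A sidebar that shrinks to fit its contents, next to a body that takes the rest.
sidebar = Container("sidebar", "minimize",
                    [Container("photo", 120), Container("caption", 90)])
body = Container("body", "maximize")

sidebar_width = resolve_width(sidebar, 500)
body_width = resolve_width(body, 500 - sidebar_width)
print(sidebar_width, body_width)
```

If the photo were replaced by a wider one, the sidebar would grow and the body would shrink accordingly, which is the kind of flexing that a fixed-size container cannot do.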

Containers can be either layout driven or content driven. In the layout-driven case, a container searches the content stream to find matching content (such as a price or a block of text) to put inside itself. In the content-driven case, tags in the content identify a model, which in turn may specify that a container (or even a structure of objects and containers) should be created. By combining and nesting layout objects with these different capabilities, pages can be constructed that know how to flex and stretch in order to remain well designed in the presence of variable content.
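
The difference between the two modes can be sketched as below. This is an illustrative approximation only: the tag names, the container names, the model table, and both functions are invented for the example and do not reflect actual TSL syntax or NuDoc calls.

```python
import xml.etree.ElementTree as ET

# Hypothetical content; tag names are invented for illustration.
content = ET.fromstring(
    "<ad><headline>Fall Sale</headline><price>19.95</price>"
    "<body>Limited stock.</body></ad>"
)

# Layout driven: the template already holds containers, and each one
# searches the content for the tag it wants to display.
template_containers = ["headline", "price"]

def fill_layout_driven(containers, content_root):
    filled = {}
    for wanted in containers:
        match = content_root.find(wanted)
        filled[wanted] = match.text if match is not None else None
    return filled

# Content driven: each tag in the content names a model, and the model
# decides which container gets created (model names are hypothetical).
models = {"headline": "banner-box", "price": "price-burst", "body": "text-column"}

def build_content_driven(content_root, model_table):
    return [(model_table[el.tag], el.text)
            for el in content_root if el.tag in model_table]

layout_result = fill_layout_driven(template_containers, content)
content_result = build_content_driven(content, models)
print(layout_result)
print(content_result)
```

In the layout-driven pass the body text is ignored because no container asks for it; in the content-driven pass every tagged element with a model produces a container, so the number of containers follows the content.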


Support for Authoring Applications

NuDoc-based authoring applications interact with NuDoc by sending messages to open, edit, format, display, or output a document. NuDoc's API of more than 300 functions provides complete access to all of the content, style, and page layout data structures and their associated properties.

NuDoc supports fully interactive authoring applications that use a direct-manipulation, WYSIWYG interface. It provides "hit detection" methods, editing methods (e.g., insert character, move image), and incremental formatting and redisplay. It can be used either as the text formatting component inside a larger application or as the framework for an entire application.
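
As a rough idea of what "hit detection" involves, the sketch below finds the deepest layout object under a mouse position in a tree of axis-aligned rectangles. It is a generic technique shown under simplifying assumptions (absolute page coordinates, no rotation, later children drawn on top); the LayoutObject class and hit_test function are hypothetical and are not NuDoc's API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class LayoutObject:
    name: str
    x: float
    y: float
    width: float
    height: float
    children: List["LayoutObject"] = field(default_factory=list)

    def contains(self, px, py):
        """True if the point lies inside this object's rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)

def hit_test(obj, px, py) -> Optional[LayoutObject]:
    """Return the deepest object under (px, py), preferring later children."""
    if not obj.contains(px, py):
        return None
    for child in reversed(obj.children):
        hit = hit_test(child, px, py)
        if hit is not None:
            return hit
    return obj

# A page holding a text column, which in turn holds an image.
image = LayoutObject("image", 50, 50, 100, 80)
column = LayoutObject("column", 36, 36, 300, 700, [image])
page = LayoutObject("page", 0, 0, 612, 792, [column])

print(hit_test(page, 60, 60).name)    # a point inside the image
print(hit_test(page, 40, 400).name)   # a point in the column, outside the image
```

An application would typically call something like this from its mouse handler and then pass the result to an editing method such as a move or a selection.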


Authoring Application Architecture

Shown below is a typical architecture of a NuDoc-based application. The application is responsible for providing user-interface and application-specific logic. The memory state of the document is managed and modified by NuDoc. NuDoc renders resulting pages into one or more of the application-defined windows.

[Figure: NuDoc process chart]

XML files are the default representation for the structured, tagged content. NuDoc reads and writes XML content files during the authoring process. NuDoc's TSL files are used to define the style sheets. For rapid saving and restoring of the entire document object's state, NuDoc can save to disk (and re-read) the post-composition document containing the style, content, and resulting page layouts (including all user edits and mark-up) to an external XML-format checkpoint file. Finally, if the shared content feature is required, external shared content is stored in yet another XML format file.



NuDoc 1.0 Feature Set (Available Q1 '98)

The NuDoc API supports importing content, defining and importing styles, navigation and editing of all sub-objects of the document, and incremental screen display and output.


State-of-the-Art Functionality
  • Compatible with industry standard XML content
  • Fast, incremental WYSIWYG display
  • Desktop Publishing functionality including sophisticated text composition
  • Platform Neutral: NuDoc runs on Windows NT, Unix, and MacOS

Template-Based Styles
  • TSL templates are built up from model definitions that generalize the traditionally text-based concept of styles and apply it to all types of page elements
  • Named model styles at page, shape, paragraph, and character level
  • Over 100 attributes that control all aspects of composition and appearance
  • Models can inherit and override other models to an arbitrary depth
  • Aggregate models can contain named variable parts to support variable data publishing
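
Inheritance with overriding to an arbitrary depth can be pictured with the following sketch, in which each model names a parent and supplies only the attributes it changes. The model names, attribute keys, and the resolve function are invented for illustration and do not represent TSL syntax or the NuDoc API.

```python
# Hypothetical model table: names, attribute keys, and values are invented.
MODELS = {
    "base-text":    {"parent": None,        "attrs": {"font": "Serif", "size": 10, "color": "black"}},
    "heading":      {"parent": "base-text", "attrs": {"size": 18, "weight": "bold"}},
    "sale-heading": {"parent": "heading",   "attrs": {"color": "red"}},
}

def resolve(model_name, models):
    """Merge attributes from the root ancestor down, so nearer models override."""
    chain = []
    seen = set()
    name = model_name
    while name is not None:
        if name in seen:
            raise ValueError(f"inheritance cycle at {name!r}")
        seen.add(name)
        chain.append(models[name])
        name = models[name]["parent"]
    resolved = {}
    for model in reversed(chain):   # root first, most specific model last
        resolved.update(model["attrs"])
    return resolved

sale_heading = resolve("sale-heading", MODELS)
print(sale_heading)
```

Here the sale heading keeps the font from the base model, takes its size and weight from the heading model, and overrides only the color itself.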

High Level API
  • Object-oriented C++ API with 12 interface classes
  • Full navigation of all document data structures
  • Full editability of all in-memory content, model, and page layout objects

Documents
  • Open multiple documents simultaneously
  • Create multiple documents with different styles from the same content
  • Combine content with template-based styles to create documents automatically
  • Share content across multiple, simultaneous documents

Selection and Editing
  • Geometric selection of objects and ranges
  • Automatic highlighting of selections
  • Cut, copy, and paste object and range selections
  • Navigate by characters, words, lines, or paragraphs
  • Move, stretch, rotate, layer, align, or group Layout objects
  • Sophisticated control over how page elements "snap" to one another

Style/Model Services
  • Apply and re-apply simple or complex aggregate models
  • All models are fully editable via the API

Viewing
  • Arbitrary zoom, pan, and scroll
  • Automatic redisplay of damaged areas
  • Multiple views on the same document

Page Elements
  • Text containers
  • Images (most formats)
  • Boxes
  • Ellipses

Paragraph-level Features
  • Arbitrary line spacing (leading), fixed or flexible
  • Extra space before or after a paragraph
  • Paragraph alignment (right, left, center, justify)
  • Hyphenation
  • Left indent, right indent, first line indent
  • Parameterized justification (word and letter spacing control)
  • Tabs

Character-level Features
  • Any font, color, point size
  • Bold, italic, underline
  • "Set size" (stretch or condense characters)
  • Support for TrueDoc portable fonts
  • Baseline offset (raise or lower text from baseline)
  • Pair kerning, manual kerning
  • Special spaces (em, en, thin, non-breaking, fill)
  • Initial drop cap

Shapes
  • Placement can be fixed inside another shape, floating in a text stream, or relative to other shapes in a shape sequence
  • Borders (any width and color)
  • Fill with solid color or tiled image
  • External margins
  • Rotate any shape
  • Dashed/dotted borders

Shapes as Text Containers
  • Nested containers, both fixed and floating
  • Text flows between shapes (e.g., multiple columns)
  • Internal margins (left, right, top, bottom)
  • Text can be set to avoid other shapes
  • Text can flow into circles or polygons
  • Container dimensions can automatically expand or shrink to the dimensions necessary to contain sub-objects
  • Sophisticated vertical justification

Output
  • CMYK PostScript
  • PDF

A NuDoc-based Application

PageFlex is a new product currently under development. Its focus will be as a variable data front-end solution for driving digital presses. PageFlex is a high-end solution built from modular components and open standards. It is the first solution to use XML as the intermediate data format between databases and the page composition process. The output formatter is based on Bitstream's revolutionary NuDoc page composition engine. NuDoc offers unprecedented control over the graphic design of page templates while maintaining a strict separation of form from the input XML content. We expect to begin beta testing of PageFlex in Q3 '98.

For more information about NuDoc, send e-mail to Paul Trevithick, or contact Bitstream using one of the telephone numbers given below.



E-mail: info@bitstream.com

Phone
  Worldwide: 617-497-6222
  In the U.S. and Canada: 800-522-3668
  In Europe: +31 20 5200 300

Bitstream Inc., 215 First Street, Cambridge, MA 02142 U.S.A.
©1997 Bitstream Inc. All rights reserved.