                               September 22-26, 1996

Conference on electronic publishing and document manipulation

Principles of document processing

PODP96 and EP96 comprise a five-day technical gathering focused on recent
progress in electronic documents. PODP96, principles of document processing,
is a workshop devoted to examining the confluence of document processing and
computer science. EP96 continues the tradition of a general conference
devoted to electronic documents. Tutorials are offered and may be attended
by any participant.


Workshop on the principles of document processing
Monday, September 23

PODP96 will be the third in a series of international workshops organized to
promote the modeling of document processing systems using theories and tools
from computer science, mathematics, etc.  Areas of document processing
presented in the first workshop in Washington, DC were document formatting,
document conversion, document representation, document recognition, document
retrieval, and hypertext, among others. The second workshop in Darmstadt,
Germany concentrated on document databases. For more information please
contact Prof. Charles Nicholas (


Electronic publishing and document manipulation
Tuesday, September 24 - Thursday, September 26

This conference will be the sixth in a series of international conferences
organized to promote the exchange of novel ideas in the area of computer
document manipulation.  The first two conferences in the series, EP86 held
in Nottingham, England, and EP88 in Nice, France, concentrated mainly on the
specific aspects of electronic document production, from composition to
printing.  EP90, which took place in Washington, DC, adopted a broader
definition of the term Computer Assisted Publication, and accordingly,
extended its range of topics to include hypertext and hypermedia systems,
document recognition and analysis, and application of database techniques to
document handling.  EP92, held in Lausanne, Switzerland, confirmed the trend
for documents to affect more and more areas in computer science. EP94, held
in Darmstadt, Germany, focused explicitly on document representation,
transformation, management and interpretation. EP96 follows this trend.


Sunday 22:     pm  3:00-6:00   Pre-Conference Registration (at Dinah's
                               Garden Hotel)
Monday 23:     am  8:30-12:30  EP96 Tutorials 1 and 2/PODP96
               pm  2:00-6:00   EP96 Tutorials 1 and 3/PODP96
Tuesday 24:    am  9:00-10:00  Invited Talk
                   10:00-10:30 Coffee Break
                   10:30-12:00 Session A: Structured Documents I
               pm  12:00-2:00  Lunch Break
                   2:00-3:00   Session B: Multimedia and Typecases
                   3:00-3:30   Coffee Break
                   3:30-5:00   Session C: Presentation and Representation
                   6:00-8:00   Reception
Wednesday 25:  am  9:00-10:00  Invited Talk
                   10:00-10:30 Coffee Break
                   10:30-12:00 Session D: Structured Documents II
               pm  12:00-2:00  Lunch Break
                   2:00-3:00   Session E: Document Analysis and Compression
                   3:00-3:30   Coffee Break
                   3:30-6:00   Exhibitions
                   6:30-10:00  Banquet
Thursday 26:   am  9:00-10:00  Invited Talk
                   10:00-10:30 Coffee Break
                   10:30-12:00 Session F: Interfaces


Tutorial 1: Information Modeling Approaches to Meet New Publishing Demands
(2 parts) -- Lorraine Stanford

The advent of new publishing paradigms impacts on how information is created
and captured in today's organizations.  The need to create information once and
reuse it many times grows as stakeholders grapple with the rising costs of
repurposing and republishing data.

Historical approaches to information management and modeling impact on an
enterprise's ability to meet these new publishing demands.  Traditional
requirements of paper publishing are changing to incorporate electronic
publishing methods including electronic books and the World Wide Web. Without
an ability to repurpose existing information into new uses, these new demands
cannot be satisfied economically, efficiently or expediently.  As organizations
move towards structured information creation and management, choosing which new
approach to adopt becomes a critical decision.

The use of the HyperText Markup Language (HTML) for the Internet specifically,
or the Standard Generalized Markup Language (SGML) for electronic documents in
general, provides mechanisms for reducing the costs of publishing information
in multiple ways.  The appropriate role for HTML in the Document Development
Life Cycle is as a publishing medium and not as a content-capturing medium.  It
is inflexible to lock one's information into
a single presentation medium, which is how the World Wide Web should be

This full-day tutorial gives an overview of how an organization's information,
once structured using content-oriented markup languages defined in SGML, can be
repurposed in different domains as required.  From these information structures,
presentation-oriented paper deliverables can be created to meet traditional
publishing requirements. HTML, a presentation-oriented electronic markup
language, can also be one of many outputs for the information, that being
appropriate for global access in public or private Webs.
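
As a minimal illustration of the single-source idea described above (it is not
material from the tutorial itself), the sketch below keeps information in a
content-oriented structure and renders it to two presentation-oriented outputs.
The element names (manual, part, name, task), the sample data and both renderer
functions are hypothetical. Python's standard XML parser stands in for an SGML
toolchain, which is an assumption made only to keep the example runnable.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content-oriented source: the tags say what the data is,
# not how it should look on a page or screen.
SOURCE = """
<manual>
  <part id="P-100">
    <name>Fuel pump</name>
    <task>Inspect seals &amp; hoses</task>
  </part>
  <part id="P-200">
    <name>Air filter</name>
    <task>Replace element</task>
  </part>
</manual>
"""

def to_html(root):
    """Render the parts as an HTML list (one possible Web output)."""
    items = []
    for part in root.findall("part"):
        name = escape(part.findtext("name"))
        task = escape(part.findtext("task"))
        items.append(f"<li><b>{name}</b>: {task}</li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"

def to_text(root):
    """Render the same parts as plain text (one possible paper-style output)."""
    lines = []
    for part in root.findall("part"):
        lines.append(f"{part.get('id')}  {part.findtext('name').upper()}")
        lines.append(f"    {part.findtext('task')}")
    return "\n".join(lines)

root = ET.fromstring(SOURCE)
html_out = to_html(root)
text_out = to_text(root)
print(html_out)
print(text_out)
```

Both outputs are derived from the same stored structure, so a change to the
source appears in every deliverable without editing any presentation markup.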

The tutorial includes a detailed case history describing the architecture of
the technical manual production system of a supplier of military equipment to a
Canadian Department of National Defence (DND) Project Office.  The Canadian DND
CALS Office Engineering and Technical Information Model is the structure of the
main store for information used in the production of technical manuals. The
nature of the structure of the information model is content-oriented, based on
the physical equipment breakdown structure (EBS).  The nature of the structure
of the technical manuals is presentation-oriented, based on a book paradigm.

The tutorial also includes a live example of an SGML Application for publishing
in three outputs: one Web style and two paper styles, each meeting a different
purpose for the information.

To complete the tutorial, pointers to publicly-maintained SGML resource
catalogues are reviewed.

Lorraine Stanford is a Senior Consultant with Microstar Software Ltd. Her main
responsibility is the development and delivery of Microstar's training
programme, which includes general SGML courses, as well as instruction in the
use of Microstar's software tools and the company's unique, practical approach
to implementing the Document Development Life Cycle. Ms. Stanford is also an
active member of Microstar's Professional Services Group, which enables her to
bring hands-on experience to the classroom. She joined Microstar in early
1994 following 14 years' experience providing Informatics consulting
services. Ms. Stanford holds Bachelor of Arts and Bachelor of Education
degrees from Queen's University at Kingston, Ontario.

Tutorial 2: Colour document display and reproduction -- Roger Hersch,
       Victor Ostromoukhov, and Tim Kohler.

Color is an integral part of today's electronic documents. However, faithful
color reproduction on a variety of display and printing devices involves
many aspects, such as device calibration, color halftoning and gamut
mapping. The goal of this tutorial is to present
the scientific basis of colorimetry, the colorimetric behavior of scanners,
displays and printers, categories of halftoning algorithms specifically
conceived for color displays as well as current color management standards.

1. The basics of colorimetry
   From color matching experiments to the CIE-XYZ system
   Roger D. Hersch, Swiss Federal Institute of Technology, Lausanne

2. The colorimetric behavior of scanners, displays and printers
   Roger D. Hersch, Swiss Federal Institute of Technology, Lausanne

3. Halftoning for color displays
   (creation of color tables, dithering for display devices,
    halftoning in color space)
   V. Ostromoukhov, Swiss Federal Institute of Technology, Lausanne

4. The color management standards (ColorSync, ICC)
   Tim Kohler, Canon Information Systems
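
For readers unfamiliar with the dithering mentioned in part 3, the following is
a minimal sketch of ordered dithering with a 2x2 Bayer threshold matrix, shown
on a single intensity channel. It is a textbook technique chosen purely for
illustration and is not necessarily one of the algorithms presented in the
tutorial. The function name and the test patch are assumptions.

```python
# 2x2 Bayer index matrix; each index i maps to the threshold (i + 0.5) / 4.
BAYER_2X2 = [[0, 2],
             [3, 1]]

def ordered_dither(gray):
    """Map a 2-D list of intensities in [0, 1] to a 0/1 bitmap."""
    out = []
    for y, row in enumerate(gray):
        out_row = []
        for x, value in enumerate(row):
            # The threshold depends only on the pixel position, tiled 2x2.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) / 4.0
            out_row.append(1 if value > threshold else 0)
        out.append(out_row)
    return out

# Hypothetical 4x4 test patch of uniform 50% gray.
patch = [[0.5] * 4 for _ in range(4)]
bitmap = ordered_dither(patch)
for row in bitmap:
    print("".join("#" if bit else "." for bit in row))
```

On the uniform 50% patch, half of the pixels are switched on in a checkerboard
pattern, so the average intensity of the bitmap matches that of the input.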


The conversion of documents into electronic form has proved
more difficult than anticipated. Document image analysis
still accounts for only a small fraction of the rapidly-
expanding document imaging market. Nevertheless, the
optimism manifested over the last thirty years has not
dissipated. There is increased emphasis on large-scale,
automated comparative evaluation, using laboriously compiled
test databases. The cost of generating these databases has
stimulated new research on synthetic noise models. Driven
partly by document distribution on CD-ROM and via the World
Wide Web, there is more interest in the preservation of
layout and format attributes to increase searchability and legibility
(sometimes called "page reconstruction") rather than just
text/non-text separation. At the same time, the requirements
of downstream software, such as word processing, information
retrieval and computer-aided design applications, favor
turning the results of the analysis and recognition into
some standard computer format. The realization that accurate
document image analysis requires fairly specific pre-stored
information has resulted in the investigation of new data
structures for knowledge bases and for the representation of
the results of partial analysis. Progress is reported on
documents - primarily office forms - containing a mix of
handprinted, handwritten and printed material, and research
on stylus-based data entry is spurred by the popularity of
notepad computers. Other active topics include image, text-image,
and text compression; map and line-drawing conversion; half-tone
and color processing; and text-entry for digital libraries.

EP96 The Conference Organization

Program committee chair: Anne Brüggemann-Klein (Technische Universität
München, Germany)

Conference chair:  Allen L. Brown, Jr. (Xerox Corporation, Palo Alto, USA)

Program committee:

Jacques André          INRIA/IRISA, Rennes, France
Charles Bigelow        Stanford University, USA
David F. Brailsford    University of Nottingham, UK
Allen L. Brown, Jr.    Xerox Corporation, Palo Alto, USA
Heather Brown          University of Kent, Canterbury, UK
Anne Brüggemann-Klein  Technische Universität München, Germany
Giovanni Coray         Swiss Federal Institute of Technology, Lausanne, Switzerland
Anton Eliëns           Free University, Amsterdam, The Netherlands
An Feng                Xerox Corporation, Palo Alto, USA
Hans-Peter Frei        UBILAB, Union Bank of Switzerland, Zürich, Switzerland
Richard Furuta         Texas A&M University, USA
Roger D. Hersch        Swiss Federal Institute of Technology, Lausanne, Switzerland
Christoph Hüser        GMD-IPSI, Darmstadt, Germany
Rolf Ingold            University of Fribourg, Switzerland
Brian Kernighan        AT&T Bell Laboratories, Murray Hill, USA
Peter King             University of Manitoba, Canada
Dario Lucarella        CRA-ENEL, Milan, Italy
Pierre MacKay          University of Washington, USA
Robert A. Morris       University of Massachusetts, Boston, USA
Makoto Murata          Fuji Xerox Information Systems, Kawasaki, Japan
Marc Nanard            CRIM, Montpellier, France
Vincent Quint          INRIA/IMAG, Grenoble, France
Richard Rubinstein     Marcam Corporation, USA
Christine Vanoirbeek   Swiss Federal Institute of Technology, Lausanne, Switzerland
Hans van Vliet         Free University, Amsterdam, The Netherlands
Wang Xuan              Peking University, Beijing, China


Tuesday 24:    am  9:00-10:00  Invited Talk: Hal R. Varian (Professor
                                 and Dean of the School of Information
                                 Management and Systems at the University
                                 of California at Berkeley)
                   10:30-12:00 Session A: Structured Documents I
                               Web Applications and SGML
                                 by JACCO VAN OSSENBRUGGEN, ANTON ELIËNS
                                    AND BASTIAAN SCHÖNHAGE
                               SGML/HyTime Repositories and the
                                 Object Paradigm
                                 by PATRICIA FRANCOIS, PHILIPPE FUTTERSACK AND
                                    CHRISTOPHE ESPERT
                               Typographic Sheets and Structured Documents
                                 by HÉLÈNE RICHY AND JACQUES ANDRÉ
               pm  2:00-3:00   Session B: Multimedia and Typecases
                               Modelling Multimedia Documents
                                 by PETER R. KING
                               The Traditional Arabic Typecase Extended
                                 to the Unicode Set of Glyphs
                                 by YANNIS HARALAMBOUS
                   3:30-5:00   Session C: Presentation and Representation
                               A New Presentation Language for
                                 Structured Documents
                                 by ETHAN V. MUNSON
                               Pagination Reconsidered
                                 by ANNE BRÜGGEMANN-KLEIN, ROLF KLEIN AND STEFAN WOHLFEIL
                               Towards Structured, Block-Based PDF
                                 by PHILIP N. SMITH AND DAVID F. BRAILSFORD
Wednesday 25:  am  9:00-10:00  Invited Talk: Peter Hibbard (Principal
                                 Scientist, Adobe Systems, Inc.)
                   10:30-12:00 Session D: Structured Documents II
                               XTABLE---A Tabular Editor and Formatter
                                 by XINXIN WANG AND DERICK WOOD
                               Filtering Structured Documents in the
                                 SYNDOC Environment
                                 by E. KUIKKA AND A. SALMINEN
                               Automatic Generation of SGML Content Models
                                 by HELENA AHONEN
               pm  2:00-3:00   Session E: Document Analysis and Compression
                               Document Analysis of PDF Files: Methods,
                                 Results and Implications
                                 by WILLIAM S. LOVEGROVE AND DAVID F. BRAILSFORD
                               A Pattern-Based Lossy Compression Scheme
                                 for Document Images
                                 by QIN ZHANG AND JOHN M. DANSKIN
                   3:30-6:00   Exhibitions
Thursday 26:   am  9:00-10:00  Invited Talk: Bryan L. Bell (Strategic
                                 Technologist, Frank Russell Company)
                   10:30-12:00 Session F: Interfaces
                               Using Documents as Interfaces to
                                 Information Systems
                                 by VIJAY KUMAR, RICHARD FURUTA AND ROBERT B. ALLEN
                               Retrieval from Facet Spaces
                                 by ROBERT B. ALLEN
                               The Stick-e Document: A Framework
                                 for Creating Context-Aware Applications
                                 by P. J. BROWN

General Information

Proceedings of the EP96 conference will be published by Wiley as a special
issue of the journal EP-ODD and will be available in preprint form at the
conference and in final form subsequently.  The PODP96 workshop proceedings
will be given to each participant at the beginning of the workshop.

Location: The conference will be held principally on the campus of the Xerox
Palo Alto Research Center (PARC) in Palo Alto, California. Palo Alto is
south of the San Francisco International Airport and can be reached in
approximately thirty minutes by automobile or any of several shuttle bus
services available at the airport.

Accommodation:  Rooms have been reserved at an attractive rate for
conference attendees at Dinah's Garden Hotel in Palo Alto. The rooms range
in price from $100 to $120 per night. That rate will be provided to
conference attendees beginning the night of September 21 and extending
through the night of September 28. Dinah's is approximately three kilometers
from the conference site. While it is a pleasant forty-minute walk from
Dinah's to PARC, a shuttle bus service will be provided for the convenience
of the conference attendees. To reserve accommodations at Dinah's, attendees
should contact:

    Mr. Bill Lyons
    Director of Sales
    Dinah's Garden Hotel
    4261 El Camino Real
    Palo Alto, CA 94306
    +1 415 493-2844 (tel)
    +1 415 856-8904 (fax)
    +1 415 856-4713 (fax)

Sponsors: The conference is sponsored by the Xerox Corporation,
Adobe Systems, Inc., the School of Information Management and
Systems of the University of California at Berkeley, and INRIA.

Insurance:  The organizers cannot be held liable to conference participants
for injury, damage or loss of their personal property.  It is suggested that
participants make their own insurance arrangements.

English:  English is the official language of the conference, tutorials and
workshop.

Secretarial support including fax and phone will be available during the
whole conference period.

Registration:  Please make your (binding) conference reservation by sending
the completed registration form with payment to the conference secretariat.
Confirmation will be given after receipt of the registration form.  A
limited number of students may attend at a reduced fee; student applicants
should enclose a copy of their student card.  Fees for the
conferences, workshops and tutorials include proceedings, the reception,
coffee breaks, lunches, and the banquet dinner.

Payment:  Payment should be made in United States dollars, by check payable
to the Xerox Corporation or by credit card (Visa and Mastercharge are accepted).

Cancellation:  Fees will be returned in full for any written cancellation
postmarked before September 1, 1996.  No refunds will be made after this date.

Registration forms should be sent to:  

                        EP96 Conference Secretariat
                        Mrs. P.A. Gretz
                        Xerox Corporation
                        3400 Hillview Avenue PAHV-127
                        Palo Alto, California 94304
                        fax: +1 415 813-7499

EP96 Registration Form

NAME: (please write last name first) _____________________________________

AFFILIATION: _____________________________________________________________

E-MAIL ADDRESS: __________________________________________________________

STREET ADDRESS: __________________________________________________________

TOWN/POSTAL CODE: __________________________  COUNTRY: ___________________

TELEPHONE: _________________________________  FAX: _______________________


Conference fees (excluding hotel room):

                Before September 1      After September 1

                _______ $350            _______ $425

(student price) _______ $200            _______ $225

Tutorial fees (excluding hotel room):

               Before September 1       After September 1

Tutorial 1:    _______ $400             _______ $500

   (student)   _______ $100             _______ $130

Tutorial 2:    _______ $200             _______ $250

   (student)   _______ $50              _______ $65

Tutorial 3:    _______ $200             _______ $250

   (student)   _______ $50              _______ $65

Attendees' spouses and companions are welcome to attend the banquet event:

               _______ $30

Total amount due:

               _______ $


I am paying by: _____check _____VISA _____Mastercharge

credit card number_________________________________ exp. date_____________

Total amount due: ________________________________________________________

Date: ____________________________________________________________________

Signature: _______________________________________________________________

*An attendee who wishes to pay by credit card may return a completed and
signed copy of the above registration form to the secretariat via fax or post.
Registration forms should be sent to:   

                        EP96 Conference Secretariat
                        Mrs. P.A. Gretz
                        Xerox Corporation
                        3400 Hillview Avenue PAHV-127
                        Palo Alto, California 94304
                        Tel +1 415 813-7003
                        fax +1 415 813-7499