[This local archive copy mirrored from the canonical site: http://www.irisa.fr/ep98/tutorials.html; links may not have complete integrity, so use the canonical document at this URL if possible.]

EP98
EP98 TUTORIALS
Tutorials of the EP98 Conference
Saint-Malo, 1-4 April, 1998



Tutorial EP1 Introduction to SGML and XML
Tutorial EP2 Advanced XML
Tutorials EP3 and EP4 DSSSL & XSL
Tutorial EP5 Encoding Information for Interchange: An Introduction to the TEI
Tutorial EP6 Mapping Websites and Creating Site Maps
Tutorial EP7 Website Information Architecture
Tutorial EP8 Document Image and Content Analysis
Tutorial EP9 From Lead to Pixel: 1) Punch Cutting
Tutorial EP10 From Lead to Pixel: 2) Computer-Aided Type Design

Tutorial EP1:
Introduction to XML

Date
Tuesday 31 March, 1998, morning.
Title
Introduction to XML
Author
Murray Maloney
Muzmo Communication Inc., Pickering, Ontario, Canada
Murray Maloney is an independent consultant and the proprietor of Muzmo Communication Inc. He is a founding member of numerous W3C Working Groups, including XML and XLL, CSS, HTML, DOM, WAI, and Metadata. Murray is co-author, with Yuri Rubinsky, of "SGML on the Web: Small Steps Beyond HTML". Prior to opening his own consultancy, Murray was Technical Director at SoftQuad where he had responsibility for HoTMetaL and Panorama, and Publishing Systems Architect with the Santa Cruz Operation (SCO) where he helped design the world's first HTML-based online documentation and context-sensitive help system. He is a technical consultant to the Yuri Rubinsky Insight Foundation, a member of the International World Wide Web Conference Committee, and co-chair of the 8th International WWW Conference (Toronto, 1999).
Contents
Extensible Markup Language (XML) is emerging as the strategic technology for building structured information applications. In this half-day tutorial, instructor Murray Maloney will explain why XML is such an important advancement and how it is being used in the development of small- and large-scale structured information applications across manufacturing, telecommunications, computer software and other industries. The course includes an overview of the XML specification, a history of its development, and an update on the most recent news from the XML Developer's Conference (Seattle, March 23-26).
Registration
Registration fees are 660 FRF.
Registration must be made with the EP98 secretariat, preferably using the EP98 registration form.

Tutorial EP2:
XML: Up close and personal

Date
Wednesday 1 April, 1998, morning
Title
XML: Up close and personal
Author
Murray Maloney
Muzmo Communication Inc., Pickering, Ontario, Canada
See Tutorial EP1 above
Contents
Extensible Markup Language (XML) is a subset of Standard Generalized Markup Language (SGML) that is poised to supplant the Hypertext Markup Language (HTML) as the lingua franca of the World Wide Web. In this half-day tutorial, instructor Murray Maloney will present the XML technical specification. The course provides practical explanations of "valid" and "well-formed" documents, the document character set, white space handling, and the syntactic construction of XML documents. The instructor will also share late-breaking news on two related specifications: XML Link and XML Style.
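As an illustration of the "well-formed" notion mentioned above (not material from the tutorial itself), the following minimal sketch uses Python's standard xml.etree.ElementTree parser to test whether a string is well-formed XML. The helper name is_well_formed and the two sample strings are assumptions made for this example. Note that this parser checks well-formedness only; checking that a document is "valid" against a DTD would require a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Raised for problems such as mismatched or unclosed tags.
        return False
    return True


# Illustrative samples: every start tag is closed in the first string,
# while <title> is never closed in the second.
GOOD = "<tutorial><title>XML: Up close and personal</title></tutorial>"
BAD = "<tutorial><title>XML: Up close and personal</tutorial>"

if __name__ == "__main__":
    print(is_well_formed(GOOD))
    print(is_well_formed(BAD))
```

A document can be well-formed without being valid: well-formedness concerns only the syntax (proper nesting, a single root element, quoted attribute values), whereas validity additionally requires conformance to a declared document type.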
Registration
Registration fees are 660 FRF.
Registration must be made with the EP98 secretariat, preferably using the EP98 registration form.

Tutorials EP3+EP4:
DSSSL & XSL

Date
Tuesday 31 March, 1998
Title
An Introduction to DSSSL and XSL (morning)
The Practice of DSSSL and XSL (afternoon)
Author
Ken Holman
Note
These are two half-day seminars.
Participants may register separately for one or both tutorials.
Who should attend?
This tutorial is intended for people who need to learn how to program in DSSSL and XSL, as well as those who need to understand the role these standards play in document engineering.
Prerequisite:
Knowledge of SGML concepts and syntax, and a Windows-based computer (public domain software supplied)
Contents

The first module is an introduction to the concepts, components and syntax of the ISO standard Document Style Semantics and Specification Language (DSSSL - ISO/IEC 10179) and the related Extensible Stylesheet Language (XSL - W3C). It also covers the relationship of these standards to the Standard Generalized Markup Language (SGML - ISO 8879) and Extensible Markup Language (XML - W3C) families of standards.

The second module builds on the knowledge of DSSSL and XSL concepts with hands-on exercises to practice the techniques. The exercises include the use of both standardized style semantics and custom SGML transformation semantics.

Registration
Registration fees are 660 FRF for each seminar. Lunch may be ordered. Registration must be made with the EP98 secretariat, preferably using the EP98 registration form.

Tutorial EP5:
ENCODING INFORMATION FOR INTERCHANGE:
An Introduction to the TEI

Date
Tuesday 31 March, 1998, afternoon
Title
ENCODING INFORMATION FOR INTERCHANGE: An Introduction to the TEI
Author
Lou Burnard, European editor of the TEI Guidelines, who is currently Manager of the Humanities Computing Unit at Oxford University Computing Services.
Lou Burnard (http://users.ox.ac.uk/~lou/) was educated at Balliol College Oxford, from which he graduated with a first class degree in English in 1968. He has worked in computing applications in the Humanities since 1974, with extensive experience in database systems, information retrieval, and text encoding on a wide variety of computer systems. In 1976, he set up the Oxford Text Archive, an early version of what we now know as the digital library; in 1987 he was appointed European editor of the TEI Guidelines. He has played a major role in the creation of several major digital initiatives in the UK, including the Arts and Humanities Data Service, and the British National Corpus, and various European initiatives; his most recent publication is "The BNC Handbook" to be published by Edinburgh University Press in 1998.
Who should attend?
This tutorial is intended for researchers and editors working in the field of scholarly and humanities texts ...
Contents
Since their first publication in 1994, the Recommendations of the Text Encoding Initiative have had an extraordinary influence on the divers communities of people creating, using, and curating digital resources of all kinds, serving as an important reference point even for projects which have not adopted them. Some indication of the breadth and variety of the community of TEI users is given by the TEI applications web page at http://www-tei.uic.edu/orgs/tei/, which lists applications in digital library creation, language corpus construction, language engineering, document production, and text-centred humanistic research of all kinds, on both sides of the Atlantic and beyond.
This workshop will review the theoretical bases of the TEI encoding scheme, in particular its attempts to harmonize the widely divergent practices of computer-aided research which now crosses many political, linguistic, temporal, and disciplinary boundaries; the TEI Guidelines were designed for the widest possible range of applications, including natural language processing, information retrieval, hypertext, electronic publishing, various forms of literary and historical analysis, lexicography, etc. The Guidelines are for use with texts, written or spoken, in any natural language, of any date, in any genre or text type, without restriction on form or content. They treat both continuous materials (running text) and discontinuous materials such as dictionaries and linguistic corpora. As such, the TEI Guidelines offer the most general encoding solution currently available for the development of digital libraries, where varied and complex texts must be stored and manipulated in ways that answer a wide variety of user needs, and where the linkage of multi-media is essential.
To achieve this generality, as far as possible the Guidelines eschew controversy; where consensus has not been established, only very general recommendations are made. The object is to help the user make his or her position explicit, not to dictate what that position should be.
Viewed as a standard, the TEI scheme attempts to occupy the middle ground. It offers neither a single all-embracing encoding scheme, solving all problems once for all, nor an unstructured collection of tag sets. Rather it offers an extensible framework containing a common core of features, a choice of frameworks or bases, and a wide variety of optional additions for specific application areas. Somewhat light-heartedly, we refer to this as the Chicago Pizza model (in which the customer chooses a particular base -- say deep dish or whole crust -- and adds the toppings of his or her choice), by contrast with both the Chinese menu or laissez-faire approach (which allows for any combinations of dishes, even the ridiculous) and the set meal approach, in which you must have the entire menu.
Unlike some early monolithic applications of SGML, the TEI scheme was thus designed from the first as a modular scheme. With the advent of XML, and the concomitant widespread take-up of the basic principles of SGML (descriptive markup, reusable encodings, application-specific tagging etc.) the benefits of this approach are becoming increasingly apparent. The Workshop will describe the process by which application-specific views of the TEI scheme may be constructed, and their benefits in facilitating the distribution and conservation of digital information. Participants will also be introduced to some specific TEI applications, including TEI DTDs for use in digital libraries, language corpora, manuscript description and variation, lexicography, and document production. Emphasis will be placed on the ways in which the Guidelines offer potential for immediate application of new technologies such as XML, and its hyperlinking facilities XLL, which are derived largely from the TEI's extended pointer scheme. Facilities permitting, some TEI-aware tools and software will also be demonstrated.
Registration
Registration fees are 660 FRF.
Registration must be made with the EP98 secretariat, preferably using the EP98 registration form.

Tutorial EP6:
Mapping Websites and Creating Site Maps

Date
Wednesday 1 April, 1998, morning
Title
Mapping Websites and Creating Site Maps
Author
Paul Kahn (Dynamic Diagrams)
Object
As web sites have grown as a publishing medium, we have struggled to find the equivalent of an index or table of contents for a web site. A web site is not a book, a magazine, a virtual city or a file system. What kind of diagram can visually represent a website? How does the visitor find his way around without a map?
Contents
The seminar will include demonstrations and comparisons of the latest software tools that let webmasters and visitors visualize websites.
Registration
Registration fees are 660 FRF.
Registration must be made with the EP98 secretariat, preferably using the EP98 registration form.

Tutorial EP7:
Website Information Architecture

Date
Wednesday 1 April, 1998, afternoon
Title
Website Information Architecture
Author
Paul Kahn (Dynamic Diagrams)
Object
What is Information Architecture and why is it so important in planning a web site? Just as the architect coordinates the engineering, aesthetic, and functional needs of a physical building, the information architect works to develop the structural foundation and functional specifications of a web site.

This seminar will review the steps in planning and executing a sound information architecture for both public Internet sites and private corporate Intranets.

Contents
Examples will be drawn from a variety of public web sites and corporate Intranet applications.
Registration
Registration fees are 660 FRF.
Registration must be made with the EP98 secretariat, preferably using the EP98 registration form.

Tutorial EP8:
Document Image and Content Analysis

Date
Thursday 2 April, 1998, morning
Title
Document Image and Content Analysis
Author
Thomas Bayer
Thomas Bayer was born near Nuernberg, Germany in 1961. He received the Dipl.Inform. degree in computer science in 1986 and his Ph.D. in 1993, both from the University of Erlangen-Nuernberg. In 1986 he joined the Pattern Recognition Department of the Research Institute of AEG in Ulm, which is now Daimler Benz Research Center for Information Technology. Since then, he has been working in the domain of document analysis, particularly in contextual analysis, character segmentation and knowledge based document analysis. He is now responsible for the work in the domain of document understanding, information extraction and information filtering.
Who should attend?
Anyone who deals with large volumes of paper-based documents and their conversion into computer readable form for efficient storage, retrieval, and workflow integration would benefit from this course. The course would help both computer engineers who design document analysis systems and managers and system administrators who specify and evaluate such systems. Some familiarity with basic principles of image processing and artificial intelligence is helpful to fully understand all the concepts; however, advanced mathematical background is not required.
Topics
This course will enable the participant to:
Benefits
This course will enable the participant to:
Contents

Document analysis systems automatically extract information from scanned images of paper-based documents. Such systems recognize characters, establish spatial and semantic relationships, and determine the layout and logical structure. Their output facilitates efficient storage, retrieval, and subsequent processing of the documents, e.g. in a workflow.

The tutorial gives an overview of the general procedures used in the domain of document analysis and recognition, with a strong relationship to different applications and their specific problems, like address reading, form reading, free forms understanding, etc.

The tutorial starts with the introduction of different applications, particularly discussing the objectives of document analysis with respect to the application and essential technological issues. All processing steps for converting a scanned document image into data structures which convey its meaning are covered in this course: it starts with operations on the scanned image (pixel format), turns to segmentation issues, focusses on classification and discusses issues of the logical structure of documents (information extraction). The emphasis lies on techniques for recognizing objects (e.g. character classification, cut classifier,...) as a basic and multi-applicable technique, and information extraction for identifying the logical structure and the meaning of certain entities of documents. For all intermediate steps different approaches are described and their performance is assessed. In each step, results of these techniques are evaluated on examples from relevant applications and open problems are discussed.

Registration
Registration fees are 660 FRF.
Registration must be made with the EP98 secretariat, preferably using the EP98 registration form.

Tutorial EP9:
From Lead to Pixel: 1) Punch Cutting

Date
Thursday 2 April, 1998, morning
Title
Punch Cutting
Author
Christian Papu
Christian Papu is a punch cutter at the Cabinet des poinçons of the Imprimerie nationale in Paris.
Object and contents
To be defined later.
Registration
Registration fees are 660 FRF.
Registration may be made with the EP98 secretariat. Those registered for RIDT98 may also register for this seminar with the RIDT98 secretariat.

Tutorial EP10:
From Lead to Pixel: 2) Computer-Aided Type Design

Date
Thursday 2 April, 1998, afternoon
Title
Computer-Aided Type Design
Author
Jean-François Porchez
Jean-François Porchez is ...
Object and contents
To be defined later.
Registration
Registration fees are 660 FRF.
Registration may be made with the EP98 secretariat. Those registered for RIDT98 may also register for this seminar with the RIDT98 secretariat.

Ph. Louarn - Jan. 1998 - © Irisa 1997-1998