Electronic Book Technologies: DynaText "Free text" (Header)

From comp-text-sgml.4977@naggum.no Sat Jun 18 04:25:32 1994
Received: from naggum.no (naggum.no []) by maud.ifi.uio.no ; Sat, 18 Jun 1994 09:19:02 +0200
Received: from archive.naggum.no by naggum.no with SMTP
          id <AA27350>; Sat, 18 Jun 1994 07:17:25 UT
Newsgroups: alt.etext,comp.text,comp.text.sgml
Sender: comp-text-sgml.4977@naggum.no
To: comp-text-sgml@naggum.no
Date: 18 Jun 1994 04:48:57 UT
From: Jon Bosak <novlepub@netcom.com>
Organization: Novell Electronic Publishing
Message-Id: <novlepubCrKtDL.7Cn@netcom.com>
Keywords: dynatext novell sgml etext religion bible quran shakespeare
Subject: [4977] ANNOUNCE: Religious works, Shakespeare online
Status: RO

NOTE on Internet Location!

NB: Since this posting [June 18, 1994], the location of the Freetext material has changed. The texts themselves no longer reside on ftp.novell.com/pub/epub/[etc.]; instead, go to ftp://ftp.netcom.com/pub/novlepub/freetext/. The docviewer programs, however, will probably still be on ftp.novell.com/pub/epub/[etc.]

For further description of DynaText and other SGML software products from Electronic Book Technologies (EBT), see the Electronic Book Technologies entry in the main document.

Text of the Announcement


Novell Electronic Publishing is pleased to announce the release of two
document sets designed to show how collections produced by third parties
can be seamlessly integrated into the new DynaText 2.2 environment used to
present Novell's technical documentation, such as the recently announced
electronic manual set for NetWare 3.12.  The two demo sets can be installed
either on a network server or on a local network client and can be
configured in a way that makes them available to all users on the network
or just to specific individuals.  If installed correctly, the added
collections will in every case appear to occupy the same "virtual library"
as other collections produced for DynaText 2.2.

The two collections made available with this announcement were prepared
from texts downloaded from the Internet.  They are:

   religion -- Four major religious works in English: The Old Testament,
	       The New Testament, The Book of Mormon, and the Quran

   shaksper -- The complete plays of William Shakespeare

Directions for obtaining these document collections are given in detail
below.

The DynaText 2.2 viewer for MS-Windows replaces ElectroText, which was
Novell's customized version of DynaText 1.5.  DynaText 2.2 features an
improved user interface, simplified administration, support for public and
private annotations, and a data format that allows a single set of document
files to be displayed across multiple NetWare clients.

   *************************** IMPORTANT ***************************
   The DynaText 2.x document format is not compatible with the old
   ElectroText viewer distributed with NetWare 3.12, NetWare
   4.0/4.01, and certain other Novell products.  If you have
   previously installed Novell ElectroText documents, you must keep
   those documents and the ElectroText viewer installed on your
   system in order to read them.  These other documents will
   eventually be replaced by versions compatible with the new
   viewer technology.

The demo sets made available with this announcement can be freely
distributed internationally, and their use is unrestricted.  Electronic
Book Technologies retains complete rights to the DynaText viewers.  The
DynaText viewers made available by Novell can be distributed and used only
for the purpose of viewing documents published by Novell (such as these
demo collections).

Obtaining the demo collections

The demo collections can be obtained from ftp.novell.com in the following
directories:


They can also be obtained from ftp.netcom.com in these directories:


Given their general usefulness, we expect that these collections will also
become available at Novell mirror sites and other public repositories.

Since the document and viewer files are intended to be installed at the
root level of a NetWare or local client drive, you may find it convenient
to change to the root level of the drive on which you intend to install the
documents before you download them.

Download the document and viewer files as follows:

1. ftp ftp.novell.com

2. Give your name as "ftp" (without the quotation marks) and enter your
   email address when prompted for a password.

3. Enter the following commands at the ftp prompt:

   cd /pub/epub/docview
   mget *
   cd /pub/epub/freetext/religion_1.00
   mget *
   cd /pub/epub/freetext/shaksper_1.00
   mget *
   cd /pub/epub/unzips
   mget unzip*

   You can, of course, download just one of the sets by executing only the
   cd and mget commands for that set.

4. Follow the directions in the file instwin.txt to install the DynaText
   viewer for MS-Windows.  (Viewers for Macintosh and UnixWare will be made
   available shortly.)  Note that both the viewer and the document sets are
   intended to be uncompressed with the unzip program downloaded as part of
   the sequence of commands given above, not with the commercial PKUNZIP
   program.

5. The compressed binary file containing each document collection has been
   split up into pieces of about 1.44 MB each to facilitate downloading.
   Assemble the compressed files with the Unix commands

   cat religion.* > religion.zip
   cat shaksper.* > shaksper.zip

   These operations should yield a file named religion.zip of 9537092 bytes
   and a file named shaksper.zip of 11490421 bytes.  You can also use the
   DOS COPY command with the /B option to perform this function, but you
   will probably have to create several intermediate files to do so.

6. Follow the directions in the file instdoc.txt to install each
   collection.  When installed, the religion set occupies about 21 MB of
   disk space and the shaksper collection occupies about 24 MB of disk
   space.

   As you will see from the directions in the instdoc.txt file, nothing
   prevents the installation of one or both collections on a local hard
   drive.  Due to their size, however, most users will wish to install them
   as shared resources on a network server.
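
The reassembly in step 5 can also be sketched in Python for systems that
have neither cat nor a convenient COPY /B.  This is only a sketch: the
function name assemble is made up for illustration, and it assumes that the
piece names sort into the correct order (for example religion.001,
religion.002), which the announcement does not state.  The byte counts are
the ones given in step 5.

```python
from pathlib import Path

# Archive sizes in bytes, as stated in step 5 of the announcement.
EXPECTED_SIZES = {"religion": 9537092, "shaksper": 11490421}


def assemble(stem, directory=".", expected_size=None):
    """Join the pieces named stem.* into stem.zip and return its size."""
    folder = Path(directory)
    target = folder / f"{stem}.zip"
    # Sorted name order approximates shell glob expansion (an assumption);
    # the output file is skipped in case it is left over from an earlier run.
    pieces = sorted(p for p in folder.glob(f"{stem}.*")
                    if p.is_file() and p != target)
    if not pieces:
        raise FileNotFoundError(f"no pieces matching {stem}.* in {folder}")
    with target.open("wb") as out:  # binary mode, like COPY /B
        for piece in pieces:
            out.write(piece.read_bytes())
    size = target.stat().st_size
    if expected_size is not None and size != expected_size:
        raise ValueError(
            f"{target.name}: {size} bytes, expected {expected_size}")
    return size
```

A call such as assemble("religion", expected_size=EXPECTED_SIZES["religion"])
would raise ValueError if a piece is missing or was transferred in text
mode, since the total would then differ from the published size.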

NOTE: Some sites will experience problems in downloading large files over
the Internet.  If you have set "bin", "hash", and "prompt" according to the
directions above and you still can't transfer the files successfully, then
there is something wrong with your system configuration.  Contact your
system administrator for assistance in adjusting your system's spool size
or timeout parameters.  Novell cannot assist you in troubleshooting ftp
transfer problems.

If you experience problems in installing the document sets themselves,
however, please feel free to contact Novell Electronic Publishing at the
address given at the end of this announcement or through the Usenet
newsgroup comp.sys.novell.  We are extremely short-staffed and cannot always
respond immediately, but we do monitor the newsgroup periodically looking
for questions and comments that have subject lines relating to our online
publications, and we will do our best to help you.  We regret that we
cannot respond to inquiries regarding Novell's other products or services.
We are dedicated solely to the support of Novell's structured electronic
publications and have no information regarding other Novell departments.

Technical notes on the demo collections

The religion and shaksper sets were created from public-domain ASCII files
downloaded from the Internet in 1992 and marked up in SGML (Standard
Generalized Markup Language, ISO 8879) as prototype exercises in SGML
conversion.  A simple ad-hoc DTD was created for each collection and
tagging was performed using perl followed by manual cleanup using emacs.
In the case of the Quran, extensive spell checking was needed to correct
numerous errors apparently caused by imperfect OCR processing; the perl
vspell script was used for this purpose.  However, none of these documents
received true human copy editing, and no claim is made for editorial
correctness in any of them.

In the earliest realizations of these collections, simple stylesheets were
created using just emacs and a knowledge of the DynaText stylesheet
specification language.  In late 1993, these stylesheets were rearchitected
to enable shared use across prototype Windows, Unix, and Macintosh viewing
environments.  While more complex than some stylesheet implementations, the
style directories provided with the demo collections are much simpler than
the ones provided with Novell's NetWare manuals and may serve as learning
materials for DynaText implementors and power users interested in
developing control over online document formatting.  Users brave enough to
experiment with the style files will discover for themselves the meaning of
the often-stated "separation of form and content" made possible by SGML.
Users interested in SGML itself can examine the source markup by
highlighting sections of text and using the "Copy SGML" option to copy the
marked-up text to the Clipboard and thence to an editor window.

A word about disk space.  The philosophy adopted in our structured
electronic publications is based on an explicit trade-off of disk space for
functionality, recognizing that the cost of disk space continues to fall
while the need for better access to data continues to rise.  This
philosophy was taken a step further in the process of indexing these two
demo sets.  Unlike the word indexes for most document sets, the indexes for
the religion and shaksper collections were purposely created without a stop
list, which is a set of common words such as "or" and "the" that are
usually left unindexed to save space.  This has resulted in an increase of
roughly 30 percent in the installed size of each set.  In return, the user
has gained the ability to perform exact searches on common phrases such as
"the salt of the earth" and "to be or not to be", which would otherwise
generate a warning message and cause the replacement of stop-list words
with wildcards (*) in the search string.  This treatment was considered
more appropriate for texts such as these in which certain phrases have
become fixed in the language and will probably be searched on as a unit.
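
The effect of a stop list on phrase searches can be illustrated with a
small Python sketch.  It is not DynaText's indexer: the functions and the
contents of STOP_WORDS are hypothetical (the announcement names only "or"
and "the" as examples), and the wildcard substitution simply mimics the
behavior described above.

```python
import re

# Illustrative stop list only; a real product's list is not given here.
STOP_WORDS = frozenset({"or", "the", "of", "to", "be", "not"})


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z']+", text.lower())


def build_index(text, stop_words=frozenset()):
    """Map each indexed word to the positions at which it occurs."""
    index = {}
    for position, word in enumerate(tokenize(text)):
        if word not in stop_words:
            index.setdefault(word, []).append(position)
    return index


def rewrite_query(phrase, stop_words=frozenset()):
    """Replace words that were never indexed with the wildcard '*'."""
    return " ".join("*" if word in stop_words else word
                    for word in tokenize(phrase))


print(rewrite_query("the salt of the earth", STOP_WORDS))  # * salt * * earth
print(rewrite_query("to be or not to be", STOP_WORDS))     # * * * * * *
print(rewrite_query("to be or not to be"))                 # to be or not to be
```

With the stop list in force, the second query degenerates into six
wildcards and would match any six consecutive words; with an empty stop
list the phrase survives intact, at the cost of a larger index.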

To make best use of this feature, you should enclose phrases in quotation
marks to prevent confusion between common words such as "not" and reserved
words in the query language.  Also, note that DynaText interprets groups of
characters in query strings as attempts to match on whole words rather than
substrings, and that wildcards follow Unix rather than DOS conventions.
Thus, in searches within a single book, a search on "man" will match only
on the word "man"; a search on "*man" will match on all words ending in
"man"; a search on "*m[ae]n" will match on all words ending in either "man"
or "men"; and a search on "*m[ae]n*" will match on all words that contain
either the substring "man" or the substring "men".  Click "Help" on the
DynaText menu bar for more information on searching.
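
The four patterns above can be tried out with Python's fnmatch module,
which implements Unix shell-style wildcards and matches against the whole
string.  Using it as a stand-in for the DynaText query engine is an
assumption made for illustration, and the word list is invented.

```python
from fnmatch import fnmatchcase

# Invented sample vocabulary for the demonstration.
WORDS = ["man", "men", "woman", "women", "mankind", "amend", "many", "moon"]


def matching_words(pattern, words=WORDS):
    """Return the words that the pattern matches as whole words."""
    return [word for word in words if fnmatchcase(word, pattern)]


print(matching_words("man"))       # ['man']
print(matching_words("*man"))      # ['man', 'woman']
print(matching_words("*m[ae]n"))   # ['man', 'men', 'woman', 'women']
print(matching_words("*m[ae]n*"))
# ['man', 'men', 'woman', 'women', 'mankind', 'amend', 'many']
```

Note that "*" here stands for any run of characters, including none, so
"*man" also matches the bare word "man".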

Known bug: Certain complex regular expressions do not behave the same in
collection-level searches as they do in book-level searches.

|  Jon Bosak     Novell Electronic Publishing     novlepub@netcom.com  |