From comp-text-sgml.4977@naggum.no Sat Jun 18 04:25:32 1994
Received: from naggum.no (naggum.no [193.71.66.49]) by maud.ifi.uio.no ; Sat, 18 Jun 1994 09:19:02 +0200
Received: from archive.naggum.no by naggum.no with SMTP id <AA27350>; Sat, 18 Jun 1994 07:17:25 UT
Newsgroups: alt.etext,comp.text,comp.text.sgml
Sender: comp-text-sgml.4977@naggum.no
To: comp-text-sgml@naggum.no
Date: 18 Jun 1994 04:48:57 UT
From: Jon Bosak <novlepub@netcom.com>
Organization: Novell Electronic Publishing
Message-Id: <novlepubCrKtDL.7Cn@netcom.com>
Keywords: dynatext novell sgml etext religion bible quran shakespeare
Subject: [4977] ANNOUNCE: Religious works, Shakespeare online
Status: RO
NB: Since the time of this posting [June 18, 1994], the location of the Freetext material has changed. The texts themselves no longer reside on ftp.novell.com/pub/epub/[etc.]; instead, go to ftp://ftp.netcom.com/pub/novlepub/freetext/. Note, however, that the docviewer programs will probably still be on the site ftp.novell.com/pub/epub/[etc.]
For further description of DynaText and other SGML software products from Electronic Book Technologies (EBT), see the Electronic Book Technologies entry in the main document.
===================================================================
FREETEXT DEMONSTRATION COLLECTIONS AVAILABLE IN DYNATEXT 2.2 FORMAT
===================================================================

Novell Electronic Publishing is pleased to announce the release of two document sets designed to show how collections produced by third parties can be seamlessly integrated into the new DynaText 2.2 environment used to present Novell's technical documentation, such as the recently announced electronic manual set for NetWare 3.12.

The two demo sets can be installed either on a network server or on a local network client and can be configured in a way that makes them available to all users on the network or just to specific individuals. If installed correctly, the added collections will in every case appear to occupy the same "virtual library" as other collections produced for DynaText 2.2.

The two collections made available with this announcement were prepared from texts downloaded from the Internet. They are:

   religion -- Four major religious works in English: The Old Testament, The New Testament, The Book of Mormon, and the Quran

   shaksper -- The complete plays of William Shakespeare

Directions for obtaining these document collections are given in detail below.

The DynaText 2.2 viewer for MS-Windows replaces ElectroText, which was Novell's customized version of DynaText 1.5. DynaText 2.2 features an improved user interface, simplified administration, support for public and private annotations, and a data format that allows a single set of document files to be displayed across multiple NetWare clients.

*************************** IMPORTANT ***************************

The DynaText 2.x document format is not compatible with the old ElectroText viewer distributed with NetWare 3.12, NetWare 4.0/4.01, and certain other Novell products.

If you have previously installed Novell ElectroText documents, you must keep those documents and the ElectroText viewer installed on your system in order to read them. These other documents will eventually be replaced by versions compatible with the new viewer technology.

*****************************************************************

The demo sets made available with this announcement can be freely distributed internationally, and their use is unrestricted.

Electronic Book Technologies retains complete rights to the DynaText viewers. The DynaText viewers made available by Novell can be distributed and used only for the purpose of viewing documents published by Novell (such as these demo collections).

------------------------------
Obtaining the demo collections
------------------------------

The demo collections can be obtained from ftp.novell.com in the following directories:

   /pub/epub/freetext/religion_1.00
   /pub/epub/freetext/shaksper_1.00

They can also be obtained from ftp.netcom.com in these directories:

   /pub/novlepub/freetext/religion_1.00
   /pub/novlepub/freetext/shaksper_1.00

Given their general usefulness, we expect that these collections will also become available at Novell mirror sites and other public repositories.

Since the document and viewer files are intended to be installed at the root level of a NetWare or local client drive, you may find it convenient to change to the root level of the drive on which you intend to install the documents before you download them.

Download the document and viewer files as follows:

1. ftp ftp.novell.com

2. Give your name as "ftp" (without the quotation marks) and enter your email address when prompted for a password.

3. Enter the following commands at the ftp prompt:

      bin
      hash
      prompt
      cd /pub/epub/docview
      mget *
      cd /pub/epub/freetext/religion_1.00
      mget *
      cd /pub/epub/freetext/shaksper_1.00
      mget *
      cd /pub/epub/unzips
      mget unzip*
      quit

   You can, of course, download just one of the sets by executing just the mget command for that set.

4. Follow the directions in the file instwin.txt to install the DynaText viewer for MS-Windows. (Viewers for Macintosh and UnixWare will be made available shortly.) Note that both the viewer and the document sets are intended to be uncompressed with the unzip program downloaded as part of the sequence of commands given above, not with the commercial PKUNZIP program.

5. The compressed binary file containing each document collection has been split up into pieces of about 1.44 MB each to facilitate downloading. Assemble the compressed files with the Unix commands

      cat religion.* > religion.zip
      cat shaksper.* > shaksper.zip

   These operations should yield a file named religion.zip of 9537092 bytes and a file named shaksper.zip of 11490421 bytes. You can also use the DOS COPY command with the /B option to perform this function, but you will probably have to create several intermediate files to do so.

6. Follow the directions in the file instdoc.txt to install each collection. When installed, the religion set occupies about 21 MB of disk space and the shaksper collection occupies about 24 MB of disk space.

As you will see from the directions in the instdoc.txt file, nothing prevents the installation of one or both collections on a local hard drive. Due to their size, however, most users will wish to install them as shared resources on a network server.

NOTE: Some sites will experience problems in downloading large files over the Internet. If you have set "bin", "hash", and "prompt" according to the directions above and you still can't transfer the files successfully, then there is something wrong with your system configuration.
Contact your system administrator for assistance in adjusting your system's spool size or timeout parameters. Novell cannot assist you in troubleshooting ftp transfer problems.

If you experience problems in installing the document sets themselves, however, please feel free to contact Novell Electronic Publishing at the address given at the end of this announcement or through the Usenet newsgroup comp.sys.novell. We are extremely short-staffed and cannot always respond immediately, but we do monitor the newsgroup periodically looking for questions and comments that have subject lines relating to our online publications, and we will do our best to help you.

We regret that we cannot respond to inquiries regarding Novell's other products or services. We are dedicated solely to the support of Novell's structured electronic publications and have no information regarding other Novell departments.

---------------------------------------
Technical notes on the demo collections
---------------------------------------

The religion and shaksper sets were created from public-domain ASCII files downloaded from the Internet in 1992 and marked up in SGML (Standard Generalized Markup Language, ISO 8879) as prototype exercises in SGML conversion. A simple ad-hoc DTD was created for each collection and tagging was performed using perl followed by manual cleanup using emacs. In the case of the Quran, extensive spell checking was needed to correct numerous errors apparently caused by imperfect OCR processing; the perl vspell script was used for this purpose. However, none of these documents received true human copy editing, and no claim is made for editorial correctness in any of them.

In the earliest realizations of these collections, simple stylesheets were created using just emacs and a knowledge of the DynaText stylesheet specification language.
In late 1993, these stylesheets were rearchitected to enable shared use across prototype Windows, Unix, and Macintosh viewing environments. While more complex than some stylesheet implementations, the style directories provided with the demo collections are much simpler than the ones provided with Novell's NetWare manuals and may serve as learning materials for DynaText implementors and power users interested in developing control over online document formatting. Users brave enough to experiment with the style files will discover for themselves the meaning of the often-stated "separation of form and content" made possible by SGML.

Users interested in SGML itself can examine the source markup by highlighting sections of text and using the "Copy SGML" option to copy the marked-up text to the Clipboard and thence to an editor window.

A word about disk space. The philosophy adopted in our structured electronic publications is based on an explicit trade-off of disk space for functionality, recognizing that the cost of disk space continues to fall while the need for better access to data continues to rise.

This philosophy was taken a step further in the process of indexing these two demo sets. Unlike the word indexes for most document sets, the indexes for the religion and shaksper collections were purposely created without a stop list, which is a set of common words such as "or" and "the" that are usually left unindexed to save space. This has resulted in an increase of roughly 30 percent in the installed size of each set. In return, the user has gained the ability to perform exact searches on common phrases such as "the salt of the earth" and "to be or not to be", which would otherwise generate a warning message and cause the replacement of stop-list words with wildcards (*) in the search string. This treatment was considered more appropriate for texts such as these in which certain phrases have become fixed in the language and will probably be searched on as a unit.
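
The stop-list trade-off described above can be illustrated with a small sketch. This is not DynaText's indexer: it is a hypothetical positional word index written in Python, and the names STOP_LIST, build_index, and phrase_search are invented for illustration. It only shows why an index built without a stop list can answer an exact phrase query that an index built with one cannot, at the cost of storing more entries.

```python
import re
from collections import defaultdict

# Hypothetical stop list; the announcement names only "or" and "the" as examples.
STOP_LIST = frozenset({"or", "the"})


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def build_index(text, stop_list=frozenset()):
    """Map each indexed word to the list of word positions where it occurs."""
    index = defaultdict(list)
    for position, word in enumerate(tokenize(text)):
        if word not in stop_list:
            index[word].append(position)
    return index


def phrase_search(index, phrase):
    """Return the start positions of exact occurrences of the phrase.

    If any word of the phrase is missing from the index, no exact match
    is possible and an empty list is returned. (The announcement says
    DynaText instead warns and substitutes wildcards; that behaviour is
    not modelled here.)
    """
    words = tokenize(phrase)
    if not words or any(w not in index for w in words):
        return []
    return [
        start
        for start in index[words[0]]
        if all(start + i in index[w] for i, w in enumerate(words))
    ]
```

Building the index twice over the same text, once with `STOP_LIST` and once without, and then searching for "to be or not to be" shows the difference: only the index without a stop list can locate the phrase exactly, and it also holds more entries.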

To make best use of this feature, you should enclose phrases in quotation marks to prevent confusion between common words such as "not" and reserved words in the query language. Also, note that DynaText interprets groups of characters in query strings as attempts to match on whole words rather than substrings, and that wildcards follow Unix rather than DOS conventions. Thus, in searches within a single book,

   a search on "man" will match only on the word "man";

   a search on "*man" will match on all words ending in "man";

   a search on "*m[ae]n" will match on all words ending in either "man" or "men"; and

   a search on "*m[ae]n*" will match on all words that contain either the substring "man" or the substring "men".

Click "Help" on the DynaText menu bar for more information on searching.

Known bug: Certain complex regular expressions do not behave the same in collection-level searches as they do in book-level searches.

+----------------------------------------------------------------------+
| Jon Bosak      Novell Electronic Publishing      novlepub@netcom.com |
+----------------------------------------------------------------------+
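
The whole-word wildcard behaviour described in the search notes above can be approximated outside DynaText. The sketch below is an assumption-laden illustration, not DynaText's query engine: it uses Python's standard fnmatch module, which implements Unix shell-style wildcards, and a hypothetical helper named matching_words that tests each word of a text against the pattern as a whole word. DynaText's own query language may differ in details.

```python
import fnmatch
import re


def matching_words(text, pattern):
    """Return the words of text that match pattern as whole words.

    The pattern uses Unix shell-style wildcards (*, ?, [seq]); matching
    is done per word, never on substrings of the running text.
    """
    words = re.findall(r"[A-Za-z]+", text.lower())
    return [w for w in words if fnmatch.fnmatchcase(w, pattern.lower())]


sample = "Man and woman; men and women; the manner of many humans"

# The four patterns discussed in the announcement.
for pattern in ("man", "*man", "*m[ae]n", "*m[ae]n*"):
    print(pattern, matching_words(sample, pattern))
```

On this sample, "man" matches only the word "man", while "*m[ae]n*" also picks up words such as "manner" and "humans" that merely contain the substring.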