SGML/XML Bibliography Part 7, S - Z
[CR: 19961009]
Sabasteanski, Anna. "Use of the Electronic Manuscript Standard at the New England Journal of Medicine." EPSIG News 2/1 (March 1989) 1-2. ISSN: 1042-3737. Author's affiliation: Medical Publishing Group, Massachusetts Medical Society.
The New England Journal of Medicine is a key publication of the Medical Publishing Group, Massachusetts Medical Society, and it plays a part in the Society's adoption of SGML-based publishing technologies. Canonical database files using SGML encoding are used to produce different versions of the journal for CDROM, paper, and online access. The AAP's SGML DTD is the basis for the information structuring in the knowledgebase.
See the entry for Elsevier Scientific Publishers for other information, or the bibliographic entry for the Elsevier DTD documentation.
[CR: 19960812]
Sacks-Davis, R.; Arnold-Moore, T.; Zobel, J. "Database Systems for Structured Documents." IEICE Transactions on Information and Systems E78-D/11 (November 1995) 1335-1342 (with 26 references). Authors' affiliation: Collaborative Information Technology Research Institute (CITRI), Carlton, Victoria, Australia. Home page contact: Ron Sacks-Davis.
"Abstract: Documents stored in a database system can have complex internal structure described by languages such as SGML. How to take advantage of this structure presents challenges for database system implementers. We classify the types of queries that need to be supported by SGML conformant database systems. We then describe several data models that have been proposed for representing documents in a database system and discuss the support these models provide for SGML. Finally we consider query evaluation."
For further information on SGML-related research at RMIT/CITRI, see the main entry for RMIT - MDS.
Sacks-Davis, Ron; Arnold-Moore, Timothy; Zobel, Justin. Database Systems for Structured Documents. Technical Report. [Prepared for] International Symposium on Advanced Database Technologies and their Integration (ADTI'94), 1994. Nara, Japan: [PUBLISHER?], 1994. 13 pages, 33 references.
Said, Carolyn; McManus, Neil. "SGML Standard Will Star in Boston Seybold Show." MacWeek 7/15 (April 12, 1993) 1,124. ISSN: 0736-7260.
Note on the prominence of SGML publishing technologies at the Seybold Seminars '93 trade show.
[CR: 19970817]
Salminen, Airi; Kauppinen, Katri; Lehtovaara, Merja. "Towards a Methodology for Document Analysis." Pages 644-655 (with 24 references) in Structured Information/Standards for Document Architectures. Edited by Elisabeth Logan and Marvin Pollard. = Journal of the American Society for Information Science, Special Issue. Volume 48, Number 7 (July 1997). New York: John Wiley & Sons Inc., 1997. ISSN: 0002-8231. Authors' affiliation: Department of Computer Science and Information Systems, University of Jyväskylä, P.O. Box 35, FIN-40351, Jyväskylä, Finland. Email: airi@cs.jyu.fi.
Abstract: "A great deal of the collective knowledge of organizations is stored in documents. To be able to use documents effectively, the information structure in the documents should be carefully planned. International standards, for example SGML, have been developed for defining document structures. The definition method however is not enough. For defining effective document standards for an organization, a profound document analysis is needed. In the analysis, current documents and document management practices should be studied and described before developing new document structures and document management practices. The development of a methodology for document analysis is going on in a project studying legislative documents produced in the Finnish government and parliament. The article describes the first results of the project. As the document structure definition method, SGML is used in the project. The analysis method is developed and extended from an object-oriented method. The article introduces the main phases of the analysis: Domain definition, object modeling, state modeling, and content modeling. The application of the methodology in the case project and the data gathering methods used are also described."
See the main document entry for the complete list of articles and contributors, as well as other bibliographic information.
[CR: 19951110]
Salminen, Airi; Tompa, Frank Wm. "PAT Expressions: An Algebra for Text Search." Acta Linguistica Hungarica 41/1-4 (1992-1993) 277-306 (with 25 references). Authors' affiliation: [Salminen] Department of Computer Science, University of Jyväskylä; [Tompa] Department of Computer Science, University of Waterloo.
Summary: "Text search operations are used to locate and retrieve needed information from some text collection. In traditional information retrieval, text search is a means for identifying relevant documents. By specifying selection criteria for the text of a document, the reader can choose a subset of a given set of documents. If the text collection is defined not as a set of documents, but more generally as a structure containing some parts, then text search involves the specification of those parts of interest to the reader.
The structure of the documents may be determined by the search system, by the author, by the text installer, or by the reader. In the PAT (TM) system, text search operations are expressions that efficiently combine traditional search capabilities with some new, powerful facilities. PAT contains means for lexical search, proximity search, contextual search and Boolean search. It also contains more rare operation types, including position and frequency search. Furthermore, a novel feature in PAT is the capability by which a reader can define structures for a text and use these structures in subsequent operations. One of the goals of this paper is to introduce the powerful search capabilities of PAT expressions.
Text search is usually considered so simple that only a rough description of the operations is given. For example, when word search is discussed, we are seldom told what is meant by a 'word'. The reader has to find out through experimentation how many words are contained in the strings 'Jean-Marie' and 'O'Hara'. However, a careless description of search operations may lead to search errors or unnecessarily long retrieval sessions. A second goal of the paper, therefore, is to introduce a mechanism for precise specification of text search semantics.
Text search using PAT is typically simple and straightforward. However, because of the powerful definition capabilities included in PAT, explaining and understanding the semantics of some operations may be difficult. As a side-effect of our systematic specification of PAT, we have identified some features of PAT expressions that cause problems and thus would benefit from further development. From this we see that precise specification also serves as a means for evaluation and offers a means for comparing text search systems. As is common in information retrieval systems, a PAT search is applied to indexed text. Indexing is usually described from the point of view of implementation, for example, by giving an algorithm for the indexing. However, since the way text is indexed affects search behaviour, our systematic approach to precise description must include mechanisms that accommodate indexing definition capabilities." [adapted from the Introduction]
The authors describe the query capabilities of the PAT system, dividing PAT expressions into six classes, and supplying a discussion of the syntax and semantics for each class. PAT indexing can be specified by productions as a view of PAT text.
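The point about the ambiguity of 'word' in the strings 'Jean-Marie' and 'O'Hara' can be illustrated with a minimal Python sketch. The three word definitions below are hypothetical and are not taken from PAT or from the paper; they only show that the word count depends on which definition a search system silently adopts.

```python
import re

# Three hypothetical definitions of "word"; none is taken from PAT itself.
WORD_DEFINITIONS = {
    # A word is a maximal run of ASCII letters.
    "letters_only": r"[A-Za-z]+",
    # Hyphens may join letter runs into a single word.
    "letters_and_hyphen": r"[A-Za-z]+(?:-[A-Za-z]+)*",
    # Hyphens and apostrophes may both join letter runs.
    "letters_hyphen_apostrophe": r"[A-Za-z]+(?:[-'][A-Za-z]+)*",
}


def count_words(text: str, definition: str) -> int:
    """Count non-overlapping matches of the chosen word pattern in text."""
    return len(re.findall(WORD_DEFINITIONS[definition], text))


for name in WORD_DEFINITIONS:
    print(name, count_words("Jean-Marie", name), count_words("O'Hara", name))
```

Under these assumed definitions the same two strings contain either one or two words each, which is the kind of discrepancy a precise specification of search semantics is meant to remove.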
Available on the Internet [or mirror copy].
[CR: 19970106]
Sampson, Craig R. "SASOUT: A Context Based Table Model." Pages 235-264 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: SAS Institute Inc., SAS Campus Drive, Cary, North Carolina, USA 27513; Tel: (919) 677-8000 x-7417; FAX: (919) 677-4444; Email: sascrs@unx.sas.com; WWW: http://www.sas.com.
Abstract: "The SASOUT table model was developed to support the tabular documentation needs of the Publications Division of SAS Institute Inc. SASOUT instances contain sufficient meta information to allow them to be presented in both hard and soft copy. The meta data also makes possible non-traditional and interactive online presentations of the tabular data.
In 1995, research on tables produced by SAS software and on the tables previously used in our documentation resulted in our identification of four table types: simple, intersection, drill-down, and show-all. Imaging these tables on paper, as in the past, presented no significant problems even with SGML source data. However, we anticipated problems presenting our tables in soft copy after experimenting with the capabilities of the CALS table model, which was supported by our SGML software tools.
The CALS model does not support markup for indicating relationships between cells in a table nor directly support row header formatting. These relationships are not critical for producing hard copy, but are very important to our interactive online presentations. Header formatting is important for both hard copy and online presentations from a single source.
The SASOUT table model was developed to provide a means of marking up our tabular data while preserving its characteristics. The markup supports row headers and cell relationships in addition to all CALS features, such as column heads, spanning rows and columns, and alignment of data. The SASOUT model also supports behavior characteristics that allow the specification of online presentation methods.
This paper describes our table types, our platform presentation requirements, extensions we added to the CALS model, and the processing we designed to meet our formatting requirements so far.
The SASOUT DTD is freely available and we look forward to vendors providing support for it and other table DTD's that provide the means to fully identify tabular data."
The document is also available online in SGML format: see the download instructions from Craig Sampson, which contain the associated GCAPAPER DTD. URLs for the paper are: ftp://ftp.sas.com/incoming/sasout.tar.Z (UNIX tar compressed) or ftp://ftp.sas.com/incoming/sasout.zip (.ZIP format); [UNIX format mirror copy] and [.ZIP format mirror copy]. The SASOUT table DTD has been made available publicly by Craig Sampson on the Usenet News forum comp.text.sgml (CTS): see the local document. A related presentation describing the implementation of SGML by the Publications Division of SAS Institute was given at SGML '96 by Leonard P. Olszewski, "Modular DTD Development and Maintenance at SAS Institute: Implementing an Efficient SGML System Using Software Engineering Principles."
Note: The above presentation was part of the "SGML User" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
[CR: 19961226]
Samuels, Eloise. "Case Study: Key Learnings from Converting Complex Technical Documents to SGML." Pages 57-64 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Senior Information Design Manager, Bellcore, 6 Corporate Place, PYA-1N158, NJ 08854, USA; Tel: 908-699-6853; FAX: 908-336-2605; Email: ews@rangers.lso.bellcore.com.
Abstract: "Unlocking the benefits of information in your documents may mean that you invest more than human resources, money and equipment. What about the pre-process planning time? Investment in Standard Generalized Markup Language (SGML) does not guarantee you immediate return and does not happen at the drop of a hat without a major management investment in pre-planning.
Because most corporations information management's top goal is to produce a much richer information environment, we must make a management commitment to a document analysis process. The process should identify what information in these documents is important enough to migrate to a rich electronic format such as SGML. It may seem obvious that the way to maximize your information is to break it into intelligible chunks in a data base. However, to get those chunks of information into a format that is acceptable by most applications is not a simple process. When that is complete, next comes the targeted conversion by document type.
For Learning Support the goal was to establish an information database that yielded benefits in the area of:
- document creation,
- document updating and revising,
- database review and validation,
- information reuse, and
- on-line full-text retrieval and distribution of information.
The objective was to convert annually some 300,000 pages of technical documents containing complex tables and graphics from several different authoring environments into an industry standard Document Type Definition (DTD), called the Telecommunications Industry Markup (TIM DTD).
This industry standard format, Telecommunication's Industry Markup Document Type Definition (TIMDTD) is an explicit and neutral form of markup. The BCCs, in conjunction with the Telecommunications Industry Forum consisting of representatives from telecommunication vendors, such as Ericson, Siemens, AT&T, and Northern Telecom have unanimously endorsed it as their standard list of SGML markup tag definitions.
This paper identifies key learnings grasped from project management of the SGML Implementation Plan the Learning Support organization at Bellcore. Key outcomes determined were:
- Document analysis was critical to the success of the [project]
- The DTD writer's interpretation of the data and its structure required an iterative process with document developers and users. DTDs will change.
- It was important for acceptance to maintain the document developers view of the textual layout and format of the data while enforcing structure.
- Management's buy-in was needed at all points in the process
- Not everyone will be on board the train at the same time."
For more information on the TIM DTD as part of the TCIF/IPI (Telecommunications Industry Forum Information Products Interchange) standard, see the main entry in the SGML/XML Web Page.
Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
[CR: 19951113]
Sandoval, Victor. SGML: un outil pour la gestion électronique de documents. Techniques de l'information. Paris: Hermes, 1994. Extent: 174 pages, bibliographie, index. ISBN: 2866014405.
[CR: 19971008]
Schatz, Bruce; Mischo, William H.; Cole, Timothy W.; Hardin, Joseph B.; Bishop, Ann P. "Federating Diverse Collections of Scientific Literature." IEEE Computer 29/5 (May 1996) 28-36 (with 12 references). Authors' affiliation: Grainger Engineering Library and Information Center, Illinois University, Urbana, IL, USA.
"Abstract: The Digital Library Initiative (DLI) project at the University of Illinois at Urbana-Champaign is developing the information infrastructure to effectively search technical documents on the Internet. The authors are constructing a large testbed of scientific literature, evaluating its effectiveness under significant use, and researching enhanced search technology. They are building repositories (organized collections) of indexed multiple-source collections and federating (merging and mapping) them by searching the material via multiple views of a single virtual collection. Developing widely usable Web technology is also a key goal. Improving Web search beyond full-text retrieval will require using document structure in the short term and document semantics in the long term. Their testbed efforts concentrate on journal articles from the scientific literature, with structure specified by the Standard Generalized Markup Language (SGML). Research efforts extract semantics from documents using the scalable technology of concept spaces based on context frequency. They then merge these efforts with traditional library indexing to provide a single Internet interface to indexes of multiple repositories."
Available online in HTML format: http://computer.org/computer/dli/r500280/r50028.htm; [archive copy, text only].
Scheller, Angela. "Document Standards: Availability and Products." Computer Networks and ISDN Systems 16/1-2 (September 1988) 138-142. ISSN: 0169-7552. CODEN: CNISE9.
Abstract: With the growth in the spread of computer networks the demand by users for document interchange features is becoming increasingly apparent. The prerequirement for the realization of document interchange in a heterogeneous computer environment are internationally accepted standards for the description of documents. Already in early 1986, the Standard Generalized Markup Language SGML was published as an international standard for the structuring of documents. The publication of the Office Document Architecture ODA is expected in the course of 1988. The final text is already available. ODA was originally developed for the pure office environment, whereas the concept for SGML addressed the author/publisher environment. This fact is mirrored in the current pilot projects testing the standards: the manufacturers of office and word-processing systems mainly work with ODA, whereas in the technical scientific and publishing sectors SGML is often implemented. Users requiring an interface both to the office sector as well as to the publishing sector will therefore be confronted with the problems related to working with two different, only partially compatible standards.
Scheller, Angela. "Experience with SGML in the Real World: DAPHNE, a System Integrating Computer Graphics Metafiles into SGML Documents." In Document Exchange: The Use of SGML in the UK Academic and Research Community. Workshop Proceedings 5-7 March 1990. Edited by Anne Mumford. Advisory Group on Computer Graphics, 1990.
Abstract: DAPHNE is a document processing system implemented to support joint editing within the German Research Network DFN. It is based on two international standards in the area of document and graphics processing, the Standard Generalized Markup Language SGML and the Computer Graphics Metafile CGM. This paper presents the functionality offered by DAPHNE today as well as plans for future extensions. It also describes the experience gained with a distributed environment of commercial products for processing SGML documents in general and DAPHNE documents in particular.
Schettini, Stephen; Alschuler, Liora. "SGML is Here to Stay. Coding Documents with Standard Generalized Markup Language Lets You Manipulate and Format Text in Limitless Ways." Publish ? (June 1994) 71-78.
[CR: 19970212]
Schietekat, Raf. "DSSSL: The Promise FOSI Did Not Fulfill." In: Proceedings of the 3rd Annual Conference on the Practical Use of SGML. "A Decade of Power." Third Annual [Belux] Conference on the Practical Use of SGML. Business Faculty, Sint-Lendriksborre 6, Brussels, Belgium. October 31, 1996. Sponsored by SGML Belux (Belgian-Luxembourg Chapter of the International SGML Users' Group). Leuven, Belgium: Belux, 1996. Author's affiliation: Fotek NV, Entrepotstraat 3, B-9100 Sint-Niklaas, Belgium. Email: raf@fotek.com.
Abstract: "SGML (Standard Generalized Markup Language) is designed to encode information at the content level, abstracting away from formatting issues. In a well-designed SGML application, font details are not part of the SGML document, and contents may be rearranged or automatically generated. In this light, for professional purposes HTML (HyperText Markup Language) is better considered to be a presentation language for contents that have been stored separately (e.g. in a dedicated SGML environment) than a reliable repository data storage format in itself. This now probably well understood, at least in the SGML community. HTML is tied to a particular, very limited DTD (even ignoring the reality of emerging Microsoft and Netscape dialects), and requires independently specified semantics (i.e. how the Web browser has to interpret the HTML tags)."
"In general, SGML documents will be typeset using informal descriptions of the style semantics for the various elements that occur in a document instance, which requires good communication between the document publisher and the document typesetter. Once a system is set up to process a particular kind of SGML input, that style specification is generally not portable.
"The new ISO standard DSSSL (Document Style Semantics and Specification Language) aims to become the standard way of linking up SGML information containers with their graphical representation, as part of a suite of complementary ISO standards: SGML, HyTime, SPDL (Standard Page Description Language, based in part on the PostScript language from Adobe). This paper will elaborate on the details of how DSSSL achieves this, how it fits into a complete document production process, how detailed the specification is and what it leaves open, what other functionality is available, and why it was worth waiting for."
Available online in HTML format: "DSSSL, the promise FOSI did not fulfill", by Raf Schietekat; [mirror copy]. For further information on the conference, see: (1) the description in the conference announcement and call for papers, and (2) the full program listing, or (3) the main conference entry in the SGML/XML Web Page.
[CR: 19971125]
Schiller, Jörg. "SGML and Development Documentation." Page(s) 159-160 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Project Manager, debis Systemhaus GEI, Ulm, Germany; Email: jschiller@gei-ulm.daimler-benz.com.
Abstract: "Increasing requirements in development documentation for automotive manufacturers such as the number of world-wide development sites (leading to the support of different languages); the speed of the development cycle; and the number of variants of products (re-use of base documentation), has led to the definition of exchange formats for information. This paper examines how SGML technology can be a good solution for problems in this area." [from the published program]
"The development of ECU (electronic control unit) for cars is a highly parallel and complex job. Requirements from different parts of an automotive manufacturer have to be fulfilled. The interfaces between correlated persons are not defined in a way, that an exchange of information is done easily. Business process reengineering activities discovered a big potential for enhancements by using SGML technology as an overall exchange format in these areas.
"Several projects were started, to implement a new process model. This article describes our experiences in projects we realized since the beginning of 1995. The biggest project deals with diagnosis data that is needed to describe parameters to communicate with ECUs. Today you can get many informations about the actual state of internal and external variables of ECUs (for example a coolant temperature). These informations are used to guide a diagnosis process to determine erroneous behaviour of components of a car. The diagnosis data is used in different parts of the company (development, production, service). Even companies that deliver ECUs can be involved in this process. We started a case study to determine the best format for the description of structures. As a result we decided to take SGML. The Document Type Structure (DTD) is presented to the ASAM/ASAP consortium for standardization. This consortium represents the German automotive industry, suppliers and tool companies.
"The system is now in use by 10 to 15 users and will grow in 1997 to 30 to 50 users. There is a process of standardization of diagnosis data in the moment in a consortium called ASAM/ASAP. Our DTD is a proposal to that committee and we think it will be fixed till the end of 1997. Our experience with the technologie SGML are quite good. We can transport the concept very easily to the users."
Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.
[CR: 19971125]
Schmitt-Rennekamp, Walter. "Digital Documentation Trends for Aircraft Maintenance." Page(s) 153-154 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Senior Consultant, Aircraft Maintenance and Engineering Documentation, Lufthansa Systems GmbH, Hamburg, Germany.
Abstract: "The aviation industry has a long tradition for information interchange standardization. The first generation of on-line documentation was a paper document duplicate based on SGML. In the future, documentation has to move from the document paradigm to an information paradigm. Then the user will get an 'Information Web' and exactly the information he is looking for. This presentation looks at the challenges and trends in aircraft maintenance documentation."
"In aircraft maintenance and operations documentation structure and form are well defined by the ATA SPEC 100 specification. It was a good foundation bringing that documentation into electronic form using SGML. SGML is today the foundation for ATA SPEC 2100, the aviation standard for electronic document interchange. [...] Tagged information at a well defined granularity makes incremental revisions easy. Taking advantage of the progress in electronic networks, an on-line document update will be possible and leads to totally new worksharing concepts between aircraft manufacturers product support organization and airline engineering."
Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.
[CR: 19961108]
Schopen, Michael (Dr. med). "Die logische Struktur der ICD-10 (Systematik) und ihre Beschreibung mit SGML [The Logical Structure of ICD-10 (Tabular List) and its Description with SGML]." Informatik, Biometrie und Epidemiologie in Medizin und Biologie 26/2 (1995) 121-133 (with bibliography and 4 pages of figures). Author's affiliation: Deutsches Institut für medizinische Dokumentation und Information (DIMDI) [German Institute for Medical Documentation and Information].
"Zusammenfassung: Das Deutsche Institut für medizinische Dokumentation und Information, DIMDI, ist beauftragt, die amtliche deutschsprachige Ausgabe der ICD-10 herauszugeben. Ausgehend von den Anforderungen an die maschinenlesbare Version der ICD-10 (Systematik) wird die Standard Generalized Markup Language (SGML) vorgestellt als ein Formalismus, mit dem die logische Struktur der Klassifikation beschrieben werden kann. Der im DIMDI für die ICD-10 verfolgte Ansatz - Reduktion auf die logische Struktur und Verzicht auf Layoutinformation - macht die maschinenlesbare Fassung unabhängig von spezifischer Hard- und Software und hält sie offen für unterschiedlichste Anwendungen. Die vorliegende Arbeit beschreibt in SGML die Grobstruktur der ICD-10-Kapitel sowie den Aufbau der Krankheitsklassen und einiger ihrer Elemente. An einem Beispiel wird gezeigt, wie die SGML-Daten für spezielle Anwendungen der Klassifikation restrukturiert werden können."
Abstract: The German Institute for Medical Documentation and Information, DIMDI, has been authorized to publish the official German language edition of ICD-10. [Internationale Klassifikation der Krankheiten = International Classification of Diseases, ICD.] Based on the requirements for a machine-readable version of ICD-10 (Tabular List), SGML - the Standard Generalized Markup Language - is introduced as a formalism to describe the logical structure of this classification. By specifying the mere logical structure and abandoning layout information, DIMDI's concept for ICD-10 makes the machine-readable version independent of any hardware or software and keeps it open to a broad range of applications. This paper uses SGML to describe the structure of ICD-10 - the chapters, the disease categories and some of their elements. An example is given how to rearrange SGML data for specific applications of the classification.
The paper explains DIMDI's approach of processing the ICD-10 data on an SGML basis.
Available in Postscript format on the Internet: ftp://193.174.240.221/pub/klassi/icdsgml.zip. Mirror copies: original from DIMDI; alternate - edited Postscript that worked locally. For more information, see the database entry for DIMDI.
[CR: 19961210]
Schouten, Han. "Documents in Databases." SGML Users' Group Newsletter 15 (January 1990) 8-11. ISSN: 0952-8008. Author's affiliation: Research Center for Technical and Physical Engineering in Agriculture (TFDL), The Netherlands.
The article explains "why documents should be stored in databases rather than in sequential files." [Because:] "Database technology provides us direct access to facts stored in a database. Here too the application-independent logical structure of information determines how we can get access to and process stored facts. The verification of manipulating such information according to its logical structure is, unless explicitly prescribed, not sequence-specific. Therefore, the storage of documents in databases seems to be the correct answer to our requirements of interaction with respect to document processing in the office environment." [extracted]
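The contrast between direct access by logical structure and sequential file access can be sketched in Python with the standard `sqlite3` module. This is only an illustration under stated assumptions: the `element` table, its columns, and the sample document are hypothetical and are not taken from Schouten's article.

```python
import sqlite3


def build_store() -> sqlite3.Connection:
    """Load a tiny sample document into an in-memory table, one row per element."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: each element records its parent, tag, and sibling order.
    conn.execute(
        "CREATE TABLE element ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES element(id),"
        " tag TEXT NOT NULL,"
        " position INTEGER NOT NULL,"
        " content TEXT)"
    )
    rows = [
        (1, None, "report", 0, None),
        (2, 1, "title", 0, "Documents in Databases"),
        (3, 1, "section", 1, None),
        (4, 3, "title", 0, "Motivation"),
        (5, 3, "para", 1, "Sequential files force a full scan."),
        (6, 1, "section", 2, None),
        (7, 6, "title", 0, "Proposal"),
        (8, 6, "para", 1, "Store each element as a row."),
    ]
    conn.executemany("INSERT INTO element VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def section_titles(conn: sqlite3.Connection) -> list:
    """Fetch section titles by structure, without reading the document in order."""
    cur = conn.execute(
        "SELECT t.content FROM element AS t"
        " JOIN element AS s ON t.parent_id = s.id"
        " WHERE t.tag = 'title' AND s.tag = 'section'"
        " ORDER BY s.position"
    )
    return [row[0] for row in cur]


conn = build_store()
print(section_titles(conn))
```

The query names the elements it wants by tag and parent, so the titles are retrieved without scanning the paragraphs that sit between them in document order.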
This article should be read in conjunction with a second article by Han Schouten, "Draft Tender Re: 'Documents in Databases'", also in number 15 of SGML Users' Group Newsletter.
[CR: 19961210]
Schouten, Han. "Draft Tender Re: 'Documents in Databases'." SGML Users' Group Newsletter 15 (January 1990) 12-14. ISSN: 0952-8008. Author's affiliation: Research Center for Technical and Physical Engineering in Agriculture (TFDL), The Netherlands.
A major draft proposal for SGML DSIG sponsored development of a prototype document processing environment in which documents are stored as databases. The environment would support SGML, but also other SGML-related standards like DSSSL -- "as an alternative for the sequential access strategy characteristic of standard SGML." Details on the objectives, tasks, funding, deliverables, rights and duties of participants, project management, (etc.) are described. Proposed tasks include specification of a gross system architecture, definition of modelling techniques, building and verifying semantic equivalence of all models with SGML and DSSSL, facilities for loading SGML DTDs, facilities to unload DTDs without loss of information, creation of a DTD editor, creation of a structured document editor, building of retrieval facilities, building a document formatter.
This document, as a draft tender, is to be read in conjunction with the companion article in issue 15 of the SGML Users' Group Newsletter, "Documents in Databases," also by Han Schouten.
[CR: 19961210]
Schouten, Han. "Meeting of the [SGML] Database SIG." SGML Users' Group Newsletter 15 (January 1990) 11-12. ISSN: 0952-8008. Author's affiliation: Ministry of Agriculture and Fisheries, Research Center for Technical and Physical Engineering in Agriculture (TFDL), Expert Center for Information Technology (ECIT), the Netherlands.
The article is a report on the meeting of the SGML Database SIG on October 26, 1989 at Alphen aan den Rijn, Netherlands. Presentations included: (1) Han Schouten, "The Storage of Documents in Databases at the Ministry of Agriculture and Fisheries"; (2) François Chahuneau -- experiences with the implementation of a document database for production of the Journal of the EEC, in nine languages, with an emphasis upon support for version management; (3) Ian Williams' presentation "An architecture for hypertext object management" -- this presentation focused on GUIDE, IDEX and OWL's hypermedia products in relation to SGML. OWL is researching SGML applications for information retrieval, object indexing and maintenance of database links; (4) Other meeting participants included Lou Burnard, Frank Dros, Harry Gaylord, Jurgen de Jonghe, Jan Maasdam, Hans Mabelis, Jon Maslin, Koen Mulder, and Gert van der Steen.
[CR: 19961210]
Schouten, Han. "SGML*CASE: The Storage of Documents in Databases." SGML Users' Group Bulletin 4/1 (1989) 1-14 (with 5 references). ISSN: 0269-2538. Author's affiliation: Ministry of Agriculture and Fisheries, Research Center for Technical and Physical Engineering in Agriculture (TFDL), Expert Center for Information Technology (ECIT); POB 356, Mansholtlaan 12, 6700 AJ Wageningen, The Netherlands. TEL: +31-8370-19143; FAX: +31-8370-11312.
Abstract: "Despite recent achievements in text editing, desktop publishing, and the hypermedia approach toward information processing, the developments in document processing remain in arrears when compared to data processing. This is highly remarkable, since today 99 percent of all information is still archived as documents on paper."
"Here we analyse the possible causes for this apparent backlog in document processing and the damage it inflicts on office automation. Hitherto the logical structure, the layout, and the presentation of documents have often been insufficiently distinguished. Documents are typically stored and accessed as sequential files. These characteristics strongly remind us of most data processing environments of about twenty years ago. Then, file structures were mainly application-dependent and files could only be processed in batch, because the possibilities for accessing their contents directly were absent. The information systems of those days featured all the bad qualities that most document processing systems feature today; many types of conversion from one application-dependent form to another, loss of information with these conversions, and the practical impossibility of managing stored information as a corporate resource. Conversely, many document processing applications such as document editing, hypermedia applications, and the integrated processing of data and text also require direct access to individual elements of stored documents."
"The logical consequence seems, therefore, to be to devise some application- and device-independent, directly accessible, storage facility for documents and to stimulate developments similar to those that caused data processing to become the success it is today. Building on the results of our analysis, we have made an attempt to store documents in a database and, consequently, have direct access to their structure and contents, maintain information integrity and optimally integrate data and text. A conceptual schema for the storage of documents is proposed here. The obvious advantages of the model are discussed, as well as the topics which remain to be investigated."
See the main entry for the SGML Database Special Interest Group (SGML DSIG/DBSIG) for further information. Note: The volume editor for SGML Users' Group Bulletin 4/1 is David W. Penfold (Edgerton Publishing Services, Huddersfield, UK).
[CR: 19970312]
Schouten, Han. "A Utility for the Combined Use of SGML and Ventura ®." SGML Users' Group Bulletin 3/2 (1988) 27-36 (with 4 appendices). ISSN: 0269-2538. Author's affiliation: TFDL/ECIT [Ministry of Agriculture and Fisheries, Research Center for Technical and Physical Engineering in Agriculture (TFDL), Expert Center for Information Technology (ECIT), The Netherlands].
The author explains a strategy used at the Dutch Ministry of Agriculture and Fisheries for converting SGML documents into Ventura documents for printing. The appendices contain examples of the SGML source code, the conversion scripts, and the corresponding representations in Ventura format.
Note: The volume editor for SGML Users' Group Bulletin 3/2 is Anders Berglund (ISO Central Secretariat, 1 Rue de Varambé, CH-1211 Geneva 20, Switzerland).
[CR: 19971125]
Schreier, Richard A. "Supporting SGML in Document Management Systems." Page(s) 95-101 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Director of Professional Services, Microstar Software Ltd, Nepean, Ontario, Canada; Email: ras@microstar.com.
Abstract: "Most Document Management System architectures can be categorized by the ability to handle and organize information of different kinds. Supporting information based on the Standard Generalized Markup Language (SGML) involves unique requirements that bear on the tasks of managing structured documents."
"This report overviews approaches to support SGML documents in a number of Document Management System architectures that were candidates to be used in an actual publishing system supporting the publishing and re-purposing of shared information for technical manuals. This publishing system supports content- and presentation-oriented SGML documents for a supplier of military equipment to a Canadian Department of National Defence (DND) Project Office."
This paper was originally prepared under a slightly different title by G. Ken Holman (Crane Softwrights Ltd.), formerly the Chief Technology Officer of Microstar Software Ltd. Slides for the related paper are among the collection of slide show presentations from Microstar.
Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.
[CR: 19980413]
Schroeder, Bethany. "HL7 Focuses on XML in New Orleans." XML Files: The XML Magazine Issue 04 (March 17, 1998) 14.
A brief note on the January 1998 meeting of HL7's SGML/XML SIG, with updates on DICOM, KONA, and other health industry standards efforts.
Available online
[CR: 19971206]
Schroff, Thomas; Brüggemann-Klein, Anne. "Grammar-Compatible Stylesheets." Pages 51-88 (with 11 references) in Principles of Document Processing. Proceedings of the Third International Workshop. PODP '96, Third International Workshop. Palo Alto, California. September 23, 1996. Edited by Charles Nicholas (Department of Computer Science and Electrical Engineering, UMBC, Baltimore, MD) and Derick Wood (Department of Computer Science, HKUST, Clear Water Bay, Kowloon, HONG KONG). Lecture notes in artificial intelligence. Lecture notes in computer science, 1293. Berlin / London: Springer-Verlag, 1997. ISBN: 354063620X. Authors' affiliation: [Schroff]: Technische Universität München; [Brüggemann-Klein]: Technische Universität München.
Abstract: "Stylesheets have been used to convert the document type of SGML documents. With a stylesheet, a document conforming to a source grammar can be transformed into a document conforming to a target grammar. The paper discusses the following problem: given a stylesheet, a source and a target SGML grammar, is it decidable whether or not all documents conforming to the source grammar are transformed into documents conforming to the target grammar? Using extended context free grammars we give a decision procedure for this problem."
Seaman, David. "Campus Publishing in Standardized Electronic Formats -- HTML and TEI." Pages xxx-xxx in Filling the Pipeline and Paying the Piper: Proceedings of the Fourth Symposium [November 5-7, 1994, the Washington Vista Hotel, Washington, DC]. Edited by Ann Okerson. Symposium co-sponsored by the Association of Research Libraries and the Association of American University Presses in collaboration with the University of Virginia Library, the Johns Hopkins University Press, and the American Physical Society. Washington, D.C.: Association of Research Libraries, Office of Scientific & Academic Publishing, 1995. ISBN: 0918006252. Author's affiliation: David Seaman is the Director of the University of Virginia Library's Electronic Text Center.
"Introduction: In the past year, HyperText Markup Language (HTML) has done more to popularize the notion of Standard Generalized Markup Language than any single preceding use of SGML. Used on the World Wide Web through a graphical client such as Netscape or NCSA Mosaic, HTML documents and their associated image, sound, and digital video files result in sophisticated network publications and services. And even when viewed through the plain text (VT100) client Lynx, HTML files can still be exciting clusters of interlinked documents.
In common with Internet users all over the world, the University of Virginia Library now uses and produces HTML documents; unlike most other academic institutions, however, we came to HTML with practical experience in another, more sophisticated, form of SGML -- that of the Text Encoding Initiative Guidelines. For two years the Electronic Text Center has been using the TEI Guidelines, through several drafts, to tag and distribute hundreds of electronic texts. The purpose of this paper is both to explain how we are using these various forms of SGML mark-up to publish a variety of documents, and to sound a cautionary note about the wholesale use of HTML as a primary authoring language."
An online version is also available at URL in HTML format, and in (only partially-linked) mirror copy here (May 1995). An abstract of the paper by Mary Mallery is available here.
[CR: 19971018]
Seaman, David. "The Electronic Archive of Early American Fiction (1775-1850)." Page 150 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Author's affiliation: University of Virginia.
[Extract:] "This 125,000-page project takes the University of Virginia Library into a level of archival-quality text and image production rarely seen in rare books archives. In preparing for this project we have tackled issues of funding, production-level digital equipment and practices, partnerships with commercial publishers to disseminate the results, and large-scale storage issues. This paper will outline the project, explain the workflow, equipment, and text and image standards that we think appropriate for creating data of long-term viability, and explore the lessons we are learning (and expect to learn) regarding the economics of undertaking a cost-recovery process. The project will combine high-quality color page images of all 125,000 pages (including covers and spines) with TEI-encoded text versions, allowing scholars all over the world a rare sense of the physical reality of the volumes being studied as well as providing a fully-searchable SGML database."
Abstract available online in HTML format: "The Electronic Archive of Early American Fiction (1775-1850)", by David Seaman; [archive copy]. See the Early American Fiction Home Page, or the main SGML/XML Web Page database entry for The Electronic Archive of Early American Fiction (UVA).
Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.
[CR: 19961217]
Seaman, David. "The Electronic Text Center and On-line Archive of Electronic Texts." Pages 55-57 in Elektronisches Publizieren und Bibliotheken, die Herausforderung neuer Partnerschaften [Electronic Publishing and Libraries, The Challenges of New Partnerships]. [Conference] 'Elektronisches Publizieren und Bibliotheken'. Bielefeld, Germany. February 5-7, 1996. Frankfurt am Main, Germany: Vittorio Klostermann, 1996. Author's affiliation: University of Virginia, Electronic Text Center.
Abstract: "The Electronic Text Center is both a physical space within the university library, open to all University of Virginia members and also an on-line collection of many thousands of Internet-accessible texts and images. It is important to us that we perform two tasks simultaneously in order to build our digital library: we are both creating a set of electronic resources, and also creating a user community for it, by training our users to become effective consumers and producers of electronic texts and images. Since 1992, the Etext Center has made available hardware and software for the creation and analysis of electronic texts; it provides training for these new tools and techniques; it acts as a focal point for HTML and SGML development in the humanities at Virginia; and it provides a place in which to use those texts that are not yet accessible on the Internet."
The document is available online: in HTML format; [mirror copy]. See also the main entry for the UVA Electronic Text Center.
Seaman, David M. "From Margin to Mainstream: Creating a Broad-Based Humanities Computing User Community at the University of Virginia." Pages 213-214 [partial abstract] in Colloque International "Consensus ex Machina?". Abstracts International Joint Conference of the ALLC (Association for Linguistic and Literary Computing) and ACH (Association for Computers and the Humanities), Sorbonne, Paris, 19-23 avril 1994. Paris: Laboratoire "Lexicométrie et textes politiques" (INaLF, CNRS), and Ecole Normale Supérieure de Fontenay - Saint Cloud, 1994. 244 pages. Author Affiliation: University of Virginia.
[CR: 19950716]
Seaman, David. "Gate-Keeping A Garden of Etext Delights: Electronic Texts and the Humanities at the University of Virginia Library." Pages 63-67 in Gateways, Gatekeepers, and Roles in the Information Omniverse: Proceedings of the Third Symposium. Third [ARL] Symposium, Washington Vista Hotel, Washington, DC, November 13-15, 1993. Edited by Ann Okerson and Dru Mogge. Washington, D.C.: Association of Research Libraries, 1994. Author's affiliation: University of Virginia.
The paper discusses the use of SGML by the Electronic Text Center. "All the electronic texts are encoded with Standard Generalized Markup Language (SGML). The large-scale electronic text databases -- the OED, the Chadwyck-Healey items -- come fully marked up, and increasingly we are seeing producers of individual titles (such as Oxford University Press) also offering them in SGML form. The SGML markup not only means that texts can be added together in conglomerations but also that the data, with all its structural and typographic information, is not inherently wedded to a piece of software. It is, in a real sense, data that will outlive the software we currently use to explore and present it."
Available online from the UVA WWW server.
Seaman, David M. "'A Library and Apparatus of Every Kind': The Electronic Text Center at the University of Virginia." Information Technology and Libraries 13/1 (March 1994) 15-19. 1 reference. Author affiliation: Coordinator of Electronic Texts, University of Virginia Library, Charlottesville, VA.
Abstract: The Electronic Text Center at the University of Virginia combines an online archive of thousands of SGML-encoded electronic texts, all available through a single piece of search software, with a library-based center housing hardware and software suitable for the creation and analysis of text. Through ongoing training sessions and support of individual teaching and research projects, the Center is now building a diverse and expanding user community locally, and providing a potential model for similar enterprises at other institutions.
[CR: 19950716]
Seaman, David. "The University of Virginia's Electronic Text Center: An Interview with David Seaman." Virginia Librarian 39/2 (April/May/June) 6-10 (with sidebar: "Standard Generalized Markup Language"). Author's affiliation: David Seaman is Director of the Electronic Text Center, Alderman Library, University of Virginia, Charlottesville, Virginia.
"We are also concerned to maintain our on-line data in a standard tagged format -- known as SGML, or Standard Generalized Mark-up Language -- that will ensure that the electronic texts, with all their typographic, spatial, and structural instructions, will outlive the software we currently use to search and display them. . .The texts in our on-line collection are marked up with SGML tags that use letters and phrases within angled brackets to convey such information as structural divisions -- title page, main body of text, scene, stanza, page, paragraph, etc. -- and typographical elements -- changes in typeface, special characters, etc. . ."
Available online: http://www.lib.virginia.edu/etext/articles/VirgLib/virglib.html from the UVA WWW server.
[CR: 19971024]
Seaman, David. "The User Community as Responsibility and Resource: Building a Sustainable Digital Library." D-Lib Magazine (July/August 1997). ISSN: 1082-9873. Author's affiliation: Electronic Text Center, University of Virginia.
Summary: "Since opening as a full-time service in 1992, the Electronic Text Center at the University of Virginia Library has pursued twin missions with equal seriousness of purpose: (1) to create an on-line archive of SGML texts; (2) to build a community of humanists adept at the creation and use of online full-text resources. . . this article will focus on the integral place that our user community has in shaping the work of our library-based Etext Center."
See the Web site for The Electronic Text Center.
The article is available online in HTML format; local archive copy. Note that the July/August 1997 double issue of D-Lib Magazine (Amy Friedlander, editor) contains several articles referencing the use of SGML encoding in digital library research.
[CR: 19971201]
Selber, Stuart A. "First Commentary. The OHCO Model of Text: Merits and Concerns." Journal of Computer Documentation 21/3 (August 1997) 26-31 (with 21 references). ISSN: 0731-1001. Author's affiliation: Technical Communication and Rhetoric Program, Department of English, Box 43091, Texas Tech University, Lubbock, Texas 79409-3091; Email: selber@ttu.edu; WWW: http://english.ttu.edu/faculty/selber/vitae.html.
Abstract: "The author discusses the ordered hierarchy of content object (OHCO) model for text representation on the computer. [I have a concern about the OHCO model of text...] Although the model has explanatory power computationally, the way it defines what a text is, what a writer is, and what a reader is may serve to diminish, in potentially damaging ways, what is involved in the processes and practices of technical communication. But before discussing his concerns with the OHCO model of text, the author considers some of its merits, because he would not dismiss the model as invaluable to students and professionals. At times he found the model quite compelling, particularly in its focus on how text can be both productively and unproductively represented in online information space. [In fact, I plan on including this article in a graduate-level course I teach in technology and discourse.]"
The article is a response (commentary) on the publication of DeRose (et al), "What is Text, Really?" reprinted from Journal of Computing in Higher Education 1/2 (Winter 1990) 3-26.
This article appeared with four others in a special issue of JCD which focused upon 'the OHCO model of text [ordered hierarchy of content objects]'. The Journal of Computer Documentation (JCD) is a quarterly publication of the Association for Computing Machinery, Special Interest Group on Systems Documentation [SIGDOC], published by the Association for Computing Machinery. Editor in Chief: Tony R. Girill, Lawrence Livermore National Laboratory and University of California.
[CR: 19960716]
Sengupta, Arijit. "Demand More from Your SGML Database! Bringing SQL Under the SGML Limelight." <TAG>: The SGML Newsletter 9/4 (April 1996) 1-7, with 11 references. ISSN: 1067-9197. Author's affiliation: PhD candidate at Indiana University, Department of Computer Science.
"Abstract: Have you ever been frustrated by how inadequate SGML databases are in terms of searching or querying your documents? With the current state of the art, you will easily be able to search for a word, phrase, or keywords in the whole document. Some systems let you perform approximate searches or regular expression searches. Even fewer systems let you search for keywords or phrases in certain SGML regions. However, there is much more information already in SGML documents that one can utilize cleverly to design a proper SGML database system. The current trend of modeling SGML documents with object-oriented and object-relational databases has certainly brought SGML closer to a complex object database model, but much research and development remains to be done in this area. This article introduces the popular relational database query language SQL (Structured Query Language) and its applicability in the SGML domain. The capability of this query language to express complex queries with a not-so-complex syntax gives relational databases that support SQL an advantage over other similar systems. The ability to use SQL or an SQL-like query language with SGML has the potential of giving much more power to SGML repositories. This article shows how we can pose complex document-related questions easily with SQL. SQL-capable systems will let you solve problems that would otherwise seem impossible, or at least, tedious."
The author believes that SQL ought to be implemented more completely in SGML systems, as it supplies a widely accepted and powerful language for expressing queries -- many of which are difficult to express in current SGML systems.
Available online in postscript format; [mirror copy]
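[Illustration:] The kind of region-restricted query the abstract above alludes to can be sketched in a few lines. This is only a minimal sketch under stated assumptions, not the SQL extensions Sengupta proposes: it uses plain SQL over a hypothetical `element` table, Python's standard `sqlite3` and `xml.etree.ElementTree` modules, and a small well-formed XML sample standing in for an SGML instance. The table layout, tag names, and the function names `load` and `titles_of_sections_mentioning` are all invented for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; well-formed XML stands in for SGML here.
SAMPLE = """<article>
  <section><title>Storage</title><para>Documents are kept as trees.</para></section>
  <section><title>Retrieval</title><para>A query may target one region.</para></section>
</article>"""


def load(conn, xml_text):
    """Store every element as one row that points at its parent row."""
    conn.execute(
        "CREATE TABLE element ("
        "id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT, content TEXT)"
    )

    def walk(node, parent_id):
        cur = conn.execute(
            "INSERT INTO element (parent, tag, content) VALUES (?, ?, ?)",
            (parent_id, node.tag, (node.text or "").strip()),
        )
        node_id = cur.lastrowid  # capture before recursing into children
        for child in node:
            walk(child, node_id)

    walk(ET.fromstring(xml_text), None)


def titles_of_sections_mentioning(conn, word):
    """Titles of sections that have a para child containing `word`."""
    rows = conn.execute(
        """
        SELECT t.content
        FROM element AS s
        JOIN element AS t ON t.parent = s.id AND t.tag = 'title'
        JOIN element AS p ON p.parent = s.id AND p.tag = 'para'
        WHERE s.tag = 'section' AND p.content LIKE ?
        ORDER BY s.id
        """,
        ("%" + word + "%",),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
load(conn, SAMPLE)
result = titles_of_sections_mentioning(conn, "query")
print(result)
```

The self-joins on `parent` are what restrict the keyword match to a particular structural region; a flat full-text search over the whole document could not distinguish a match in a paragraph from a match in a title.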
Sengupta, Arijit. Design and Implementation of a Database Environment for the Manipulation of Structured Documents. 1993. Extent: 30 references.
"Abstract: A method for implementing a structured document database system is presented. The present-day systems dealing with structured or tagged documents have not been able to produce capabilities that even simple database systems possess - the ability to query the database based on the various properties of the database. Research in this area also has not been able to produce query languages and visual query interfaces similar to those that exist in the relational domain. The goal for the present research is to develop a complete database system for structured documents having data definition, manipulation and querying capabilities similar to those in the relational world. Only structured documents tagged with the SGML [13] have been considered, in which detailed and complete information about the document structuring can be obtained from the Document Type Definition (DTD). Special systems that have been considered, used and evaluated are PAT (Open Text 5.0) [22], sgmls 1.1, Exodus [28], Shore [6] for purposes of data structures, parsing, data storage and retrieval, etc. Special considerations have been given to three special cases of data for experimentation purposes: (a) the Oxford English Dictionary (OED) database, (b) the Chadwyck Healy English Poetry full-text database, and (c) an experimental movie database." [from the online text]
Available online at URL http://www.cs.indiana.edu/hyplan/asengupt/thesis/oral/oral.html. [further details are requested from the author]
[CR: 19961226]
Sengupta, Arijit. "Standardizing the Querying Process with SGML The SQL DTD." Pages 323-338 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Indiana University, Computer Science Department, Lindley Hall 215, Bloomington, IN 47405, USA; Tel: 812 855 4318; Fax: 812 855 4829; Email: asengupt@indiana.edu; WWW: http://www.cs.indiana.edu/hyplan/asengupt.html.
Abstract: "One of the most exciting applications of SGML which has emerged in the recent years is its use in document databases. The structural information embedded in SGML documents makes it possible to query SGML documents and extract information in an automatic manner; however, this querying process has not been standardized. As a result, different SGML database implementations use their own query language syntax, thus making the migration from one system to another a difficult process. In the relational database domains, however, the query language SQL has been a standard for over ten years and is universally used in most relational database systems. Although originally designed for relational databases, SQL is quite powerful for specifying complex queries in a relatively easy-to-understand syntax. With a small set of extensions to take advantage of the hierarchical structure of SGML, SQL can be easily adapted for use with SGML document databases (TAG-496).
The powerful 'generalized' nature of SGML makes it easy to implement SQL as an SGML DTD, so that queries can be expressed as document instances of the SQL DTD. Current SGML authors and users can write queries expressed in this DTD without learning a different language or using a separate editor. Moreover, because of the portable nature of SGML, these queries can be used in any SGML database system and can be converted to regular SQL for use in a relational or Object-Relational/Object-Oriented database system, if necessary. Databases that support the SQL DTD can also store the queries without any extra effort, and subsequently query them for inferring optimization parameters.
This paper presents a representative DTD for the SQL query language, with extensions for use with hierarchically structured documents. It also compares this language with languages proposed and implemented, including SDQL - the query language in the DSSSL standard (DSSSL95). This paper explains the advantages of using this language as a query language in document database systems and the necessity for standardizing the querying process in document databases. Finally, it discusses some implementation issues and complexity measures."
Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Available in postscript format, or SGML format; [mirror copy, postscript]
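[Illustration:] The idea in the abstract above, that a query can itself be a marked-up document instance and be converted to regular SQL, can be sketched as follows. This is a hedged sketch only: the element and attribute names (`query`, `select`, `column`, `from`, `table`, `where`, `condition`) are hypothetical and are not taken from the SQL DTD presented in the paper, well-formed XML stands in for SGML, and the function `to_sql` is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical query instance; the vocabulary is assumed, not the paper's DTD.
QUERY_DOC = """<query>
  <select><column>title</column><column>year</column></select>
  <from><table>poems</table></from>
  <where><condition column="author" op="=" value="Blake"/></where>
</query>"""

ALLOWED_OPS = {"=", "<", ">", "<=", ">=", "<>"}


def to_sql(xml_text):
    """Convert a query document into (sql_string, parameter_list)."""
    root = ET.fromstring(xml_text)
    columns = [c.text for c in root.findall("./select/column")]
    tables = [t.text for t in root.findall("./from/table")]
    sql = "SELECT " + ", ".join(columns) + " FROM " + ", ".join(tables)

    clauses, params = [], []
    for cond in root.findall("./where/condition"):
        op = cond.get("op")
        if op not in ALLOWED_OPS:
            raise ValueError("unsupported operator: %r" % op)
        # Values become bound parameters rather than inline literals.
        clauses.append("%s %s ?" % (cond.get("column"), op))
        params.append(cond.get("value"))
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


sql, params = to_sql(QUERY_DOC)
print(sql, params)
```

Because the query is an ordinary document instance, it could be validated, stored, and edited with the same tools as any other marked-up document, which is the portability argument the abstract makes.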
[CR: 19970817]
Sengupta, Arijit; Dillon, Andrew. "Extending SGML to Accommodate Database Functions: A Methodological Overview." Pages 629-637 (with 27 references) in Structured Information/Standards for Document Architectures. Edited by Elisabeth Logan and Marvin Pollard. = Journal of the American Society for Information Science, Special Issue. Volume 48, Number 7 (July 1997). New York: John Wiley & Sons Inc., 1997. ISSN: 0002-8231. Authors' affiliation: [Sengupta:] Computer Science, Indiana University, Bloomington, IN; Email: asengupt@indiana.edu, WWW: http://www.cs.indiana.edu/hyplan/asengupt.html; [Dillon:] School of Library and Information Science, Indiana University, Bloomington, IN; Email: adillon@indiana.edu, WWW: http://www-slis.lib.indiana.edu/adillon/adillon.html.
Abstract: "A method for augmenting an SGML document repository with database functionality is presented. SGML (ISO 8879,1986) has been widely accepted as a standard language for writing text with added structural information that gives the text greater applicability. Recently there has been a trend to use this structural information as meta-data in databases. The complex structure of documents, however, makes it difficult to directly map the structural information in documents to database structures. In particular, the flat nature of relational databases makes it extremely difficult to model documents that are inherently hierarchical in nature. Consequently, documents are modeled in object-oriented databases (Abiteboul, Cluet, & Milo, 1993), and object-relational databases (Hoist, 1995), in which SGML documents are mapped into the corresponding database models and are later reconstructed as necessary. However, this mapping strategy is not natural and can potentially cause loss of information in the original SGML documents. Moreover, interfaces for building queries for current document databases are mostly built on form-based query techniques and do not use the 'look and feel' of the documents. This article introduces an implementation method for a complex-object modeling technique specifically for SGML documents and describes interface techniques tailored for text databases. Some of the concepts for a Structured Document Database Management System (SDDBMS) specifically designed for SGML documents are described. A small survey of some current products is also presented to demonstrate the need for such a system."
A Postscript version of the article is available online (also, online abstract); [local archive copy].
See the main document entry for the complete list of articles and contributors, as well as other bibliographic information.
[CR: 19970627]
Sengupta, Arijit; Dillon, Andrew. Query By Templates: A Generalized Approach for Visual Query Formulation for Text Dominated Databases. Technical Report. To appear in the Proceedings of the Conference on Advanced Digital Libraries (ADL'97), Library of Congress, Washington, D.C. May 7-9 1997. []: [], May 1997. Extent: approximately 13 pages.
"Abstract: With the advent of the World Wide Web (WWW), the concept of document databases is becoming more popular. This makes the idea of a globally distributed digital document library realizable. The standard encoding format for the WWW is HTML (HyperText Markup Language), which embeds some structural information in otherwise text-dominated documents. HTML can be viewed as a special instance of SGML (Standard Generalized Markup Language), a very powerful document encoding language capable of describing many different types of languages and formats. The current work is based on designing query languages, processing and visualizing mechanisms for structured documents in general, and SGML documents in particular. We are using the World Wide Web as a platform for this querying mechanism, especially because of its popularity and world-wide availability. However, because of the wide range of users, these systems need to be easy to use. In particular, it is important that users can easily search for information from the database without prior knowledge of the internal structure of the database. This paper outlines a visual query constructing technique for application in databases containing hierarchically structured documents. In this paper, we describe the visual component of this query language, which is essentially a generalization of the Query By Example (QBE) language for relational databases. We call this method 'Query By Templates (QBT)'. Further, we describe the basic properties and usefulness of this visual query technique, and show how queries on structured document databases can be performed using this method. We also describe an implementation of QBT on the Web using the Java{TM} programming language."
Available online in postscript format; [mirror copy].
[CR: 19951122]
Severson, Eric. The Art of SGML Conversion: Eating Your Vegetables and Enjoying Dessert. Avalanche Development Corporation/Interleaf, January 1995. 34K (computer file), ca. 15 pages. Author's affiliation: Executive Vice President, Avalanche Development Corporation [Interleaf]; email: eric@avalanche.com; Tel: (303) 449-5032.
"SGML conversions have a reputation for being worthwhile but not necessarily lots of fun. Much like the problem of having to eat your vegetables before you get dessert.
"SGML conversion typically involves building a bridge between the world of hardcopy and word processing documents (where logical structure is perceived visually by the reader) and "intelligent" documents (where logical structure is explicitly encoded). The whole point of SGML conversions is that they necessarily involve information enrichment, adding more than was originally there.
"This white paper explores the issues involved in moving to SGML and offers advice for making the process as effective and painless as possible. It demonstrates how the steps in the SGML conversion process are directly related to the benefits you get once conversion is complete."
Available on the Internet from the Interleaf/Avalanche WWW server: "The Art of SGML Conversion" [mirror copy November 1995]. Apparently also to be available as an SGML Open White Paper, #4001-II.
Severson, Eric. How SGML and HTML Really Fit Together: A Case for a Scalable HTML. Avalanche Development Corporation/Interleaf, January 1995. 24K (computer file), ca. 8 pages. Author's affiliation: Avalanche Development Corporation/Interleaf; email: eric@avalanche.com; Tel: (303) 449-5032.
Note: Version 2 (April 1995) is available from this WWW server.
This (white) paper was distributed on Newswire, and is available as item 143.1995-01-09 in the Newswire archives, or here. Discussion of the paper took place on the sgml-internet discussion list.
[CR: 19971227]
Severson, Eric. "The Proper Role of SGML and XML in an Enterprise I/T and Intranet Strategy." Pages 513-518 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Eric Severson]: IBM Global Services.
Abstract: "Up to now SGML has tended to be used primarily in technical publishing applications, usually at a departmental level. However, with today's focus on web-based enterprise information management, and the recent introduction of XML, many more opportunities for SGML have become apparent. This whitepaper surveys the current state of the information industry, from both a business and technical point of view, and shows how SGML and XML technology can and should be positioned within an organization's overall I/T and intranet strategy."
This paper was delivered as part of the "Business Management" track in the SGML/XML '97 Conference.
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).
[CR: 19960904]
Severson, Eric; Bingham, Harvey (editors). Table Interoperability: Issues for the CALS Table Model. SGML Open Technical Research Paper 9501:1995. Coraopolis, PA: SGML Open, November 21 1995. Extent: approximately 25 pages. Authors' affiliation: [Eric Severson] Co-chair, Table Interchange Subcommittee, SGML Open; [Harvey Bingham] Co-Chair, Table Interchange Subcommittee, SGML Open; also: Interleaf.
"Abstract: To help address the existing interoperability issues when using tabular material ("tables") in SGML implementations, SGML Open's Technical Committee formed a Table Interchange subcommittee to research these issues.
"Because the CALS table model has proliferated widely, it was chosen as the initial starting point. Although it has evolved to the point of a de facto standard, the specification leaves a large number of semantics open to interpretation which in turn has made interoperability difficult to achieve. As its first major task, the Committee therefore set out to identify and document ambiguities in the CALS table model specifications, identify and document related interoperability issues between SGML Open vendor products, and lay the groundwork for developing a proposed clarification of the standard that will minimize ambiguity and maximize interoperability.
"This paper summarizes the results of this initial work, identifies the sources of current interoperability issues for the CALS model, and summarizes the most common set of practices currently followed by SGML Open vendors."
Available in HTML format: SGML Open - TRP 9501:1995 - "TABLE INTEROPERABILITY: Issues for the CALS Table Model" [mirror copy, December 28, 1995]. Also available from the FTP server at Exoterica Corporation in compressed Postscript format ftp://ftp.exoterica.com/sgmlopen/9501/9501.ps.Z, [mirror copy] or in other formats (files: 9501pack.tar.Z, 9501pack.zip, 9501ps.zip). Document revisions: Technical Research Paper 9501:1995; Committee Draft: 1995 May 10; Committee Draft: 1995 August 5; Final Draft Technical Research Paper: 1995 September 15; Final Technical Research Paper: 1995 November 21.
[CR: 1995]
Sévigny, Martin. Conception et réalisation d'une interface-utilisateurs pour l'interrogation de bases de documents structurés. Travail dirigé présenté à la Faculté des études supérieures de l'Université de Montréal. Quebec: Faculté des études supérieures de l'Université de Montréal, 1996. Advisor: Yves Marcoux. Affiliation: École de bibliothéconomie et des sciences de l'information (EBSI), de l'Université de Montréal, Québec, Canada.
See the summary for the thesis in French: Conception et réalisation d'une interface-utilisateurs pour l'interrogation de bases de documents structurés. Travail dirigé présenté à la Faculté des études supérieures de l'Université de Montréal, par Martin Sévigny. Pour l'obtention de la Maîtrise en bibliothéconomie et en sciences de l'information de l'EBSI.
[CR: 19970531]
Sévigny, Martin; Marcoux, Yves. "Conception et réalisation d'une interface-utilisateurs pour l'interrogation de bases de documents structurés [The Creation and Evaluation of a Human-computer Interface for Information Retrieval in Structured-document Bases, in French]." Revue canadienne des sciences de l'information et de bibliothéconomie [Canadian Journal of Information and Library Science] 21/3-4 (September-December 1996) 59-77 (with 23 references). Authors' affiliation: École de bibliothéconomie et des sciences de l'information (EBSI), de l'Université de Montréal, Québec, Canada. WWW [Sévigny]: http://www3.sympatico.ca/msevigny/; WWW [Marcoux]: http://tornade.ere.umontreal.ca/~marcoux/.
Abstract: "The creation of electronic information in the form of structured documents is steadily gaining popularity. It is thus necessary to develop information retrieval tools fitted to this type of document. In this article, we present the results of a research project aimed at identifying human-computer interface elements that can support information retrieval in structured document bases. The research included a review of the literature and of existing systems, as well as the design, development, and user testing of a prototype information retrieval system for SGML (ISO 8879) document bases. We make five recommendations for the design of structured-retrieval systems."
See the summary for the thesis in French: Conception et réalisation d'une interface-utilisateurs pour l'interrogation de bases de documents structurés. Travail dirigé présenté à la Faculté des études supérieures de l'Université de Montréal, par Martin Sévigny. Pour l'obtention de la Maîtrise en bibliothéconomie et en sciences de l'information de l'EBSI.
[CR: 19960313]
Seybold, Jonathan. "Remembering Yuri Rubinsky (1952-1996) [In memoriam: Yuri Rubinsky]." Seybold Report on Publishing Systems 25/11 (February 29, 1996) 20. ISSN: 0736-7260.
The full-page article includes a photograph of Yuri Rubinsky. He was familiar to Seybold readers mostly in connection with his role as cofounder of SoftQuad, Inc. The article summarizes the highlights of Yuri's career in the publishing industry, including annotation of his major publications. It also includes eulogies from several friends and colleagues (Jonathan Seybold, Charles Goldfarb, Tim Berners-Lee, Chet Ensign, Tim Bray, Tommie Usdin, Elaine Brennan, and Pam Gennusa). For other memorial tributes to Yuri, see the collection elsewhere in this database.
[CR: 19971008]
[Seybold Publications Staff]. "Inso Adds Math to DynaWeb. IEEE Uses it to Go Live with Online Digital Library." Seybold Report on Internet Publishing 2/2 (October 1997) 28. ISSN: 1090-4808.
"Inso recently beefed up the math support in DynaWeb, its SGML-to-Web publishing system, by enabling Web browsers to display mathematical equations stored in DVI (TeX) format."
Representing and rendering math has always been a special challenge. The DynaWeb technology described in the article is featured in the IEEE (Institute of Electrical and Electronics Engineers, Inc.) Computer Society Digital Library (CSDL). The online database introduction explains: "For those interested in the publication technology, we have created a database of SGML files and linked images. These files are converted and displayed as HTML on the fly. This allows subscribers to manipulate and view the content -- including math -- with standard web browsers without any helper applications or plug-ins." According to the Seybold article, the online IEEE collection is an "SGML-encoded text library [which] offers the equivalent of 35,000 periodical pages and more than 250,000 images. With the new [DynaBase] 3.1 software, the Web database automatically converts DVI math equations into GIF images on the fly as articles are served to visitors...DynaWeb is the first commercial product to generate GIFs from DVI math equations on the fly."
Note: DynaWeb has been chosen by SOFTBANK NetForums as part of a dynamic Web publishing solution. See the press release, [archive copy].
[CR: 19971008]
[Seybold Publications Staff]. "Microsoft, Inso, ArborText Propose Style Sheet Language for XML." Seybold Report on Internet Publishing 2/2 (October 1997) 19. ISSN: 1090-4808.
The article describes the proposal for a style sheet language XSL (Extensible Style Language), comparing it to CSS and DSSSL. See the database entry for Extensible Stylesheet Language (XSL) for further description.
[CR: 19970329]
[Seybold Publications Staff]. "Netscape Replies on XML [editorial]." Seybold Report on Internet Publishing 1/5 (January 1997) 2. ISSN: 1090-4808.
The article summarizes a response from Netscape Communications Corporation to the Seybold offices regarding Netscape's disposition on XML (Extensible Markup Language). Netscape clarifies that it is not currently [ca. January 1997] working on SGML or XML for two reasons: (a) customers have spoken in favor of HTML over SGML, so Netscape believes these customers are better served by improving HTML functionality than by adopting XML; (b) XML can theoretically be supported indirectly via XML-compliant Netscape plugins: "...it is possible and quite straightforward to incorporate SGML-based layout engines into the Netscape Navigator and Netscape Communicator environments using inline plug-ins or Java. This makes it possible for the 40 million users of Netscape Navigator to access XML information today."
According to the article, Alex Edelstein, Netscape's Group Product Manager, also expressed the opinion that the need to deal with user-defined tags and styles (as XML does) could be met via JavaScript: "...JavaScript is rapidly being accepted as a great means to provide user-defined flexibility. User-defined variables can be created dynamically and passed between client and server in this way."
Editorially, the article expresses doubt as to whether Netscape is "up to speed on publishing issues," at least, with respect to Seybold's readership: "that product managers at Netscape think HTML somehow will work as a way to define all the Web documents that we need, or that SGML's main benefit is better layout, illustrates just how far behind Netscape will be should Microsoft decide to leverage its expertise in SGML and XML."
See more on the Extensible Markup Language in the SGML/XML Web Page main entry.
[CR: 19971121]
[Seybold Publications Staff]. "Seybold San Francisco '97: PDF and XML Emerge. [Alternate title: 'Shaping the Future: PDF, XML and the Men of the Hour, Gates and Jobs']." Seybold Report on Publishing Systems 27/5 (November 17, 1997) 1, 3-38. ISSN: 0736-7260.
"By most calculations, the two areas of sharpest focus at Seybold San Francisco 97 were PDF, which increasingly is moving into a role as the format for production workflow, and XML, which is picking up support as a standard for tagging documents intended for use in multiple-media environments." Sections in this issue of Seybold Report on Publishing Systems covering SGML/XML include: "Asset Management, SGML and Database Publishing" (pages 29-38), beginning with: "The boundaries between asset management and document management are starting to blur. So are the boundaries between SGML publishing tools and database-publishing tools." The section "Publishing with SGML" (pages 33-36) provides an update on ArborText's Willow and XML support, Chrystal's Astoria 3.0, Datalogics' Documentum-to-Frame solution, I4I S4-Desktop, PIT's Target 2000, and XyVision's support for FrameMaker+SGML.
[CR: 19971120]
[Seybold Publications Staff]. "XML, Collaborative Tools Shine at Seybold San Francisco '97 [alt. title: 'XML, Content Management Take Center Stage at SSF '97']." Seybold Report on Internet Publishing 2/3 (November 1997) 1, 3-19. ISSN: 1090-4808.
Abstract: This "Special Report" feature article in Seybold Report on Internet Publishing describes the rapidly-changing world of Internet tools and standards that bear on Web publishing, and particularly, the role of XML within the W3C's suite of Internet recommendations. "Seybold San Francisco '97 will be remembered as the first major conference and trade show where XML entered the mainstream vocabulary. It was the buzz of the conference and a draw on the show floor. The demo of XML support in Internet Explorer 4 was one of the highlights of Bill Gates's keynote address on Wednesday."
[CR: 19971120]
[Seybold Publications Staff]. "XML Comes into the Limelight." Seybold Report on Internet Publishing 2/3 (November 1997) 4-5. ISSN: 1090-4808.
Summary: The article describes the growing support for XML, as evidenced in the SSF presentations by Bill Gates (Microsoft), John Warnock (Adobe), and John Gage (Sun Microsystems). Bill Gates is quoted as saying "XML is important because you won't be able to afford to author for all of the screen form factors and interface techniques." Steve DeRose (chief scientist at Inso) is quoted as saying that the 'quiet revolution' (SGML now emerging in XML) "is no longer quiet, but boisterous, productive, and growing at Web speed."
This is a subsumed article in the longer feature article of the Special Report in this issue: "XML, Collaborative Tools Shine at Seybold San Francisco '97."
[CR: 19961213]
[Seybold Staff]. "W3C Publishes Draft of Simplified SGML. XML Allows User-definable Tags." Seybold Report on Publishing Systems 26/6 (November 30, 1996) 41. ISSN: 0736-7260.
"On the tenth anniversary of the adoption of SGML as an ISO standard, a band of SGML experts announced they have drafted a simplified subset of the language they hope will spur the use of SGML on the Internet. The new language, Extensible Markup Language, or XML, was prepared by a World Wide Web Consortium working group consisting of about 80 members, primarily representing vendors. The announcement was made at SGML '96, being held in Boston this week. The first published draft is available on the Web at http://www.w3.org/pub/WWW/TR/WD-xml-961114.html. XML, like SGML, is a meta-language for describing the markup of different types of documents. It is simpler than SGML, reducing a 500-page reference to 26 pages. Unlike HTML, which has a fixed (albeit changing) set of tags, XML lets you define your own tags and attributes." [extracted] See the main entry for XML in the SGML/XML Web Page for additional information.
[CR: 19961113]
Seybold Publications. Seybold San Francisco '96. Part III: Color Publishing, Page Composition and Hardware. Seybold Special Report, Volume 5, Number 4. Media, PA: Seybold Publications, October 28, 1996. ISSN: 1069-7217.
The Seybold San Francisco '96 Conference was held at the Moscone Convention Center, San Francisco, September 13-17, 1996. The Seybold Special Report Series (3 parts) covered the conference. Part III of the Seybold Special Report 5/4 includes a section entitled "Page Layout Software, SGML Systems and Other Aids to Publishing" (pages 31-39). Featured SGML software systems include: (1) "Document management for ArborText Adept" [The 'Willow Initiative' which places software between the editor and the document manager for the purpose of managing small document 'objects']; (2) "Autographics tackles library automation"; (3) "I4I offers on-the-fly SGML conversion" [delivering SGML documents to users lacking SGML software]; (4) "Microstar pursues 'Mainstream SGML'" [marketing initiative with Documentum, InfoAccess and Adobe to help SGML penetrate business environments by "making it simple for authors to create and maintain SGML-savvy documents"]; (5) "Passage Systems shows custom system" [PassageNet]; (6) "Xyvision sees market in telecommunication" [TEDD and TIM DTDs]. In Parts I and II of the SSF '96 report, coverage is given to HTML/SGML products for Web publishing (SoftQuad HIP and HoTMetaL; Electronic Book Technologies' DynaText (Matterhorn), DynaBase, and DynaWeb 3.0); the other volume titles are: Part I: Overview of the Show and Publishing on the Internet, and Part II: Output Technology and Workflow Developments.
Seybold Publications. Seybold Seminars Boston '95 [March 28-31, 1995. Hynes Convention Center, Boston, MA]. Part I: Electronic Delivery, SGML Issues, Catalogs and Output. Seybold Special Report, Volume 3, Number 8. Media, PA: Seybold Publications, April 21, 1995. ISSN: 1069-7217.
SGML was a major theme at Seybold Seminars once again, and details are available in the two Special Report issues. Part II is less relevant, being dedicated to images and color (Volume 3, Number 9: Seybold Seminars Boston '95 [March 28-31, 1995. Hynes Convention Center, Boston, MA]. Part II: Managing Color; Image Input, Editing and Output; Page Makeup, Etc.). The issue title for Part I includes "SGML", which is becoming more popular in light of widespread acquaintance with HTML. The volume Table of Contents for Part I (much abbreviated) is: I. Introduction. Electronic Publishing: Moving Past Fear and Greed to Commercial Realities (3-7); II. Publishing on the Internet: Strategies and Tools (8-28); A. Tools for Creating Web Pages; B. Other Electronic Delivery Tools [EBT Deals with Phoenix]; III. New Tools for Managing and Writing SGML Documents (29-39); A. SGML-Based Document Management Tools; B. SGML Authoring Tools; IV. Catalog Production Systems (40-43); V. Imagesetters, Platesetters and Digital Presses (44-66).
The section "Tools for Creating Web Pages" includes the following major presentations: (1) InContext's Spider [Web authoring program based upon InContext 2 SGML editor]; (2) NaviSoft [HTML and Webs authoring]; (3) Archetype [HTML viewer supporting multiple views]; (4) SoftPress' Uniqorn; (5) SoftQuad HoTMetaL [support for HTML 3.0] and SoftQuad Panorama [SGML browsing over the Internet]; (6) Electronic Book Technologies [DynaBase, SGML to HTML conversion on the fly]; (7) Open Text [WWW Indexing].
The section "Other Electronic Delivery Tools" includes a significant story under the title "EBT Deals With Phoenix" (pages 21-23; see also the graphic of the virtual digital library on page 7). Phoenix Publishing Systems Inc., a spinoff company from Phoenix Technologies, which produces documentation for some 40% of PCs shipped worldwide, has contracted with EBT [Electronic Book Technologies] to create online virtual "digital libraries" storing PC documentation. Virdox (the virtual documentation information system) supports advanced concepts in document versioning, including multilingual versions and multi-vendor versions. Other stories under "Other Electronic Delivery Tools" are: (1) "Frame Adds Olias [SGML browser], drops R&D"; (2) "Ntergaid's HyperWriter 4.2" [with SGML import facilities]; (3) "Open Text Latitude for delivery, retrieval" [Release 5 of PAT used for managing the display of a broad range of text formats, incorporating Panorama for SGML display].
Under "SGML-Based Document Management Tools" this Seybold Special Report includes description, evaluation, and references for the following products: Auto-Graphics [Smart Editor]; Berger-Levrault/AIS [SGML/Store]; CTMG (or Active Systems) [ActiveSearch]; Documentum [Enterprise Document Management System]; EBT - Electronic Book Technologies [DynaBase]; Ferntree [Structured Information Manager]; Frame [Frame SGML Toolkit]; InfoDesign [WorkSmart]; IDI - Information Dimensions [Basis SGMLServer]; Interleaf [Relational Document Manager]; Odesta [LiveLink]; Texcel [Information Manager]; XSoft [Astoria]; XyVision [Parlance Document Manager].
The section "SGML Authoring Tools" includes reviews of four products in particular: (1) ArborText plans Internet Addition; (2) Frame improves style, attributes handling; (3) Microstar improves on [SGML Author for] Word; (4) WordPerfect prepares SGML Edition.
Seybold Publications. Seybold San Francisco '94 [September 13-16, 1994]. Part I: Electronic Document Delivery and Output Issues. Seybold Special Report, Volume 3, Number 2. Media, PA: Seybold Publications, October 10, 1994. ISSN: 1069-7217.
The abbreviated Table of Contents: Introduction: Publishing on the Net Sparks Industry Resurgence (1-7); Electronic Document Delivery (8-29); Internet and Online Publishing (10-15); Tools for Internet Publishing (16-20); Fonts for Electronic Documents (20-21); Delivering Documents Through Digital Media (22-27); Digital Ad Delivery: Ready to Move Ahead (27-29); Output Issues (30-67).
The issue contains an in-depth discussion of the implication of HTML for the advance of SGML. There is a short presentation "HTML: Becoming an SGML Application" (14-15). SGML tools for the Web are treated in discussion of three products: "EBT's DynaWeb Server" (16-17); "HaL Browser Shows SGML, HTML" (17); "IDI Adds Web Service to BasisPlus" (17-18). HTML authoring tools are treated in: "Tools for Making HTML" [Nice TagWizard; SoftQuad HoTMetaL; Free Tools] (20).
The section "Software for Delivering Document Collections" (24-27) includes discussion of SGML's role in several products: Bellcore's SuperBook [SGML import]; Folio support for SGML [SGML to flat-file conversion]; IBM's upgraded BookManager [migration to SGML support from underlying GML]; Sun Microsystems [replacing PostScript-based Answerbook documentation reader with SGML-based documentation using Electronic Book Technologies' DynaText and developer toolkit].
Seybold Publications. Seybold San Francisco '94 [September 13-16, 1994]. Part III: Composition, Font Issues, Platforms, SGML and Other Topics. Seybold Special Report, Volume 3, Number 4. Media, PA: Seybold Publications, October 31, 1994. ISSN: 1069-7217.
The abbreviated Table of Contents: Introduction: Text Composition, Page Layout, Font Issues, and Newspaper Systems (3); Composition Systems and Software (4-12); Newspapers and Magazines (13-20); Xtensions and Additions (21-25); Fonts: New Technology Keeps the Fires Burning (26-30); SGML Coming Into the Mainstream (31-37); SGML Tools: Microsoft Into the Act (32-35); Other Authoring Tools (35-38); Other Document Conversion Services and Tools (37); The Great Platform Debate Continues (38-50).
The special coverage of SGML publishing tools (pages 31-37) includes a major discussion of Microsoft's SGML Author for Word, including companion products for Author. The companion products include Avalanche SureStyle [conversion of text with direct formatting into SGML constructs, additional processing of tables, cross-references, OLE embedding information] and SoftQuad Enactor [cleans up SGML errors in Author, including support for SGML constructs not implemented in Author]. Other SGML products reviewed include: SoftQuad Enabler [SGML support for QuarkXPress], SoftQuad Explorer [SGML browser], SoftQuad HoTMetaL [HTML editor], ArborText's PowerPaste [SGML import facility] and Adept Electronic Review [SGML-based document review tools]; Auto-Graphics' Smart Editor version 5 [SGML-based editorial system]; Frame SGML Toolkit; Nice Technologies' TagWizard [Microsoft Word SGML tagging tool] and AIMS [IETM preparation software]. Also noted: DCL (Data Conversion Laboratories) product SGMLView [conversion tool] and Exoterica OmniMark 4.2 [SGML conversion/translation facilities].
[CR: 19950925]
Seybold Publications. Seybold Special Report. Show Preview: Seybold San Francisco. Seybold Report Editors Name Their Hot Picks. Seybold Special Report Volume 4, Number 1. Media, PA: Seybold Publications, September 26, 1995. ISSN: 1069-7217.
This issue of the Seybold Special Report is dedicated to a preview of publishing software (including SGML products) to be exhibited at the Seybold San Francisco 1995 show. Some SGML highlights include: ArborText (Adept Editor); Auto-Graphics (Smart Editorial System 5.3, and Impact [SGML document searching]); Electronic Book Technologies (EBT) DynaWeb 1.0, figleaf, and WebTap (HotJava Applet); Exoterica OmniMark release 2.5; FrameMaker+SGML; Infrastructures for Information (I4I), SGML DLL toolkit; Novell WordPerfect 6.1 for Windows, SGML Edition; Passage Systems' PassageHub (SGML conversion tool based upon Exoterica Corporation's OmniMark) and PassagePro; XSoft Astoria; SoftQuad's SGML Enabler. See the main conference and exposition entry for further details, or the Seybold Publications main entry.
[CR: 19951209]
Seybold Publications. Seybold San Francisco '95. Part III: Color Workflow, Image Databases, Page Layout, SGML, Other Topics. Seybold Special Report, Volume 4, Number 4. Media, PA: Seybold Publications, November 10, 1995. ISSN: 1069-7217.
One section of this Special Report is dedicated to the SGML scene visible at SSF '95. The section is titled "Stability and Growth for SGML Market" (pages 31-34). Subsections cover several new releases and announcements:
- Arbortext expands Windows authoring functionality: Adept smart tag insertion, user control over screen formatting, generation of HTML from SGML, support for more graphics data types; DTDs for HTML 2.0, Docbook 2.2.1, ATI Article, and ATI Book
- Auto-Graphics: records are not just for reference books: use of the SGML Smart Editorial System 5.3 for maintenance manuals
- FrameMaker+SGML nears release: better table handling
- I for I shows SGML services toolkit: SAS (SGML application server)
- Microstar completes Word add-on: Near & Far Author (Word 6.0 SGML add-on)
- Miles eases SGML composition: Genera composition facilities
- Passage Systems shows how to search SGML files on the Web: presentation of the SGML Search Engine and PassageHub (SGML filters)
- Xyvision readies new PDM version: version 2.3 of Parlance Document Manager to debut at the November Documation conference
[CR: 19960409]
Seybold Publications. Seybold Seminars Boston '96 [February 27 - March 1, 1996. Boston, MA]. Part I. Seybold Seminars Boston '96: When Worlds Collide. Seybold Special Report, Volume 4, Number 8. Media, PA: Seybold Publications, March 25, 1996. ISSN: 1069-7217.
This issue of the Seybold Special Report contains part 1 of three parts covering Seybold Seminars Boston '96: "State of the Industry, Internet Publishing, and Color Output." Several articles and notices in the report provide updates on SGML products and publishing trends that are impacted by SGML. Samples: "Seybold Editors' Awards: Electronic Book Technologies, for DynaWeb" (p. 8); "EBT Shows Netscape Plug-ins," (p. 18); "Jouve Releases GTI Publisher" (p. 20).
[CR: 19970726]
[Seybold Publications Staff]. "Grif Commits to XML Editing Tool." Seybold Report on Internet Publishing 1/9 (May 1997) 37. ISSN: 1090-4808.
[Summary] "...Grif has stepped forward as the first vendor to commit to developing an XML authoring tool. It will be receiving help from Cadmus, one of the largest U.S. suppliers of services to journal publishers. . .The early adoption of XML is a further indication that Grif intends to leverage its SGML expertise in the wider market of Web authoring. Grif also announced plans to open an office in Boston, its first in the U.S. . .At the WWW '97 conference held last month in Santa Clara, CA, Grif announced its intention to adopt XML in its product line. It previewed XML support in both its SGML Editor and Symposia, its HTML authoring tool. Symposia, based on Grif's WYSIWYG SGML Editor, provides both WYSIWYG and tag-based editing, with tag validation. It runs on both Windows and Unix platforms."
[CR: 19970815]
[Seybold Publications Staff]. "Web Publishing Systems Struggle for Identity. Seybold Seminars '97. Getting Down to Brass Tacks. SGML and the Web." Seybold Report on Internet Publishing 1/9 (May 1997) 23-24 [1-24]. ISSN: 1090-4808.
A comprehensive report on New York Seybold Seminars '97 is provided by Peter Dyson, Matt McKenzie, Victor Votsch, and Mark Walter. The article concludes with a section on SGML and the Web, covering products from Agave, Inso, and SoftQuad: "Agave links SGML and SQL", "Inso serves books through DynaWeb", and "SoftQuad: success with Panorama". Agave's SQml extensions to SGML are implemented in an SQml-CGI server, and link documents and legacy databases. Inso still has a large market share in the realm of serving electronic books (via DynaWeb), providing control down to the SGML element level. "The DynaWeb server is now client-aware, meaning it will serve html differently, depending on styles you set for different types of browsers. For example, DynaWeb can generate CSS style sheets for HTML documents on the fly." SoftQuad Inc. has released the SoftQuad Panorama Publishing System, and is marketing it for SGML-based document delivery on intranets. "SoftQuad reports installations at Hitachi, the U.S. Department of Defense and one of 35,000 seats at BellSouth. SoftQuad also recently got its Panorama Publishing System included on the U.S. Government's GSA schedule, making it easier for government agencies to order the system."
[Summary:] "For those who have encoded their text in SGML the Web remains an economical output medium. New York was our first opportunity to see Agave, which marries SGML to relational databases. We also looked at new versions of two different, and established, display options: DynaWeb and Panorama."
[CR: 19960826]
SGML Open. "SGML in Education: The TEI and ICADD Initiatives." Computers in Libraries 16/3 (March 1996) 26-28.
"Abstract: SGML Open is a group promoting adoption of the Standard Generalized Markup Language for exchange of data and documents as the international standard. Two groups working in the academic field to adapt and use SGML are the Text Encoding Initiative and the International Committee for Accessible Document Design. TEI uses SGML to encode literary and historical texts and ICADD makes them accessible to blind researchers and other impaired students. Both initiatives are discussed."
[Another abstract: "SGML Open [http://www.sgmlopen.org/] is a consortium dedicated to promoting the use of SGML, an ISO standard for data encoding that enables value-added, reusable, platform-independent documents. This article highlights two international efforts which are using SGML. TEI (Text Encoding Initiative) provides guidelines for encoding literary and historical texts. The TEI guidelines are meant to be flexible and scalable, able to accommodate any body of text and delimit salient features with markup, adding intelligence and meaning. ICADD (International Committee for Accessible Document Design) focuses on making textbooks available in formats such as Braille, large print and voice synthesis. SGML encoding not only provides structured access to documents that could otherwise be unavailable, but also makes that access more democratic." [-- CJC Campbell Crabtree in Current Cites Volume 7, no. 4, April 1996, published by The Library, University of California, Berkeley; ISSN: 1060-2356.]]
[CR: 19960310]
The SGML University Board of Regents. SGML Power Tools. Net-Virtual Location in Cyberspace [probably Denver, Colorado or Rochester, New York]: SGML University Press, 1995. ISBN: 0-9649602-0-6.
Abstract: "A CD-ROM full of information, applications, software demonstrations, and other resources needed to get started using SGML. The top companies in the SGML industry provide information about their products. Some have included demonstration software or outright free software on the disc. Also, the world's first Multimedia SGML Tutorial (number one in a series of five) is on the disc."
"SGML University is making this disc free for legitimate users who need to know more about SGML. To get your free copy, send an e-mail message telling us about your interest in SGML. Be sure to include your address and phone number. Qualified respondents will receive the disc immediately."
See further information on the SGML University WWW page.
International SGML Users' Group. "A Brief History of the Development of SGML." 3 June 1989. 2 pages.
The publication is available from the SGMLUG office as a separate document, and is printed in the SGML Users' Group Newsletter 14 (October 1989) 6-7. Being free of copyright restrictions, it is also published elsewhere: (1) The SGML Handbook, cited here, Appendix A: pp. 567-570; (2) The SGML Source Guide, also cited; (3) Joan Smith's Book on SGML and Related Standards, Appendix 1.
"SGML Open Establishes SGML/Internet Link." <TAG> 7/11 (November 1994) 5. ISSN: 1067-9197.
[CR: 19950716]
SGML Project (Exeter). What is SGML and Why Should I Use It? Exeter, UK: SGML Project, 1993. Extent: approximately 4 pages.
This brief document provides an excellent overview of SGML using non-technical language. It is available online from the University of Exeter WWW server: see the link to Exeter, or fetch the document in mirror copy from the local server. It was probably written by Michael G. Popham and/or Paul A. Ellison, both of whom are to be praised for their fine work in administering the SGML Project as long as funding was available to them. [Possible] contact: Paul A. Ellison, email P.A.Ellison@ex.ac.uk; Deputy Director, University of Exeter IT Services; Laver Building; North Park Road; Exeter EX4 4QE, UK; Tel: (+44) 1 392 263951; Fax: (+44) 1 392 211630.
"SGML Tips & Techniques: Using Noun and Verb Tags to Effect Proper Hyphenation." <TAG> 8/3 (March 1995) 10. ISSN: 1067-9197.
[Bibliographer's note:] The article is significant from the academic point of view in that the "tip" and its motivation arose (apparently) within the business sector -- not, as one might have guessed, within the TEI. How do we achieve proper (automatic) hyphenation when the hyphenation rule depends upon the word's part-of-speech attribute? The author suggests that for homographs like English word "project" we might use the following kind of SGML tagging (e.g., when the word is a noun and not a verb): <word type='noun'>project</word>. According to the tip's proposal "The SGML application can then take the words inside the WORD tag and pass appropriate instructions to the composition engine for proper hyphenation." In the case of the verb, we want 'pro-ject' while the noun is to be hyphenated 'proj-ect'. Whatever the merits of the proposal as a practical solution, it highlights an observation that has been made frequently within the segment of the academic community that has seen the value of SGML: text processing will fail if it treats textual data as a String rather than as Text Objects with linguistic attributes. From an Object perspective, the markup is correct in two ways: it delimits and names the text object ("word"), and it describes a real feature of the object in context ("noun"): knowledge from another domain is brought to bear when the word-noun needs to be hyphenated. This high-level strategy, from an Object point of view, represents an improvement over the procedural encoding that one finds in some word processing systems: control characters or other special characters to encode discretionary hyphen. Of course, SGML markup is just one way to represent information about text objects like "words" such that proper processing is effected.
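[Illustrative sketch, not from the article:] The tip's idea can be made concrete with a few lines of Python. The sketch below assumes a hypothetical hyphenation table keyed by spelling and part of speech, and a simplified pattern for the <word type='...'> tagging quoted above; a real SGML application would use a parser and a composition engine rather than a regular expression.

```python
import re

# Hypothetical hyphenation table: (spelling, part of speech) -> break points.
HYPHENATION = {
    ("project", "noun"): "proj-ect",
    ("project", "verb"): "pro-ject",
}

# Simplified pattern for the tagging quoted in the note; not a full SGML parser.
WORD_TAG = re.compile(r"<word\s+type='(?P<pos>[^']+)'>(?P<text>[^<]+)</word>")


def hyphenate(match):
    """Pick the break points that fit the word's part-of-speech attribute."""
    key = (match.group("text").lower(), match.group("pos"))
    # Unknown words are passed through unchanged.
    return HYPHENATION.get(key, match.group("text"))


def apply_hyphenation(marked_up):
    """Replace each tagged word with its hyphenated form."""
    return WORD_TAG.sub(hyphenate, marked_up)


sample = ("We <word type='verb'>project</word> growth for the "
          "<word type='noun'>project</word>.")
print(apply_hyphenation(sample))
```

The same string "project" yields two different results only because the markup carries the linguistic attribute; a plain string offers nothing to key the lookup on.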
[CR: 19960730]
Shafer, Keith. Creating DTDs via Fred. Paper presented at Digital Libraries Workshop 1996, Organized by Nancy Ide and Judith Klavans, Held in conjunction with the First ACM International Conference on Digital Libraries, Bethesda, Maryland. Poughkeepsie, New York / New York, NY: Vassar College, Department of Computer Science / Columbia University, Department of Information Services, 1996. Author's affiliation: OCLC Online Computer Library Center, Inc., 6565 Frantz Road, Dublin, Ohio 43017-3395. Email: shafer@oclc.org.
Abstract: "In this paper, we motivate and describe tools we have built to automatically create reduced structural representations of tagged text. These tools are novel in that they let one use the basic tenets of SGML without creating DTDs by hand." [Abstract]
"While the TEI Guidelines and corresponding DTD work provide a good framework from which to tag text, it is possible that the application of these guidelines may result in the creation of documents with no corresponding DTD. When this happens, mechanisms need to be in place to help them generate the appropriate DTDs. This does not imply that the TEI work is incomplete or non-extensible, only that it is difficult to provide a single framework (or set of DTDs) that covers all electronic sources. Many people now know how to tag documents and they may even follow the TEI Guidelines, but some will make mistakes or need to extend the model." [extracted]
The document is available online; [mirror copy]. See the main workshop entry or the program listing for other workshop details.
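[Illustrative sketch, not from the paper:] To show what "creating DTDs without writing them by hand" can mean in the simplest case, the following Python sketch derives crude element declarations from a tagged sample. It is not Fred's algorithm: it assumes well-formed, XML-style input, uses the standard library's xml.etree parser, and emits only loose repeatable-choice content models. The function names and the sample are hypothetical.

```python
import xml.etree.ElementTree as ET


def infer_content_models(tagged_text):
    """Record, per element name, the child names seen and whether text occurs."""
    root = ET.fromstring(tagged_text)
    children = {}
    has_text = {}
    for elem in root.iter():
        kids = children.setdefault(elem.tag, set())
        text_seen = bool((elem.text or "").strip())
        for child in elem:
            kids.add(child.tag)
            # Text after a child belongs to the parent's content.
            text_seen = text_seen or bool((child.tail or "").strip())
        has_text[elem.tag] = has_text.get(elem.tag, False) or text_seen
    return children, has_text


def to_dtd(children, has_text):
    """Emit one loose <!ELEMENT> declaration per observed element."""
    lines = []
    for name in sorted(children):
        kids = sorted(children[name])
        if kids and has_text[name]:
            model = "(#PCDATA | " + " | ".join(kids) + ")*"
        elif kids:
            model = "(" + " | ".join(kids) + ")*"
        elif has_text[name]:
            model = "(#PCDATA)"
        else:
            model = "EMPTY"
        lines.append(f"<!ELEMENT {name} {model}>")
    return "\n".join(lines)


SAMPLE = "<doc><title>T</title><p>Some <emph>text</emph> here.</p><p>More.</p><br/></doc>"
children, has_text = infer_content_models(SAMPLE)
print(to_dtd(children, has_text))
```

A tool of the kind the abstract describes would go further, for example by generalizing observed child sequences into ordered content models; the sketch only records which children occur.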
Shafer, Keith E. "SGML Grammar Structure." Annual Review of OCLC Research, July 1992 - June 1993 ? (1993) 39-40. Senior Research Scientist, Online Computer Library Center, Inc. (OCLC).
[CR: 19951207]
Shafer, Keith. Creating DTDs via the GB-Engine [General Grammar Builder] and Fred. Paper presented at SGML '95. Dublin, Ohio: OCLC Online Computer Library Center, Inc., 1995. Extent: approximately 14 pages. Author's affiliation: OCLC.
"Abstract: In this paper, we motivate and describe tools we have built to automatically create reduced structural representations of tagged text. These tools are novel in that they let one use the basic tenets of SGML without creating DTDs by hand."
Available on the Internet: Creating DTDs (SGML '95) [mirror copy, December 1995]. See the OCLC Fred entry or the OCLC Fred Home Page for other details.
Shafer, Keith E. "Manipulating Tagged Text." In Part 1: OCLC Project Reports, Annual Review of OCLC Research, 1994. Dublin, OH: OCLC Online Computer Library Center, 1995. approximately 3 pages. Author Affiliation: Senior Research Scientist, OCLC.
"Abstract: While the Standard Generalized Markup Language (SGML) is intended to offer freedom from vendor-dependent data, it is difficult to translate arbitrary SGML into multiple output formats. To address this problem, we have incorporated translation capabilities into the SGML Grammar Builder project."
Available via the Internet on the OCLC WWW server. [mirror copy, text only]
Shafer, Keith E. "Translating Mathematical Markup for Electronic Journals." In Part 1: OCLC Project Reports, Annual Review of OCLC Research, 1994. Dublin, OH: OCLC Online Computer Library Center, 1995. approximately 4 pages. Author Affiliation: Senior Research Scientist, OCLC.
"Abstract: While there is now an international standard for mathematical markup, no systems produce formatted documents from the complete standard. This report describes how mathematical markup is translated at OCLC."
"OCLC's Electronic Journals Online (EJO) provides a typeset quality display of journal articles via the Guidon document viewer. Guidon formats files coded in the TeX typesetting language to produce online pages. EJO accepts, however, source documents marked up via the Standard Generalized Markup Language (SGML); thus, source documents must be translated to TeX to produce displayable files. To facilitate this translation to TeX, we added translation capabilities to the SGML Grammar Builder interpreter, Fred. (See "Manipulating Tagged Text" for an overview of Fred's translation processes.) The goal of this project was to use Fred to translate the set of tagged structures that comprise the international standard for SGML mathematical markup (found in ISO 12083) to TeX for use in EJO." [extracted]
Available via the Internet on the OCLC WWW server. [mirror copy, text only]
[CR: 19970312]
Shepherd, Michael A.; Watters, Carolyn R.; Burkowski, Forbes J. "Digital Libraries for Electronic News." Pages 55-62 in Digital Libraries: Research and Technology Advances. ADL '95 Forum. Selected Papers. Forum on Research and Technology Advances in Digital Libraries, ADL '95. McLean, Virginia, USA, May 15-17, 1995. Sponsored by NASA. Edited by Adam, Nabil R.; Bhargava, Bharat K.; Halem, Milton; Yesha, Yelena. Lecture Notes in Computer Science, volume 1082. Berlin/Heidelberg, Germany: Springer-Verlag, 1996. ISBN: 3-540-61410-9. ISSN: 0302-9743. Authors' affiliation: [Shepherd:] Department of Mathematics, Statistics, and Computer Science, Dalhousie University, Canada, Email shepherd@cs.dal.ca and Web: Dalhousie's multimedia news research - http://bcr2.uwaterloo.ca/dal/; [Watters:] Jodrey School of Computer Science, Acadia University, Canada, Email cwatters@dragon.acadiau.ca; [Burkowski:] Department of Computer Science, University of Waterloo, Canada, Email fjburkow@plg.uwaterloo.ca.
Discussion of electronic news evaluates the "semantic attributes of news items" specified in the Universal Text Format (UTF). This standard, which used SGML encoding, was established by industry bodies as NITF (the News Industry Text Format). The paper was presented in the conference as part of the session "Visualization in Digital Libraries." The document is available electronically on the Internet in PostScript format; [mirror copy].
[CR: 19980430]
Thompson, Henry S.; Anderson, A. H.; Bader, M. "Publishing a Spoken and Written Corpus on CD-ROM: The HCRC Map Task Experience." Pages 168-182 in Spoken English on Computer. Transcription, Mark-up, and Application. Edited by Geoffrey N. Leech, Greg Myers, and Jenny Thomas. New York, London, and Essex, England: Longman, 1995. ISBN: 0582250218.
For additional references and a more recent project description, see David McKelvie, Cris Drew, and Henry S. Thompson: "Using SGML as a Basis for Data-Intensive Natural Language Processing [NLP]." See also the database main entry: The HCRC Map Task Corpus.
[CR: 19960226]
Thompson, Henry S.; Finch, Steve; McKelvie, David. The Normalised SGML Library (NSL). LRE Project 62-050 MULTEXT. Workpackage 2. Milestone C, Deliverable NSL. Edinburgh, Scotland: Human Communication Research Centre, November 14, 1995. Extent: 38 pages, 2 references. Authors' affiliation: Human Communication Research Centre, Edinburgh.
Abstract: "This document describes the Normalised SGML Library (NSL), which consists of a set of C programs for manipulating SGML files and a C application program interface (API) designed to ease the writing of C programs which manipulate SGML documents."
From the author's notice: "In pursuit of a development environment for SGML-based corpus and document processing, with support for multiple versions and multiple levels of annotation, LTG have developed an integrated set of SGML tools and a developers tool-kit, including a C-based API. This software described here contains everything required to process a very wide range of conformant SGML documents. Its initial parsing module incorporates v1.0.1 of James Clark's SP software, arguably the broadest coverage SGML parser available anywhere, commercial or not.
"The basic architecture is one in which an arbitrary SGML document is processed on the way in, as it were, yielding two results: 1) An optimised representation of the information contained in the document's DOCTYPE; 2) A normalised version of the document instance, which can be piped through any tools built using our API for augmentation, extraction, etc. The use of the cached DOCTYPE together with the normalisation of the SGML to nSGML means that applications processing nSGML streams can be very efficient."
This work comes out of the MULTEXT project. See the links to NSL (documentation and distribution) from Henry Thompson's Home Page. Mirror copy (of November 14, 1995 version).
[CR: 19951207]
Shafer, Keith; Thompson, Roger. Introduction to Translating Tagged Text via the SGML Document Grammar Builder Engine. OCLC Technical Report. Dublin, Ohio: OCLC Online Computer Library Center, Inc., 1995. Extent: approximately 18 pages. Authors' affiliation: OCLC, 6565 Frantz Road, Dublin, Ohio 43017-3395.
"Abstract: While the Standard Generalized Markup Language (SGML) promises freedom from proprietary data formats, it is still difficult to translate arbitrary SGML data to other formats. To address the SGML translation needs at OCLC, we have added translation capabilities to the SGML Document Grammar Builder programming engine. Several systems incorporate this programming engine, including our Fred interpreter. In this paper, we describe the general translation capabilities of this programming engine and relate it to Fred."
Available on the Internet: Introduction to Fred Translation [mirror copy, December 1995]. See the OCLC Fred entry or the OCLC Fred Home Page for other details.
[CR: 19960110]
Shafer, Keith; Thompson, Roger. Translating Mathematical Markup for Electronic Documents. OCLC Technical Report. Presented at the WWW4 Conference, Boston, December 1995. Dublin, Ohio: OCLC Online Computer Library Center, Inc., 1995. Extent: approximately 18 pages. Authors' affiliation: OCLC, 6565 Frantz Road, Dublin, Ohio 43017-3395.
"Abstract: In this paper, we describe a general translation tool that can transform tagged text into arbitrary output formats. Specifically, we describe how OCLC makes scientific documents containing mathematical markup available on the World Wide Web. The translation capabilities we developed to do this help realize the potential of the Standard Generalized Markup Language (SGML) to provide users with a single, non-proprietary document representation that can be translated on demand to other output formats. This enables publishers who target the WWW as a delivery medium to use the latest advances in HTML without constant revision of their document archives."
Available on the Internet: "Translating Mathematical Markup into HTML" [mirror copy, January 1996], or from: http://www.w3.org/pub/Conferences/WWW4/Papers/177/. See the OCLC Fred entry or the OCLC Fred Home Page for other details.
[CR: 19951113]
Shaw, Alan C. "Structure Editor Generators for Documents, Programs, and Other Structured Data." Pages 30-50 in Protext III. Proceedings of the Third International Conference on Text Processing Systems. International Conference on Text Processing Systems. Trinity College, Dublin. 22-24 October, 1986. Edited by J. J. H. Miller. Dublin, Ireland: Dún Laoghaire, Co., Boole Press Ltd., January 1987. ISBN: 0-906783-55-0 (hardback); 0-906783-56-9 (paperback).
Shaw, Alan C.; Furuta, Richard K.; Scofield, J. "Document Formatting Systems: Survey, Concepts and Issues." Pages 47-52 (with 20 references) in International Conference on Research and Trends in Document Preparation Systems. Abstracts of the Presented Papers. Conference on Research and Trends in Document Preparation Systems, Lausanne, Switzerland, February 27-28, 1981. Supported by the [Swiss] Conseil des Ecoles Polytechniques Fédérales, Organized by the Swiss Federal Institutes of Technology. J. D. Nicoud, Program Chair. Lausanne/Zürich: Swiss Federal Institutes of Technology, 1981. v + 130 pages. Authors' affiliation: FR-35 University of Washington, Department of Computer Science, Seattle, WA USA 98195.
[CR: 1995]
Shaw, Elizabeth. OCR and SGML Mark-up of Documents from the Making of America Project. Report on a Directed Field Experience at Humanities Text Initiative. Humanities Text Initiative (HTI) Technical Report. Ann Arbor, MI: University of Michigan HTI, December 1996. Extent: approximately 12 pages. Author's affiliation: University of Michigan.
Overview: "The purpose of this project has been to explore the feasibility and costs of doing an automated OCR (optical character recognition) conversion of scanned TIFF images for the Making of America Project and automating initial SGML mark-up of the documents. . .Using the automation that we have developed, we can process a CD-ROM with approximately 4,000 pages into roughly marked up documents with an average of less than 2 hours of human intervention per CD-ROM. Moving that unproofed rough markup to a finished valid SGML document takes an additional 2-3 minutes per page for mark-up and 8-9 minutes per page for proofing and entering corrections. Documents with significant differences (two column formats or a significant number of images) from the norm would take additional processing time. However initial analysis of the documents indicates that these anomalies are in the minority - ranging from 0 to 3 documents per CD-ROM. In addition, most of the two column documents are less than 30 pages in length."
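[Back-of-envelope calculation, not from the report:] The per-page figures quoted above imply a substantial finishing effort per disc. The short Python sketch below multiplies them out, assuming exactly 4,000 pages per CD-ROM (the report says "approximately") and treating the quoted ranges as hard bounds.

```python
# Figures taken from the quoted overview; the page count is rounded.
PAGES_PER_CD = 4000
MARKUP_MIN_PER_PAGE = (2, 3)   # minutes per page, low and high
PROOF_MIN_PER_PAGE = (8, 9)    # minutes per page, low and high


def finishing_hours(pages, markup, proof):
    """Return (low, high) hours needed to finish and proof a disc's pages."""
    low = pages * (markup[0] + proof[0]) / 60
    high = pages * (markup[1] + proof[1]) / 60
    return low, high


low, high = finishing_hours(PAGES_PER_CD, MARKUP_MIN_PER_PAGE, PROOF_MIN_PER_PAGE)
print(f"{low:.0f} to {high:.0f} hours per CD-ROM")
```

On these assumptions the manual finishing step comes to roughly 667 to 800 hours per CD-ROM, against less than 2 hours of human intervention for the automated rough markup.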
For more information see the Making of America Project.
Available online: http://dns.hti.umich.edu/htistaff/pubs/1997/ejshaw.01/: "OCR and SGML Mark-up of Documents from the Making of America Project. Report on a Directed Field Experience at Humanities Text Initiative." By Elizabeth Shaw, December, 1996; [mirror copy]
[CR: 19971024]
Shaw, Elizabeth J.; Blumson, Sarr. "Making of America. Online Searching and Page Presentation at the University of Michigan." D-Lib Magazine (July/August 1997). ISSN: 1082-9873. Authors' affiliation: Digital Library Project, University of Michigan.
Summary: "In this paper, we will describe the unique aspects of the first phase of the University of Michigan's implementation of the Making of America Project (http://www.umdl.umich.edu/moa/), a collaborative effort with Cornell University. Using "raw" uncorrected results of automated optical character recognition (OCR) of the page images, and SGML-encoding of the ensuing textual information in minimal Text Encoding Initiative (TEI) conformant markup, we can provide a searchable database of the roughly 650,000 page images that comprise our portion of the Making of America Project. We provide access to the page images on the Web without special viewing tools through a page delivery system that converts the requested pages from TIFF to GIF format on the fly. We will also describe how our approach will allow us to extend functionality as time and resources become available."
The article is available online in HTML format; local archive copy. Note that the July/August 1997 double issue of D-Lib Magazine (Amy Friedlander, editor) contains several articles referencing the use of SGML encoding in digital library research.
[CR: 19990414]
Dongwook Shin; Hyuncheol Jang; Honglan Jin. "BUS: An Effective Indexing and Retrieval Scheme in Structured Documents." Pages 235-243 (with 16 references) in Digital Libraries '98. Proceedings of the Third ACM Conference on Digital Libraries. Third ACM Conference on Digital Libraries. Pittsburgh, PA. June 23-26, 1998. Sponsored by ACM Siglink and SIGIR. Edited by Ian H. Witten, Rob Akscyn, and Frank M. Shipman, III. New York, N.Y.: Association for Computing Machinery, 1998. ISBN: 0-89791-965-3. Authors' affiliation: Department of Computer Science, Chungnam National University, Taejon, South Korea. Email: shin@comeng.chungnam.ac.kr. Also [1999]: Visiting Scholar, Lister Hill National Center for Biomedical Communications.
Abstract: "In recent digital library systems or the World Wide Web environment, many documents are beginning to be provided in the structured format, tagged in mark up languages like SGML or XML. Hence, indexing and query evaluation of structured documents have been drawing attention since they enable to access and retrieve a certain part of documents easily. However, conventional information retrieval techniques do not scale up well in structured documents. This paper suggests an efficient indexing and query evaluation scheme for structured documents (named BUS) that minimizes the indexing overhead and guarantees fast query processing at any level in the document structure. The basic idea is that indexing is performed at the lowest level of the given structure and query evaluation computes the similarity at a higher level by accumulating the term frequencies at the lowest level in the bottom up way. The accumulators summing up the similarity play the role of accumulating all the term frequencies of the related part at a certain level. This paper also addresses the implementation of BUS and proves that BUS works correctly. In addition, along with several experiments, it shows that BUS facilitates efficient indexing in terms of space and time and guarantees the reasonable retrieval time in response to user queries."
"This paper proposes an indexing and query evaluation scheme (named BUS - Bottom Up Scheme) for structured documents that minimizes the indexing overhead and guarantees fast query response time. The basic idea is that indexing is performed at the leaf elements of the given structure and query evaluation computes the similarity at higher level by accumulating the weights at the lowest level in the bottom up way. It underlies the result of R. Wilkinson that 'the retrieval of whole documents can be carried out effectively using just their parts' in part and the idea of UID (Unique element IDentifier) that enables to compute ancestor element of a given element fast."
The Proceedings volume Table of Contents is available online. See also the main conference entry for Digital Libraries '98: Third ACM Conference on Digital Libraries. An online version is available in PostScript format; [local archive copy]. See: "BUS: An Effective Indexing and Retrieval Scheme in Structured Documents." Other references: BUS (Bottom Up Scheme) of indexing and retrieval for SGML/XML documents; "Looking for a partner in commercializing an efficient IR engine for SGML/XML data."
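[Illustrative sketch, not the authors' implementation:] The bottom-up idea in the abstract can be shown in a few lines of Python. Two assumptions are made here that the entry does not confirm: element UIDs number the nodes of a complete k-ary tree level by level from 1, so that a parent is found by arithmetic alone, and a raw sum of term frequencies stands in for the paper's similarity weights. The index contents are invented for the example.

```python
from collections import defaultdict

K = 3  # assumed maximum fan-out of the numbering tree


def parent(uid, k=K):
    """Parent UID under the assumed level-order k-ary numbering (root = 1)."""
    return (uid - 2) // k + 1


def depth(uid, k=K):
    """Number of parent steps from uid up to the root."""
    steps = 0
    while uid > 1:
        uid = parent(uid, k)
        steps += 1
    return steps


def ancestor_at_depth(uid, target, k=K):
    """Walk up from uid to its ancestor at the given depth."""
    for _ in range(depth(uid, k) - target):
        uid = parent(uid, k)
    return uid


def evaluate(index, terms, target_depth, k=K):
    """Accumulate leaf-level term frequencies at the requested level."""
    accumulators = defaultdict(int)
    for term in terms:
        for leaf_uid, tf in index.get(term, {}).items():
            accumulators[ancestor_at_depth(leaf_uid, target_depth, k)] += tf
    return sorted(accumulators.items(), key=lambda item: -item[1])


# Invented index: term -> {leaf element UID: term frequency}.
# UIDs 5 and 6 are paragraphs under section 2; UID 8 is under section 3.
INDEX = {"sgml": {5: 2, 6: 1, 8: 1}, "index": {6: 3}}

print(evaluate(INDEX, ["sgml", "index"], target_depth=1))
print(evaluate(INDEX, ["sgml", "index"], target_depth=0))
```

Because only leaf elements are indexed, the same postings answer a query at section level or at document level; the level is chosen at query time by how far the accumulators are pushed up the tree.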
[CR: 19951221]
Shreve, Gregory M. "SGML Representation of Concept Systems -- Identifying, Tagging and Retrieving Term -- Concept Structures in Textual Context." Pages 157-168 (with 5 references) in Standardizing and Harmonizing Terminology: Theory and Practice, edited by S. E. Wright and Richard A. Strehlow. Philadelphia: American Society for Testing Materials [ASTM, Committee on Terminology], 1995. ISBN: 0803119844. 0066-0558 [ASTM Special Technical Publication, 0066-0558, volume 1223]. Author's affiliation: Kent State Univ, Kent, OH, USA.
"Abstract: The technical terminology used by the technical communicator or technical translator is encountered in texts. The terms in the texts are not randomly arranged but are used deliberately to invoke specific concept structures. SGML encoders and parsers can be used to identify and retrieve terminological structures in texts and help translators and terminologists better understand the relationship of terms in their textual context to abstract concept systems and knowledge organization." [abstract from author]
[CR: 19971106]
Siegel, David. "[Work in Progress: People & Projects.] The Web Is Ruined and I Ruined It." Pages 13-21 in XML: Principles, Tools, and Techniques. Guest Edited by Dan Connolly. World Wide Web Journal [edited by Rohit Khare] Volume 2, Issue 4. Sebastopol, CA: O'Reilly & Associates, Fall 1997. Extent: xxii + 248 pages. ISBN: 1-56592-349-9. ISSN: 1085-2301. Author's affiliation: Verso.
Summary: "In 'The Web Is Ruined and I Ruined It' self-proclaimed HTML Terrorist David Siegel discusses how proper separation of structure (HTML), style (CSS), and semantics (XML) makes content more compelling and design more effective."
A version of this document is available online in HTML format: http://webreview.com/97/04/11/feature/index.html; or http://xent.ics.uci.edu/FoRK-archive/spring97/0381.html [local archive copy, text only].
[CR: 19961226]
Simon, Sheila D. "How To Make Data Sharing Work." Pages 77-80 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Encyclopædia Britannica, 310 South Michigan Avenue, Chicago, Illinois 60604, U.S.A.; Tel: +1 (312) 347-7064; FAX: +1 (312) 294-2187; Email: ssimon@eb.com.
Abstract: "Making data sharing work in a publishing system is not as easy as it sounds. There is much to take into consideration. I plan on discussing key points and factors that will enable you to have a better understanding of the concept of sharing data. I will also discuss what things need to be considered in deciding whether or not to share data. Also, key components will be defined as what is needed to make sharing data successful. Real life experience implementing SGML database systems that have the capability of sharing data is the basis of the following discussion."
Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
Simons, Gary F. "The Computational Complexity of Writing Systems." Pages 538-553 in The Fifteenth Lacus Forum 1988. Edited by Ruth M. Brend and David G. Lockwood. Lake Bluff, IL: Linguistic Association of Canada and the United States, 1989.
Abstract: In this article the author argues that computer systems, like their users, need to be multilingual. "We need computers, operating systems, and programs that can potentially work in any language and can simultaneously work with many languages at the same time." The article proposes a conceptual framework for achieving this goal.
Section 1, "Establishing the baseline," focuses on the problem of graphic rendering and illustrates the range of phenomena which an adequate solution to computational rendering of writing systems must account for. These include phenomena like nonsequential rendering, movable diacritics, positional variants, ligatures, conjuncts, and kerning.
Section 2, "A general solution to the complexities of character rendering," proposes a general solution to the rendering of scripts that can be printed typographically. (The computational rendering of calligraphic scripts adds further complexities which are not addressed.) The author first argues that the proper modeling of writing systems requires a two-level system in which a functional level is distinguished from a formal level. The functional level is the domain of characters (which represent the underlying information units of the writing system). The formal level is the domain of graphs (which represent the distinct graphic signs which appear on the surface). The claim is then made that all the phenomena described in section 1 can be handled by mapping from characters to graphs via finite-state transducers - simple machines guaranteed to produce results in linear time. A brief example using the Greek writing system is given.
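[Illustrative sketch, not the author's example:] The character-to-graph mapping described for Section 2 can be illustrated with the Greek sigma, which is one character at the functional level but is rendered with a different graph at the end of a word. The Python sketch below is a two-state transducer written for this bibliography; the state names and the use of str.isalpha() as the test for "inside a word" are assumptions of the sketch. It makes a single left-to-right pass, so it runs in linear time.

```python
SIGMA = "\u03c3"   # character: lowercase sigma (functional level)
MEDIAL = "\u03c3"  # graph used inside a word (formal level)
FINAL = "\u03c2"   # graph used at the end of a word (formal level)


def render(characters):
    """Map a character string to a graph string in one pass."""
    graphs = []
    pending = False  # state: True while a sigma waits for its successor
    for ch in characters:
        if pending:
            # The successor decides which graph the waiting sigma receives.
            graphs.append(MEDIAL if ch.isalpha() else FINAL)
            pending = False
        if ch == SIGMA:
            pending = True  # hold output until the next symbol is seen
        else:
            graphs.append(ch)
    if pending:
        graphs.append(FINAL)  # a sigma at end of input is word-final
    return "".join(graphs)


# The input uses the same sigma character in both positions.
print(render("\u03c3\u03bf\u03c6\u03bf\u03c3"))
```

The stored text never needs to record which sigma form to use; the distinction is produced at rendering time, which is the separation of functional and formal levels the summary describes.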
Section 3, "Toward a conceptual model for multilingual computing," goes beyond graphic rendering to consider the requirements of a system that would adequately deal with other language-specific issues like keyboarding, sorting, transliteration, hyphenation, and the like. The author observes that every piece of textual data stored in a computer is expressed in a particular language, and it is the identity of that language which determines how the data should be rendered, keyboarded, sorted, and so on. He thus argues that a rendering-centered approach which simply develops a universal character set for all languages will not solve the problem of multilingual computing. Using examples from the world's languages, he goes on to define language, script, and writing system as distinct concepts and argues that a complete system for multilingual computing must model all three.
Availability: Offprints of this article are available from the author at the following Internet address: gary.simons@sil.org. The volume itself is available from LACUS, P.O. Box 101, Lake Bluff, IL 60044.
See a related version of the document on the SIL WWW server. For other information on CELLAR, see the main CELLAR page at SIL and (more recently) "Computing Environment for Linguistic, Literary, and Anthropological Research (CELLAR)."
[CR: 19950716]
Simons, Gary F. A Computing Environment for Linguistic, Literary, and Anthropological Research [CELLAR]: Technical Overview. CELLAR Project, Internal Report. Dallas, TX: SIL Academic Computing, July, 1988. Extent: approximately 21 pages. Author's affiliation: Summer Institute of Linguistics, Department of Academic Computing.
"In this document, I propose the conceptual architecture of a computing environment designed to meet the particular needs of linguists, literary scholars, and anthropologists. In short: we need to process textual information which is (1) multilingual, (2) structured, (3) multidimensional, and (4) integrated, with a database manager that is: (1) seamless, (2) self-validating, and (3) knowledge-based, in a user environment which is: (1) extensible and (2) iconic." [from the Introduction]
Available on the SIL WWW server. Other information about CELLAR is accessible from the main CELLAR page.
[CR: 19960715]
Simons, Gary F. A Conceptual Modeling Language for the Analysis and Interpretation of Text. TEI [Text Encoding Initiative] Working Paper AIW1q2, Committee on Text Analysis and Interpretation. Dallas, TX: Academic Computing Department, Summer Institute of Linguistics, March 10 1990. Extent: approximately 29 pages. Author's affiliation: Summer Institute of Linguistics; Email: Gary.Simons@sil.org.
Abstract: "This document proposes a conceptual modeling language which could provide a framework for designing encoding schemes for the linguistic analysis and interpretation of text. Note the focus on 'designing encoding schemes.' The December 1989 meeting of the TEI-ANA committee concluded that the requirements for encoding linguistic analysis of text are considerably more complex than the requirements for encoding the text itself. While the metalanguage built into SGML (namely, the language for document type definitions) is adequate for expressing the design of the encoding for the text itself, it is not adequate for expressing the design of encoding for linguistic analysis. The committee thus concluded that we needed to begin by designing a metalanguage that would allow us to express the design of encoding schemes for text analysis. This document seeks to explain why this is needed and then gives an initial proposal for such a metalanguage with examples from two domains."
Note: some of the design principles articulated in this 1990 working paper find expression in the TEI's feature-structure markup, with its mechanism for feature-structure declaration. On TEI feature structures, see Simons, "Implementing the TEI's Feature-Structure Markup by Direct Mapping to the Objects and Attributes of an Object-Oriented Database System", below.
[CR: 19970421]
Simons, Gary F. "Conceptual Modeling Versus Visual Modeling: A Technological Key to Building Consensus." Computers and the Humanities (CHUM) 30/4 (1996/1997) 303-319 (with 16 references). ISSN: 0010-4817. Author's affiliation: Director of Academic Computing, Summer Institute of Linguistics; Email: Gary.Simons@sil.org.
Abstract: "Debate has long been a hallmark of the academic endeavor. The recent introduction of computers into academic life has not been the deus ex machina to bring sudden resolution to these debates. There is a new computing technology, however, that has some promise in this regard. It is called conceptual modeling. This paper (see endnotes) demonstrates how a computer-based model of a problem domain can lead to consensus when competing approaches to the domain can be encapsulated in different visual models that are applied to the same underlying conceptual model."
This published article is based upon the author's presentation at the 1994 Paris ACH/ALLC Meeting, "Consensus ex Machina" (Joint International Conference of the Association for Literary and Linguistic Computing and the Association for Computing and the Humanities Paris, 19 - 23 April 1994).
A version of the paper is also available online: connect via HTML client to the SIL WWW server (http://www.sil.org/cellar/ach94/ach94.html). See the associated bibliographic entry for discussion of the article's particular relevance to SGML.
[CR: 19950716]
Simons, Gary F. "Conceptual Modeling Versus Visual Modeling: A Technological Key to Building Consensus." Pages 217-218 [partial abstract] in Colloque International "Consensus ex Machina?" Abstracts. International Joint Conference of the ALLC (Association for Linguistic and Literary Computing) and ACH (Association for Computers and the Humanities), Sorbonne, Paris, 19-23 avril 1994. Paris: Laboratoire "Lexicométrie et textes politiques" (INaLF, CNRS), and Ecole Normale Supérieure de Fontenay - Saint Cloud, 1994. Extent: 244 pages. Author Affiliation: Summer Institute of Linguistics, Department of Academic Computing.
The paper does not treat SGML as a central issue, but it demonstrates how an SGML view (linear representation) of linguistic information can be generated from an object-oriented knowledgebase that models the semantics of the data in its own terms, and how the information can be rendered with SGML tags and attributes according to a DTD.
The full text of the presentation is to appear in a volume of the series Research in Humanities Computing (Oxford University Press). It is also currently available online: connect via HTML client to the SIL WWW server (http://www.sil.org/cellar/ach94/ach94.html). [Note also the report on the ALLC/ACH '94 Conference.]
[CR: 19960714]
Simons, Gary F. "Implementing the TEI's Feature-Structure Markup by Direct Mapping to the Objects and Attributes of an Object-Oriented Database System." Pages 111-114 [extended abstract] in ACH/ALLC '95: The 1995 Joint International Conference. Conference Abstracts, Posters and Demonstrations. ACH/ALLC '95 Joint International Conference, July 11-15, 1995. Santa Barbara, California: University of California/ACH/ALLC, 1995.
The paper describes "how a generalized implementation of TEI feature-structure markup has been achieved by extending an object-oriented database system [CELLAR] to use TEI-style <fs> [feature-structure] tagging as a possible format for the representation of its objects." Initial points in summary: (1) "Feature structures can encode information of nearly any sort. This is because they are just instances of the more general data structure referred to by Donald Knuth as 'nodes' (here called feature structures) and 'fields' (here called features) [D. E. Knuth, The Art of Computer Programming 1:462, 1968]. . ." (2) "Feature structures with features are thus analogous to records with fields, objects with attributes, frames with slots, property lists with properties, and abstract data types with access functions. . ."
From the document Conclusion: "This paper has demonstrated that:" (1) "The TEI's feature-structure markup can be implemented by direct mapping onto the objects and attributes of an object-oriented database system." (2) "The CELLAR system, with its user-definable views for export formatting and parsers for import processing, has proven able to do this task." (3) "The FSD [Feature Structure Declaration of TEI] formalism has the potential for serving as a lingua franca among database systems for the interchange of basic data models."
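The analogy between feature structures and records can be illustrated with a minimal Python sketch. It is not CELLAR's implementation: the sample markup is a simplified, well-formed XML rendering of TEI-style <fs>, <f>, <sym>, and <str> elements, and the function name fs_to_record and the "_type" key are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified TEI-style feature-structure markup (illustrative sample data).
SAMPLE = """
<fs type="word">
  <f name="form"><str>cats</str></f>
  <f name="pos"><sym value="noun"/></f>
  <f name="agr">
    <fs type="agreement">
      <f name="number"><sym value="plural"/></f>
    </fs>
  </f>
</fs>
"""

def fs_to_record(fs):
    """Map an <fs> element onto a dict with one key per <f> feature."""
    record = {"_type": fs.get("type")}
    for f in fs.findall("f"):
        value = f[0]  # assumes every <f> has exactly one value element
        if value.tag == "fs":
            # A nested feature structure becomes a nested record.
            record[f.get("name")] = fs_to_record(value)
        elif value.tag == "sym":
            record[f.get("name")] = value.get("value")
        else:
            # <str> and any other value element: use its text content.
            record[f.get("name")] = value.text
    return record

word = fs_to_record(ET.fromstring(SAMPLE))
print(word)
```

Each feature becomes a field of the record, and a feature whose value is itself a feature structure becomes a nested record, which is the correspondence the paper draws on.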
See the article by D. Terence Langendoen and Gary Simons, "Rationale for the TEI Recommendations for Feature-Structure Markup," pages 191-209 in The Text Encoding Initiative: Background and Contents, edited by Nancy Ide and Jean Véronis [= Computers and the Humanities 29/3, 1995]. The feature structure notation is defined for the TEI in chapter 16 of the Guidelines for Electronic Text Encoding and Interchange; link to chapter 16 online via Electronic Book Technologies or link via UVA.
[CR: 19971125]
Simons, Gary F. Importing SGML data into CELLAR by means of architectural forms. SIL Academic Computing Working Paper. Dallas, TX: Summer Institute of Linguistics, November 12, 1997. Extent: approximately 7 pages [main document] with several subsidiary documents. Author's affiliation: SIL Academic Computing; Email: Gary.Simons@sil.org.
Abstract: "This working paper documents a process for importing SGML data into the CELLAR database. The process, which requires no change to the SGML data and no special-purpose programming on the CELLAR side, is based on a relatively new SGML feature named architectural forms. The user writes a meta-DTD that maps the elements in the SGML data onto architectural forms that express the corresponding objects and attributes in CELLAR. Then an SGML parser uses this to create an 'architectural document' that an existing CELLAR parser reads to build the corresponding structure of objects in the CELLAR database."
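The two-step process in the abstract can be sketched roughly in Python. This is only an illustration under stated assumptions: a plain dictionary stands in for the meta-DTD, the element names, the class name LexEntry, and the form names "object" and "attribute" are hypothetical, and no real SGML parser or CELLAR component is involved.

```python
import xml.etree.ElementTree as ET

# Source data, left unchanged by the process (illustrative sample).
SOURCE = "<entry><hw>cat</hw><def>a small feline</def></entry>"

# Hypothetical stand-in for the meta-DTD:
# source element name -> (architectural form, name on the object side)
MAPPING = {
    "entry": ("object", "LexEntry"),
    "hw": ("attribute", "headword"),
    "def": ("attribute", "definition"),
}

def to_architectural(elem):
    """Step 1: derive an 'architectural document' from the source tree."""
    form, name = MAPPING[elem.tag]
    out = ET.Element(form, {"name": name})
    out.text = elem.text
    for child in elem:
        out.append(to_architectural(child))
    return out

def build(arch):
    """Step 2: read the architectural document and build one dict per object."""
    obj = {"class": arch.get("name")}
    for child in arch:
        if child.tag == "attribute":
            obj[child.get("name")] = child.text
    return obj

arch_doc = to_architectural(ET.fromstring(SOURCE))
print(ET.tostring(arch_doc, encoding="unicode"))
print(build(arch_doc))
```

The point of the design is that only the mapping changes from one source DTD to the next; the reader in step 2 sees the same two forms regardless of the original element names.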
"This electronic working paper gives the full details of work that has been presented in two conference papers. Provisional references: (1) Proceedings of SGML/XML '97, Washington, D. C., 8-11 December 1997, and (2) "Using Architectural Forms to Map TEI Data Into an Object-oriented System," in TEI Tenth Anniversary Users' Conference: Conference Abstracts, Providence, R.I., 14-16 November 1997. The abstract for this TEI10 document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/Simonspaper.html; [local archive copy]; see also the full bibliography entry.
The working paper is available online in HTML format. Further information about the CELLAR Project may be found on the SIL server. For other information on SGML architectures, see the database entry Architectural Forms and SGML Architectures.
[CR: 19971018]
Simons, Gary F. "Mapping from objects to markup: a springboard for multiple-strategy electronic publishing." Pages 151 - 153 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Author's affiliation: Summer Institute of Linguistics, Email: gary.simons@sil.org .
[Extract:] "This paper reports on the experience of the Summer Institute of Linguistics in developing electronic publishing solutions for its LinguaLinks product (SIL 1996). LinguaLinks is an electronic performance support system designed to assist field workers with a wide range of tasks related to language learning, language analysis, and language development. The paper first introduces the LinguaLinks model of performance support and CELLAR -- the object-oriented database system that is used to implement it. Our approach to electronic publishing is to first build the information as a structure of objects in the database, and then to use multiple CELLAR stylesheets to map the information onto multiple markup schemes. The object database thus serves as a springboard that allows us to vault the information into any number of formats for publishing. The paper illustrates this approach to electronic publishing by focusing on one application area that LinguaLinks supports, namely, lexical database development. It first shows how the tutorial and reference documents that give help on how to build a dictionary are mapped onto different markup schemes for publication as a Folio Views infobase, a Windows help system, and an HTML Web document. It then shows how the dictionaries that are built by using LinguaLinks are mapped onto HTML markup to provide a display format on the Web and onto TEI markup to provide a richer format for information interchange and archiving."
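The "springboard" idea in this extract, one object structure mapped onto several markup schemes, might be sketched as follows. The Entry class and both rendering functions are hypothetical and are not CELLAR stylesheets; the TEI-style output borrows dictionary element names (entry, form, orth, gramGrp, pos, sense, def) but is not validated against any DTD.

```python
from dataclasses import dataclass
from html import escape

@dataclass
class Entry:
    """A dictionary entry held as an object, independent of any markup."""
    headword: str
    pos: str
    gloss: str

def as_html(e):
    """One 'stylesheet': a display format for the Web."""
    return (f"<p><b>{escape(e.headword)}</b> "
            f"<i>{escape(e.pos)}</i> {escape(e.gloss)}</p>")

def as_tei(e):
    """Another 'stylesheet': richer TEI-style markup for interchange."""
    return (f"<entry><form><orth>{escape(e.headword)}</orth></form>"
            f"<gramGrp><pos>{escape(e.pos)}</pos></gramGrp>"
            f"<sense><def>{escape(e.gloss)}</def></sense></entry>")

entry = Entry("cat", "n", "a small feline")
print(as_html(entry))
print(as_tei(entry))
```

Adding a further publication target means adding another rendering function; the stored object does not change.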
Abstract available online in HTML format: "Mapping from objects to markup: a springboard for multiple-strategy electronic publishing", by Gary F. Simons; [archive copy]. Further information on CELLAR is available via the SIL Web server. Note that the author will present a paper at the SGML/XML '97 Conference on the use of architectural forms to achieve mapping of SGML data into databases: "Using architectural forms to map SGML data into an object-oriented database."
Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.
[CR: 19980606]
Simons, Gary F. "The Nature of Linguistic Data and the Requirements of a Computing Environment for Linguistic Research." Pages 10-25 (Chapter 1) in Using Computers in Linguistics. A Practical Guide. Colloquium: Computing and the Ordinary Working Linguist [Linguistic Society of America]. Philadelphia, 1992. Edited by John Lawler (Program in Linguistics, University of Michigan) and Helen Aristar Dry (Linguistics Program, Eastern Michigan University). London/New York: Routledge, [March] 1998. ISBN: 0-415-16792-2 (hardback) and 0-415-16793-0 (paper). Author's affiliation: Gary F. Simons is the Director of Academic Computing in the Summer Institute of Linguistics.
Summary: Simons "discusses language data and the special demands which it makes on computational resources. As Simons puts it: 1) The data are multilingual, so the computing environment must be able to keep track of what language each datum is in, and then display and process it accordingly; 2) The data in text unfold sequentially, so the computing environment must be able to represent the text in proper sequence; 3) The data are hierarchically structured, so the computing environment must be able to build hierarchical structures of arbitrary depth; 4) The data are multidimensional, so the computing environment must be able to attach many kinds of analysis and interpretation to a single datum; 5) The data are highly integrated, so the computing environment must be able to store and follow associative links between related pieces of data; 6) While doing all of the above to model the information structure of the data correctly, the computing environment must be able to present conventionally formatted displays of the data. This chapter prefigures most of the major themes that surface in the other chapters [of the book], and contains some discussion of the CELLAR prototype computing environment now under development by SIL. It should be read first, and in our opinion it should be required reading for anyone planning a research career in linguistics." [from the volume editors]
Simons discusses SGML in section 1.3, "The Hierarchical Nature of Linguistic Data." See the online Appendix for this chapter, with many links to Internet resources. A related version of the full paper is also online: see the following bibliographic entry.
An introduction and overview of the book may be found on the Routledge web site and [provisionally] at the University of Michigan. An online Table of Contents is provided, as well as an online appendix for each chapter in the book.
[CR: 19990114]
Simons, Gary F. The Nature of Linguistic Data and the Requirements of a Computing Environment for Linguistic Research. Paper accepted for publication in: Computers and the Ordinary Working Linguist, edited by John Lawler and Helen Dry. Draft of 28 July 1993. Dallas, TX: SIL, Academic Computing Department, July, 1993.
"This paper was originally drafted in 1993 as a chapter for a book proposed by Lawler and Dry, Computers and the Ordinary Working Linguist. The version presented here is a revision that was published in 1996 as an article in the journal Dutch Studies on Near Eastern Languages and Literature, volume 2, number 1, pages 111-128. (Note, however, that the bibliography has been annotated to add Web links and updated to report the eventual details of works originally cited as 'forthcoming'.) The book finally came out in 1998 with a new title and the paper was further revised and expanded by about 20%. The citation for the full published version is: Simons, Gary F. 1998. The nature of linguistic data and the requirements of a computing environment for linguistic research. In Using Computers in Linguistics: a practical guide, John M. Lawler and Helen Aristar Dry (eds.). London and New York: Routledge. Pages 10-25. Routledge maintains a Web site for the book which includes an on-line appendix that gives links to many information resources that are relevant to topics covered in this paper."
Section 3 of the document ("The hierarchical nature of linguistic data") discusses the important role played by SGML in focusing attention upon the hierarchical nature of many literary and linguistic data.
The paper is available in HTML format on the SIL WWW server. See also the preceding bibliography entry.
[CR: 19971227]
Simons, Gary F. "Using Architectural Forms to Map SGML Data Into an Object-Oriented Database." Pages 449-460 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Gary F. Simons]: Director of Academic Computing, Summer Institute of Linguistics, Dallas, TX 75236; Email: gary.simons@sil.org; Phone: +1 (972) 708-7418; FAX: +1 (972) 708-7363.
Abstract: "This paper develops a solution to the problem of importing existing SGML data into an existing object-oriented database schema without changing the SGML data or the database schema. After investigating the general problem of where the mismatch lies between the SGML model and the object model, the paper proposes a solution based on architectural processing. Two meta-DTDs are used, one to define the architectural forms for the object model and another to map the existing SGML data onto those forms."
"Much of the promise of SGML lies in the fact that descriptively marked up data can be used by multiple applications. Given the fact that an SGML DTD has much in common with the conceptual model that results from an object-oriented analysis of a problem domain, it is logical to conclude that SGML data should be particularly amenable to being imported into software that uses an object-oriented data model. This is not a trivial task, however, since there are some fundamental differences between the SGML model of data and the object model.
"The paper explores this general problem as it develops a solution to a more specific problem, namely, how to import existing SGML data into an existing object-oriented database schema without changing either the SGML data or the database schema. The target system is an object-oriented database system named CELLAR (for Computing Environment for Linguistic, Literary, and Anthropological Research). The solution uses architectural processing to map the SGML data onto architectural forms that the CELLAR system can use to construct the appropriate structure of objects."
This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.
See the related online presentation by G. Simons, Importing SGML data into CELLAR by means of architectural forms, published as an SIL Academic Computing Working Paper; also, "Using Architectural Forms to Map TEI Data Into an Object-oriented System," pages 123-129 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative: Abstracts, from the conference of November 14-16, 1997 at Brown University.
Further information on architectural forms processing and SGML architectures is available in the dedicated database section of the SGML/XML Web Page, "Architectural Forms and SGML Architectures."
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM were produced courtesy of Jouve Data Management (Jouve PubUser).
[CR: 19971205]
Simons, Gary F. "Using Architectural Forms to Map TEI Data Into an Object-oriented System." Pages 123-129 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Author's affiliation: Department of Academic Computing, Summer Institute of Linguistics; Email: Gary.Simons@sil.org.
Abstract: "This paper develops a solution to the problem of importing existing TEI data into an existing object-oriented database schema without changing the TEI data or the database schema. After investigating the general problem of where the mismatch lies between the SGML model and the object model, the paper proposes a solution based on architectural processing. Two meta-DTDs are used, one to define the architectural forms for the object model and another to map the existing SGML data onto those forms. A full example using a critical text in TEI markup is developed."
[from the Introduction]: "The paper explores this general problem as it develops a solution to a more specific problem, namely, how to import existing SGML data into an existing object-oriented database schema without changing either the SGML data or the database schema. The target system is an object-oriented database system named CELLAR (for Computing Environment for Linguistic, Literary, and Anthropological Research). The solution uses architectural processing to map the SGML data onto architectural forms that the CELLAR system can use to construct the appropriate structure of objects."
Section 1 of the paper discusses the basic differences between the SGML model of data and the object model, and illustrates why the mapping from SGML elements to objects is not a trivial one. Section 2 introduces the DTD for an architecture that maps SGML data onto objects. Section 3 gives a complete example of the automated process by which the SGML data are mapped onto this architectural DTD via an intermediate meta-DTD that encodes the mapping. The example used is that of a critical text edition encoded in TEI format. Finally, section 4 discusses the implementation and the results that have been achieved thus far.
The extended abstract for the document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/Simonspaper.html; [local archive copy]. See the main database entry for additional information about the conference, or the Brown University web site.
A related paper Importing SGML data into CELLAR by means of architectural forms is available in HTML format: see http://www.sil.org/cellar/import/. For other information on SGML architectures, see the database entry Architectural Forms and SGML Architectures.
[CR: 19990111]
Simons, Gary. "Using Architectural Processing to Derive Small, Problem-Specific XML Applications from Large, Widely-Used SGML Applications." Pages 51-60 in Markup Technologies '98 Conference Proceedings. Markup Technologies '98 Conference. Hyatt Regency, McCormick Place, Chicago, Illinois, USA. November 19 - 20, 1998. Sponsored by GCA and co-sponsored by MIT Press. Edited by the program chairs, B. Tommie Usdin, Debbie Lapeyre, and Michael Sperberg-McQueen. Alexandria, VA: Graphic Communications Association (GCA), 1998. Author's affiliation: Director of Academic Computing, Summer Institute of Linguistics.
Abstract: "The large SGML DTDs in widespread use (e.g. HTML, DocBook, CALS, EAD, TEI) offer the advantage of standardization, but for a particular project they often carry the disadvantage of being too large or too general. A given project might be better served by a DTD that is no bigger than is needed to solve the specific problem at hand, and that is even customized to meet special requirements of the problem domain. Furthermore, the project might prefer for the data it produces to meet the different syntactic constraints of XML conformity. This paper demonstrates how architectural processing can be used to develop a problem-specific XML DTD for a particular project without losing the advantage of conforming to a widely used SGML DTD. As an example, the paper develops a small XML application derived from the Text Encoding Initiative DTD. The TEI Guidelines offer a mechanism for building TEI-conformant applications; the paper concludes by proposing an alternative approach to TEI conformance based on architectures."
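As a rough illustration of deriving a conforming view from a small, problem-specific vocabulary, the Python sketch below renames the elements of a small document to element names from a larger DTD. It is not the paper's method: in architectural processing the mapping is declared in the DTD rather than in program code, and the small element names, the TO_TEI table, and the choice of target names are assumptions. The output is not checked against the TEI DTD.

```python
import xml.etree.ElementTree as ET

# A document in a small, project-specific vocabulary (illustrative sample).
SMALL = "<word><spelling>cat</spelling><meaning>a small feline</meaning></word>"

# Hypothetical mapping: small-DTD element name -> TEI element name.
TO_TEI = {"word": "entryFree", "spelling": "orth", "meaning": "def"}

def to_tei(elem):
    """Return a copy of elem with every element renamed per TO_TEI."""
    out = ET.Element(TO_TEI[elem.tag], dict(elem.attrib))
    out.text, out.tail = elem.text, elem.tail
    for child in elem:
        out.append(to_tei(child))
    return out

tei_view = ET.tostring(to_tei(ET.fromstring(SMALL)), encoding="unicode")
print(tei_view)
```

The project keeps authoring in its own small vocabulary, while the derived view is what would be validated against the larger, widely used DTD.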
Keywords: computing, humanities computing, SGML, XML, architectural forms, DTD design, conformance of derived DTDs, TEI (Text Encoding Initiative), lexicography, dictionary, Sikaiana language, Solomon Islands.
An online copy of this paper (HTML) is available in the SIL Electronic Working Papers Series.
Full abstracts and annotations for other presentations given at the Markup Technologies '98 Conference are provided in a separate document.
[CR: 19971216]
Simons, Gary F.; Thomson, John V. "Multilingual data processing in the CELLAR environment." Pages 203-234 in Linguistic Databases. [Conference on] Linguistic Databases. Centre for Language and Cognition and Centre for Behavioral and Cognitive Neuroscience, University of Groningen, Groningen, The Netherlands. March 23-24, 1995. Sponsored by the Dutch National Science Foundation (NWO), Royal Dutch Academy of Science (KNAW), et al. Edited by John Nerbonne (Computational Linguistics, and Humanities Computing, University of Groningen). CSLI Lecture Notes, Number 77. Stanford, CA: Center for the Study of Language and Information, 1998. ISBN: 1-57586-093-7 (hardback), 1-57586-092-9 (paper). Authors' affiliation: SIL Academic Computing.
Abstract: "This paper describes a database system developed by the Summer Institute of Linguistics to be truly multilingual. It is named CELLAR--Computing Environment for Linguistic, Literary, and Anthropological Research. After elaborating some of the key problems of multilingual computing (section 1), the paper gives a general introduction to the CELLAR system (section 2). CELLAR's approach to multilingualism is then described in terms of six facets of multilingual computing (section 3). The remaining sections of the paper describe details of how CELLAR supports multilingual data processing by presenting the conceptual models for the on-line definitions of multilingual resources."
[CR: 19950716]
Simons, Gary F.; Thomson, John V. Multilingual data processing in the CELLAR environment. Paper presented at: Linguistic Databases, 23-24 March 1995, University of Groningen, Centre for Language and Cognition and Centre for Behavioral and Cognitive Neurosciences. Dallas, TX: SIL Academic Computing, July, 1995. Extent: 99K, approximately 46 pages; 10 figures. Authors' affiliation: SIL Academic Computing, CELLAR Project.
Abstract: "This paper describes a database system developed by the Summer Institute of Linguistics to be truly multilingual. It is named CELLAR--Computing Environment for Linguistic, Literary, and Anthropological Research. After elaborating some of the key problems of multilingual computing (section 1), the paper gives a general introduction to the CELLAR system (section 2). CELLAR's approach to multilingualism is then described in terms of six facets of multilingual computing (section 3). The remaining sections of the paper describe details of how CELLAR supports multilingual data processing by presenting the conceptual models for the on-line definitions of multilingual resources."
The paper is only marginally relevant to SGML, but aligns itself philosophically with many of the central impulses of SGML. SGML is in fact used in the CELLAR Project in several ways (within encoding models), as will be more clearly illustrated in another paper by Simons [bibliographic reference; direct link].
Available on the SIL WWW server. See other information on CELLAR on the main CELLAR page.
Sirrine, Susan. "What is SGML?" InfoWorld 15/13 (March 29 1993) 76-[?].
"Abstract: The Standard Generalized Markup Language (SGML), a methodology for modeling document contents and identifying structural and content elements, has been established as an international standard. An SGML document consists of 3 main elements: (1) the SGML Declaration, a header that establishes the environment, (2) the 2nd is the Document Type Definition, which is like a template of tags identifying the document's structural and contextual elements and the relationship between the elements, (3) the Document Instance, which is the actual marked-up text. The ultimate impact of SGML is far-reaching. The ability to store information centrally solves the problem of keeping it current. It also makes on-demand publishing possible. Both Frame and Interleaf market structured-document and SGML application-development software, and a PC version is expected from both companies by the end of 1993. WordPerfect has also just shipped Intellitag, its Unix version of a package that offers SGML conversion capabilities."
[CR: 19971125]
Skinner, Eric. "Making SGML Easier with Microdocument Databases." Page(s) 319 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Senior Program Manager, OmniMark Technologies Corporation, Canada.
Abstract: "The abilities to deliver vast amounts of corporate information on-line in real time, with sophisticated hypertext navigation aids, and the accelerating system complexity of products and corporate processes have converged to drive a new paradigm: component-based documentation development. The microdocument architecture is a vendor-independent hybrid of SGML and RDBMS methodologies that enables the delivery of personalized virtual documents. Illustrations of successful virtual document implementations and an overview of business and project leader implementation issues are provided."
Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.
[CR: 19970212]
Skinner, Eric; LaSalle, Benoit. "Using Micro-documents and Hybrid Distributed DataBases for Building up Hypertext-rich Content On-line Servers." In: Proceedings of the 3rd Annual Conference on the Practical Use of SGML. "A Decade of Power." Third Annual [Belux] Conference on the Practical Use of SGML. Business Faculty, Sint-Lendriksborre 6, Brussels, Belgium. October 31, 1996. Sponsored by SGML Belux (Belgian-Luxembourg Chapter of the International SGML Users' Group). Leuven, Belgium: Belux, 1996. Author's affiliation: OmniMark Technologies.
Summary: "A presentation of the Hybrid Distributed DataBase : Modelling the information into information units (SGML micro-documents) in combination with RDBMS and Full-text retrieval engines."
See also Skinner and McFadden, "Microdocument Database Architectures," published in <TAG> 1996, with other references. For further information on the conference, see: (1) the description in the conference announcement and call for papers, and (2) the full program listing, or (3) the main conference entry in the SGML/XML Web Page.
[CR: 19961107]
Skinner, Eric; McFadden, John. "Microdocument Database Architectures." <TAG> 9/10 (October 1996) 1-7. ISSN: 1067-9197. Authors' affiliation: [Skinner]: Senior Program Manager, OmniMark Technologies Corporation; [McFadden]: CEO and Founder, OmniMark Technologies Corporation.
"The Microdocument Database (MDDB) is a conceptual model for a system that can deliver user-independent virtual documents. In MDDB, the strengths of SGML are combined with the proven flexibility of relational databases, creating a hybrid data structure. Narrative text is organized into independent information units called microdocuments. Related data objects and and dependencies between microdocuments are expressed in the database schema. Inside a microdocument, SGML markup is used to encode the structure internal to the contained text." [extracted]
Apropos of the 'Microdocument Database (MDDB),' see on the OmniMark WWW server: (1) "OmniMark and the Hybrid Distributed Database Model", and (2) "OmniMark and the Automation of Internet Publishing".
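The hybrid structure described in this entry, with marked-up text inside each microdocument and relationships between microdocuments held in relational tables, can be sketched with Python's built-in sqlite3 module. The table names, column names, and sample content are assumptions for illustration and are not taken from the MDDB design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE microdoc (id INTEGER PRIMARY KEY, body TEXT);
-- Dependencies between microdocuments live in the schema, not in the markup.
CREATE TABLE depends (parent INTEGER, child INTEGER, position INTEGER);
""")
conn.executemany("INSERT INTO microdoc VALUES (?, ?)", [
    (1, "<topic><title>Pump</title></topic>"),
    (2, "<step>Press start.</step>"),
    (3, "<step>Check the seal.</step>"),
])
conn.executemany("INSERT INTO depends VALUES (?, ?, ?)", [(1, 3, 2), (1, 2, 1)])

def virtual_document(root_id):
    """Assemble a virtual document: the root unit, then its children in order."""
    root = conn.execute(
        "SELECT body FROM microdoc WHERE id = ?", (root_id,)).fetchone()[0]
    rows = conn.execute(
        "SELECT m.body FROM depends d JOIN microdoc m ON m.id = d.child "
        "WHERE d.parent = ? ORDER BY d.position", (root_id,))
    return [root] + [body for (body,) in rows]

print(virtual_document(1))
```

The markup inside each body encodes only the structure internal to that unit; the order and membership of units in the assembled document come from the depends table.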
[CR: 19950716]
Sklar, David. "Accelerating Conversion to SGML via the Rainbow Format." <TAG> 7/1 (January 1994) 4-5. ISSN: 1067-9197.
The article describes "up-translation" of data from proprietary formats produced by word-processor or desktop-publishing software to generic SGML encoding. A number of SGML vendors, including EBT (Electronic Book Technologies), have designed an SGML format that can be used as a target for "up-translation." From that format, data can be moved to other industry-standard (SGML) formats, or directly to SGML-compliant applications which can read Rainbow. See the entry for Rainbow in this database.
[CR: 19950716]
Sloan, D. "Aspects of Music Representation in HyTime/SMDL." Computer Music Journal 17/4 (Winter 1993) 51-59 (with 2 references). Author's affiliation: Department of Music, Ashland University, OH, USA.
"Abstract: In 1986, the American National Standards Institute (ANSI) authorized a working group, X3V1.8M, to study the development of a standard for the computer representation of musical information. The work of this group has led to two related standards: Hypermedia/Time-based Structuring Language, HyTime (S. R. Newcomb et al., 1991) and Standard Music Description Language, SMDL. HyTime is a standard for scheduling and addressing in any medium, music or otherwise, while SMDL covers those aspects specific to music. ANSI has proposed both HyTime and SMDL as ISO standards. HyTime has been approved and will shortly be published with the number ISO/IEC IS 10744:1992. SMDL is still in the committee draft stage and has been given the number ISO/IEC CD 10743. There has been much vigorous debate in the computer music community over the work of X3V1.8M. Some have argued that there are de facto standards already in use, obviating the need for a new language. Others have debated the design chosen by the ANSI committee. Still others do not believe that the music community will enjoy more benefit than harm from having a standard at this point in time."
Smit, G. de V.; Cowan, D. D. Manipulating Partial Documents in a Syntax-Directed Environment. Technical Report CS-90-02. Waterloo, Ontario: Department of Computer Science, University of Waterloo, Waterloo, Ontario, Canada, January, 1990.
Smith, Craig. "Beyond Document Structure - SGML as a Software Development Tool." Pages 139-144 (with 8 references) in PROTEXT IV. Proceedings of the Fourth International Conference on Text Processing Systems. International Conference on Text Processing Systems, Boston, MA, USA 20-22 October 1987. Sponsored by INCA - Institute for Numerical Computation and Analysis. Edited by John J. H. Miller. Dun Laoghaire, Ireland: Boole Press, Ltd., 1987. vii + 153 pages. ISBN: 0-906783-80-1 (hardback); 0-906783-79-8 (paperback). Author's affiliation: Gesellschaft für Mathematik und Datenverarbeitung, Berlin, West Germany.
Abstract: The paper shows how the document description standard SGML can be applied in software development. It is shown how this can be advantageous when building applications of SGML.
[CR: 19971202]
Smith, David A. "Textual Variation and Version Control in the TEI." Pages 131-136 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Author's affiliation: Perseus Project, Tufts University; Email: dasmith@perseus.tufts.edu.
Summary: "The Text Encoding Initiative Guidelines for encoding critical apparatus (Chapter 19) draw heavily on the text collation tradition and provide useful tools for basic text variation at the word and character level, but they fail to address the need for encoding variation in text structures other than, or larger than, the words and punctuation of a document. With software version control systems, the problem is often reversed: multiple variants within one line are represented as if they were one. The principles behind the design of software version control systems, nevertheless, can inform our work with tagging textual variants, and lead to some solutions for tagging larger structural variation. These problems with version control and textual variation presented themselves in my work for the Perseus Project, and Perseus texts will illustrate the principal issues. [...] The ease with which we can represent this sort of inter-variant communication makes SGML and the TEI Guidelines a good basis on which to build a textual variant system, which more closely meets the needs of the editors of variant literary texts than available version control systems. With some extensions, the TEI can be made to encode more sophisticated variant structures and to satisfy the requirements, though not the efficiency, of a full-fledged version control system." [extracted]
A Marlowe web site is "currently under construction at Tufts University as part of the Perseus Project, a digital library for the study of ancient Greece and Rome. This SGML-encoded edition of the complete works of Christopher Marlowe and his sources has been produced according to TEI standards."
The extended abstract for the document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/smith.html; [local archive copy]. See the main database entry for additional information about the conference, or the Brown University web site.
[CR: 19961226]
Smith, Holly. "SGML Users' Groups...Who Needs 'Em Anyway." Pages 653-658 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Lexicon Systems, Inc., 6165 Lehman Drive, Suite 204, Colorado Springs, Colorado 80918, USA; Tel: 719-593-8971; FAX: 719-593-9268; Email: hollyd@lexisys.com; WWW: http://www.lexisys.com.
Abstract: "As the SGML community continues to grow, users are seeking new support structures, new sources of information, new technology, and new ways of applying SGML. The result is a number of emerging SGML interest groups, not just around the U.S., but around the world. Just over a year ago, I helped revive the defunct Rocky Mountain SGML Users' Group in Colorado. The journey to a strong, productive users' group has been long, and not without hurdles. However, the benefits are many for everyone involved, and the learning experiences have been invaluable. This paper presents ten good reasons to start an SGML users' group, who should be involved in organizing a users' group, how to get started on the right foot, what people can expect to happen during different stages of users' group development, common problems that tend to crop up and how to deal with them effectively, and the dos and don'ts of managing a users' group."
Another paper discussing the role and operation of SGML user groups was presented at SGML '96 by Richard Barth.
Note: The above presentation was part of the "And More..." track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
[CR: 19951113]
Smith, Joan M. "The Computer and Publishing: An Opportunity for New Methodology." Pages 107-113 in PROTEXT II. Proceedings of the Second International Conference on Text Processing Systems. International Conference on Text Processing Systems, Dublin, Ireland 23-25 October 1985. Edited by John J. H. Miller. Dublin, Ireland: Dún Laoghaire, Boole Press, Ltd., 1987. vii + 215 pages. ISBN: 0-906783-50-X (hardback); 0-906783-53-4 (paperback).
"Abstract: Computers and associated devices are used increasingly for the input of copy on a word processor or other text entry system; perhaps sending a copy to a reference who may return an edited form, possibly using a floppy disk; and maybe for this copy to have codes inserted in it before its publication. In general, these codes have related to specific typesetters: they are device-dependent. But generic codes could be inserted, giving increased flexibility. The changing face of publishing is examined, not only computer-assisted publishing and electronic publishing but above all database publishing. Its relevance to publishers in the more traditional sense and those involved with in-house publishing is considered. The Standard Generalized Markup Language (SGML) is presented as the solution."
[CR: 19970314]
Smith, Joan M. "A Report of the MarkUp '88 Events." SGML Users' Group Bulletin 3/2 (1988) 62-66. ISSN: 0269-2538. Author's affiliation: Independent Information Consultant.
The author reports on the highlights of the MarkUp conference sponsored by GCA. It was held in Ottawa, Ontario, on May 24-26, 1988. Another report of the conference is available in "The MarkUp '88 Conference", published in the SGML Users' Group Newsletter Number 9 (August 1988) 13-14.
[CR: 19961210]
Smith, Joan M. "Report on [International] MarkUp '89 [Conference]." SGML Users' Group Bulletin 4/1 (1989) 39-42. ISSN: 0269-2538. Author's affiliation: [Independent Consultant], 17 Tanza Road, Hampstead, London NW3 2UA, UK.
A detailed account of the MarkUp '89 Conference sponsored by GCA and the International SGML Users' Group, held in Gmunden, Austria, April 11-14, 1989.
Note: The volume editor for SGML Users' Group Bulletin 4/1 is David W. Penfold (Edgerton Publishing Services, Huddersfield, UK).
[CR: 19980126]
Smith, Joan M. SGML and Related Standards. Document Description and Processing Languages. Ellis Horwood Series in Computers and their Applications. New York/London: Ellis Horwood, 1992. xviii + 152 pages. ISBN: 0-13-806506-3.
The book supplies a valuable survey from the perspective of Joan Smith, who served as a leading SGML advocate in the UK for many years. Smith is an independent consultant, and founder of the International SGML Users' Group. See the publisher's description of the volume and the Table of Contents for a document overview. The volume is available for purchase through the International SGML Users' Group.
See also the book review by Simon Wickes in <TAG> magazine, May 1993.
Smith, Joan M. SGML Products and Services. CALS in Europe SIG, 1990- [various].
A document covering primarily CALS-SGML, produced by Joan Smith for the CALS in Europe SIG. Periodically updated. The cost is approximately 20 UK pounds. Contact: David Ardron, Secretary, CALS in Europe SIG; Ferranti Computer Systems Ltd.; Western Road, Bracknell, Berkshire RG12 1RA; UNITED KINGDOM; TEL: +44-344-483232.
Smith, Joan M. "The Standard Generalized Markup Language (SGML) for Humanities Publishing." Literary and Linguistic Computing 2/3 (1987) 171-175. ISSN: 0268-1145.
Abstract: A new methodology, the core of which is generic coding, has been developed within the International Organization for Standardization (ISO). This is known as the Standard Generalized Markup Language (SGML). Using SGML, the elements of a document are marked up as to their role, be it a paragraph, an abstract, a note, or whatever; the style of presentation is a separate issue and is not addressed by SGML. These elements can form part of a data base, which can be updated at will. So there is the notion of data base publishing. The Standard Generalized Markup Language is presented as a tool for full-text data base publishing, where the options for output are open, an example being given as a marked up document. Its value for all aspects of humanities publishing is addressed: whether for scholarly papers intended for a journal, books, specialist publications, dictionaries, or biographies, indeed whatever is input to an electronic medium with the intention of being imaged subsequently in some form; whether alone, in part, or in combination with other text. SGML represents an advance in publishing methodology, taking advantage of developing technology. It can be exploited as such in an academic environment to give an added dimension to research publications.
Smith, Joan M. "Standard Generalized Markup Language and Related Standards." Computer Communications 12/2 (April 1989) 80-84. ISSN: 0140-3664. CODEN: COCOD7.
Abstract: Projects developed by the International Organization for Standardization-International Electrotechnical Commission Joint Technical Committee 1-Subcommittee 18-Working Group 8 are described here, with the working group concentrating on the formulation of standards for text description and processing languages in the broader domain of text and office systems. Central to the work of WG 8 is ISO 8879 Standard Generalized Markup Language for the description of the information content of documents. Other standards and technical reports produced by the group support SGML in some way, either directly or indirectly. Their role in office publishing is described, and some information is given about office applications and the products that are available in the marketplace.
Joan Smith has contributed numerous articles covering (SGML) standards updates. E.g., see "Standards," Literary and Linguistic Computing 4/4 (1989) 294-296; "Standards," Literary and Linguistic Computing 4/1 (1989) 57-58; "Standards," Literary and Linguistic Computing 1/3 (1986) 191-192.
Smith, Joan M. The Standard Generalized Markup Language (SGML): Guidelines for Editors and Publishers. British National Bibliography Research Fund, 26. Boston Spa [UK]: British National Library, 1987. ISBN: 0-7123-3111-5. ISSN: 0264-2972.
The abstract for Smith's "Authors" volume (see here) generally pertains to this document as well.
Smith, Joan M. The Standard Generalized Markup Language (SGML): Guidelines for Authors. British National Bibliography Research Fund, 27. Boston Spa [UK]: British National Library, 1987. ISBN: 0-7123-3112-3. ISSN: 0264-2972.
Abstract: These guidelines are for authors of scholarly publications who wish to prepare documents for a publisher on existing text entry devices, word processors and personal computers, adding markup to the text in accordance with the Standard Generalized Markup Language (SGML). A simple approach is adopted, based on the concept of a starter set of tags. An explanation of SGML is given and why markup should be used, and advice provided on what is to be done if the author has a publisher, has not yet got a publisher, or is his or her own publisher. As far as the preparation of the document is concerned, there is advice on keying conventions, when not to use stylistic and formatting characteristics of the system, and conditions under which its features and facilities may be used. The starter set of tags is explained, and how to deal with lists, tables, and figures. Cross referencing is addressed and the preparation of an index -- all with examples. Information is given on how to extend the starter set and how to cope with text the author may not be able to mark up for any reason. How to deal with characters for printing, that cannot be imaged on the text entry device, is explained, also how to use abbreviations for lengthy character strings of a repetitive nature. For all other issues, the author is referred to the publisher, to the companion 'Guidelines for Editors and Publishers', and to the standard itself.
[CR: 19951113]
Smith, Joan M. "The Use of SGML in the Information Market." Pages 63-74 in Protext III. Proceedings of the Third International Conference on Text Processing Systems. International Conference on Text Processing Systems. Trinity College, Dublin. 22-34 October, 1986. Edited by J. J. H. Miller. Dublin, Ireland: Dún Laoghaire, Co., Boole Press Ltd., January 1987. ISBN: 0-906783-55-0 (hardback); 0-906783-56-9 (paperback).
"Abstract: The Standard Generalized Markup Language (SGML) received the seal of approval of member bodies of the International Organization for Standardization (ISO) and its publication as an international standard is expected at the end of 1986. It is a standard for full-text data base publishing where this includes computer-assisted publishing and electronic publishing. The methodology is such that the marked up text may be exploited to produce a multiplicity of products from the same data base, the markup being such that the text can be printed or displayed at will in a variety of styles. Application of generic coding methods will give rise to greater freedom in publishing where there can be exploitation of a corporate data base. Information is given on the way some of the sectors in the information market are taking up SGML. How SGML may be applied by means of a starter document type is described, where this may readily be modified or extended dependent on the specific application."
Smith, Joan M.; Stutely, Robert S. SGML: The Users' Guide to ISO 8879. Chichester/New York: Ellis Horwood/Halsted, 1988. 173 pages. ISBN: 0-7458-0221-4 (Ellis Horwood). ISBN: 0-470-21126-1 (Halsted); LC CALL NO: QA76.73.S44 S44 1988.
The book's features are as follows: (1) it supplies a list of some 200 syntax productions, in numerical and alphabetical sequence; (2) it gives a combined abbreviation list; (3) it includes highly useful subject indices to ISO 8879 and its annexes; (4) it supplies graphic representations for the ISO 8879 character entities; (5) it lists SGML keywords and reserved names. A more complete overview of the book may be found in the SGML Users' Group Newsletter 9 (August 1988) 9.
Smith, MacKenzie. "DynaText: An Electronic Publishing System [Review of Electronic Book Technologies' DynaText program]." Computers and the Humanities 27/5-6 (1993-1994) 415-420. 10 references. Author affiliation: Chicago University, IL, USA/Harvard University.
Abstract: "DynaText is an electronic book publishing system that allows you to produce ready-to-ship books, or collections of books, on a variety of media such as diskette and CDROM. Several computer platforms are supported including UNIX (using X-windows), MS-Windows, and Macintoshes. The complete system consists of a compiler and indexer that allow a publisher to build an electronic book, and a browser that allows readers to display and navigate in the book, and perform searches in the text. It is one of the few publishing systems to take full advantage of SGML, while incorporating popular features of electronic books such as hypertext linking. With DynaText you can take ordinary text, vector and raster graphics, tables, equations, audio and video clips, and add several types of hypertext links, context-sensitive keyword search capabilities, or multiple views of a document. It also has the ability to launch other programs from inside a text and return the reader to the text at a later point. The DynaText publishing system is a complex and sophisticated tool for producing high quality electronic books on most of the major computer platforms. Its requirement of SGML compliant documents as input usually means a longer process before the book can be produced, but also means that you are not tied to the system in the future, since your texts can be ported easily to other platforms and systems. The ability of users to annotate texts and create their own hypertext links seems particularly valuable to humanities text publishers. DynaText's support of the full range of hypertext and windowing features makes it very easy for publishers to design and readers to use. For academics with large corpora to publish this type of system, while expensive, is one of the few reasonable options."
Smith, Norman E. Managing WEB Documents With OmniMark. Paper presented at the 1994 OmniMark User's Group Meeting (OMUG) in Tyson's Corner, Virginia. Oak Ridge, TN: DOE, Office of Scientific and Technical Information, Science Applications International Corp., November 6, 1994. Author's affiliation: Norman E. Smith; Science Applications International Corp.; P.O. Box 2501; 301 Laboratory Road; Oak Ridge, TN 37831-2501; (615) 576-2276; Email: smithn@zeus.osti.gov.
"Abstract: The Department of Energy (DOE) Office of Scientific and Technical Information (OSTI) set its World Wide Web (WWW) Server up as a Standard Generalized Markup Language (SGML) application from the very beginning. SGML processing is built around OmniMark. Web HyperText Markup Language (HTML) documents are parsed with OmniMark and SGML syntax errors corrected before being loaded on the production Web Server. Automation of hypertext links is an absolute necessity as the number of documents on a server grows to prevent dangling hyperlinks. SGML provides the automation vehicle for the OSTI Web Server. Hypertext links are managed via SGML and the parsing process. Each document is given a logical name which is set up as an SGML entity reference. The SGML entity contains the Universal Resource Locator (URL) for the document. The OmniMark program substitutes the proper URL for the logical name reference automatically generating valid hyperlinks. The SGML approach has made possible several complete reorganizations of the file structure on the Web Server with minimal impact on either outside access or staff sanity. This paper examines using OmniMark in managing Web Servers from an SGML prospective. This document describes work performed at the DOE Office of Scientific and Technical Information under contract DE-ACO5-91MA40061."
The document is available online from DOE/OSTI, or in mirror copy here.
Smith, Norman E. Managing Web Documents with SGML. DOE/OSTI Research Report. Oak Ridge, TN: DOE, Office of Scientific and Technical Information, Science Applications International Corp., 1994 [1995?]. approximately 13 pages.
"Abstract: The DOE Office of Scientific and Technical Information (OSTI) set its World Wide Web (WWW) Server up as an SGML application from the very beginning. Web HyperText Markup Language (HTML) documents are parsed and SGML syntax errors corrected before being loaded on the production Web Server. Automation of hypertext links is an absolute necessity as the number of documents on a server grows to prevent dangling hyperlinks. SGML provides the automation vehicle for the OSTI Web Server. Hypertext links are managed via SGML and the parsing process. Each document is given a logical name which is set up as an SGML entity reference. The SGML entity contains the Universal Resource Locator (URL) for the document. The SGML parser substitutes the proper URL for the logical name reference automatically generating valid hyperlinks. The SGML approach has made possible several complete reorganizations of the file structure on the Web Server with minimal impact on either outside access or staff sanity. This paper examines the issues of managing Web Servers from an SGML prospective. This document describes work performed at the DOE Office of Scientific and Technical Information under contract DE-ACO5-91MA40061."
Available from the DOE/OSTI WWW server Managing Web Documents With SGML, by Norman E. Smith [or in mirror copy here].
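The link-management idea described in the two abstracts above (each document has a logical name, the logical name is an entity whose value is the document's URL, and processing replaces each reference with the current URL) can be illustrated with a minimal Python sketch. This is not OmniMark code and not the OSTI implementation; the entity names, the example.org URLs, and the `resolve_links` helper are all hypothetical, and the sketch deliberately treats every `&name;` reference as a link entity, ignoring ordinary character entities such as `&amp;`.

```python
import re

# Hypothetical entity table: logical document names mapped to current URLs.
# Reorganizing the server means editing only this table, not the documents.
ENTITIES = {
    "annual.report": "http://www.example.org/reports/annual.html",
    "staff.list": "http://www.example.org/people/staff.html",
}

# Simplified pattern for an entity reference of the form &name;
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.\-]*);")


def resolve_links(text, entities):
    """Replace each &name; reference with its URL.

    An undefined name raises KeyError, so a dangling link is caught
    before the document is published rather than after.
    """
    def substitute(match):
        name = match.group(1)
        if name not in entities:
            raise KeyError(f"undefined entity: {name}")
        return entities[name]

    return ENTITY_REF.sub(substitute, text)


source = '<a href="&annual.report;">Annual report</a>'
resolved = resolve_links(source, ENTITIES)
```

Because documents carry only logical names, moving a file changes one table entry; every referring document picks up the new URL the next time it is processed.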
[CR: 19971229]
Smith, Norman E. Practical Guide to SGML Filters. Wordware's Advanced Book Series. : Wordware Computer Books, 1996. Extent: 450 pages. ISBN: 1-55622-511-3 ["$44.96, CN $69.95, AU $95.95"]. Author's affiliation: SAIC; Email: norman.e.smith@cpmx.mail.saic.com.
Abstract: "This book provides comprehensive coverage of this important language of the Internet programming environment including case studies and two disks which contain the OmniMark Sampler, a fully functional commercial SGML parser. Also included with the disks is the PC version of Electronic Book Technologies RTF to Rainbow SGML conversion. Norman Smith, CDP, is a senior systems analyst and programmer for Science Applications International, Corp. with over twenty years of experience with programming with SGML as a specialty. Book With Diskettes."
"This book provides coverage of SGML, including case studies and disks containing the OmniMark Sampler, a fully functional commercial SGML parser. The book is logically divided into three sections. The first covers background material on writing SGML/HTML file filters. The middle section is a chapter on each of five languages used in the case studies. These languages include AWK, C, OmniMark, Perl, and S-Engine (a Forth-based language). The language coverage is more than a "quick reference", but less than a tutorial. The idea is to present enough of the language to give you a feel for it and to aid understanding of the code in the case studies. The final section is a group of case studies, with implementation in two or more of the five languages. The case studies are: - Structured ASCII to SGML - SGML to HTML - SGML to TeX - SGML to SGML - ASCII to HTML - RTF to SGML - SGML to RTF The disks with the book include a demo copy of OmniMark plus AWK, Perl, Rainbow DTD converter, and all of the code from the book."
A more detailed description of the book is available in an announcement posted to CTS. The accompanying diskettes are [January 17, 1997] available for download from the Wordware server; diskette #1, diskette #2. Also, see pre-publication information: Wordware: http://www.wordware.com/page3.html. Wordware: 1-800-229-4949
[CR: 19961018]
Smith, Philip N.; Brailsford, David F. "Towards Structured, Block-Based PDF." Pages 153-165 (with 23 references) in EP '96. Proceedings of the Sixth International Conference on Electronic Publishing, Document Manipulation and Typography. [ = Journal Special Issue: Electronic Publishing - Origination, Dissemination and Design (EPODD), June & September 1995, Volume 8, Issues 2-3.] Sixth International Conference on Electronic Publishing, Document Manipulation and Typography, Palo Alto, California. September 24-26, 1996. Sponsored by Adobe Systems Incorporated; School of Information Management and Systems, University of California at Berkeley; Xerox Corporation. [Proceedings Volume] Edited by Allen Brown, Anne Brüggemann-Klein, and An Feng; [Journal] Editors David F. Brailsford and Richard K. Furuta. Chichester/ New York: John Wiley & Sons, 1996. ISSN: 0894-3982. Authors' affiliation: .
Abstract: "The Portable Document Format (PDF), defined by Adobe Systems Inc. as the basis of its Acrobat product range, is discussed in some detail. Particular emphasis is given to its flexible object-oriented structure, which has yet to be fully exploited. It is currently used to represent not logical structure but simply a series of pages and associated resources."
"A definition of an Encapsulated PDF (EPDF) is presented, in which EPDF blocks carry with them their own resource requirements, together with geometrical and logical information. A block formatter called Juggler is described which can lay out EPDF blocks from various sources onto new pages. Future revisions of PDF supporting uniquely-named EPDF blocks tagged with semantic information would assist in composite-page makeup and could even lead to fully revisable PDF."
For other conference information, see the main conference entry for EP '96, or the brief history of the conference as sixth in a series since 1986. See the volume main bibliographic entry for a linked list of other EP '96 titles relevant to SGML and structured documents.
[CR: 19971125]
Smith, Tracy. "Intuitive SGML: Database Integration in SGML Authoring." Page(s) 119-120 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: Documentation Systems Consultant, Novell Inc.; Email: trsmith@novell.com.
Abstract: "Authoring in SGML is difficult and time consuming. Creating SGML documents is costly and complex. Although many of the SGML authoring tools available provide superior SGML functionality, many are not intuitive. This paper will discuss Novell's approach to creating structured hypertext documents intuitively and efficiently by integrating and customizing current database and SGML authoring technologies. The main goal of the system Novell developed is to optimize the author's ability to create and manage structured content.
"The focus of the presentation will be a demonstration of the tool Novell developed to solve many of these problems."
Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.
[CR: 19971227]
Smith, Walter. "OpenTag Initiative: Common Data Extraction and Abstraction Method for Translation and NLP Activities." Pages 113-132 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Walter Smith]: International Language Engineering Corporation, 1600 Range Street, Boulder, CO 80301; Email: walters@ile.com.
Abstract: "The OpenTag format proposes to use the power of an open standard (XML) to access valuable information hidden away in private-format files. One of the primary benefits of using the OpenTag format to leverage information is that you don't have to change anything about the way you're currently working. Users of FrameMaker or Interleaf can continue to author and publish in their familiar environments, and still benefit without ever converting to a complete SGML/XML solution. Of course, certain tweaks to your development techniques can maximize your return on information investment. One of the biggest challenges is to efficiently access text when it's embedded within code and other non-textual data in a multitude of different formats, so using a standard method of marking up that extracted text can greatly boost the efficiency with which it can be consistently reused.
"The OpenTag Initiative is a working group in which both localization customers and their suppliers are defining a standard that will support open data encoding methods during the localization process, and permit robust data interchange between suppliers and customers. The OpenTag format is a single common markup format to encode text extracted from documents of varying and arbitrary formats. By abstracting a file's heterogeneous formatting information into OpenTag markup, you can produce homogeneously tagged text files, regardless of the original file format. Rather than converting information from 'format X' into the OpenTag format, data are extracted from 'format X', manipulated in an OpenTag environment, and later merged back into the 'format X' file."
This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.
For more on OpenTag markup, see the dedicated database entry for the OpenTag Initiative, and its relationship to other early 'XML' applications.
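The extract, manipulate, and merge cycle described in the abstract above can be pictured with a small Python sketch. It does not use the OpenTag format itself: the toy "format X" (text runs wrapped in `[[...]]`, everything else treated as opaque formatting code), the numbered placeholders, and the `extract` and `merge` helpers are all assumptions made for illustration only.

```python
import re

# Toy "format X": translatable text sits inside [[...]]; everything else
# is treated as opaque formatting code that must survive untouched.
TEXT_RUN = re.compile(r"\[\[(.*?)\]\]")
PLACEHOLDER = re.compile(r"\[\[#(\d+)\]\]")


def extract(document):
    """Split a document into a skeleton with numbered placeholders
    and the list of text segments that were pulled out of it."""
    segments = []

    def stash(match):
        segments.append(match.group(1))
        return f"[[#{len(segments) - 1}]]"

    skeleton = TEXT_RUN.sub(stash, document)
    return skeleton, segments


def merge(skeleton, segments):
    """Put (possibly modified) segments back into the skeleton."""
    return PLACEHOLDER.sub(
        lambda m: "[[" + segments[int(m.group(1))] + "]]", skeleton
    )


original = "{bold}[[Hello]]{/bold} {p}[[world]]{/p}"
skeleton, segments = extract(original)

# Stand-in for translation: any per-segment processing would go here.
translated = [s.upper() for s in segments]
localized = merge(skeleton, translated)
```

The segments can be handed to a translator or NLP tool in one uniform shape whatever the source format was, while the skeleton keeps the original file's formatting so the result can be written back in "format X".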
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).
[CR: 19950903]
SoftQuad, Inc. The SGML Primer. SoftQuad's Quick Reference Guide to the Essentials of the Standard: The SGML Needed for Reading a DTD and Marked-up Documents and Discussing them Reasonably. Version 3.0 = Correction and revision of Version 2.0, May 1991. Toronto: SoftQuad Inc., December, 1991. 36 pages.
This SGML Primer from SoftQuad provides a highly readable and even enjoyable introduction to the essential concepts and features of SGML. It may be one of the best brief treatments of SGML you can find -- something you can lend to colleagues without fear of having them turned off by the unavoidable complexity of SGML. The book consciously attempts a popular presentation, using clever illustrations, some surprising examples (structured events in the world of cuisine art, recipe for a biblical mythology), and a bare minimum of technical language. It is available from SoftQuad Inc.; 56 Aberfoyle Crescent, Suite 810; Toronto, Ontario; Canada M8X 2W4; TEL: +1 (416) 239-4801; FAX: +1 (416) 239-7105.
SoftQuad Inc. deserves our thanks for creating the [1995] online edition of The SGML PRIMER. The paper print version is probably still prettier, but a lot of work has been done using color graphics to make this online version a highly usable SGML introduction. When someone asks for an online crash course in SGML essentials (e.g., "before tomorrow morning at 8:00"), I recommend that you point them to the URLs below. See: (1) SGML Primer: Introduction, and (2) The SGML Primer: Main Text. Or local copy: introduction, main section.
SoftQuad, Inc. The SGML World Tour. Toronto, Ontario: SoftQuad, Inc., Spring, 1994. ISBN: 1-896172-01-6.
This publication is a large and valuable library of SGML resources on CDROM disk. It may be ordered for $24.00 US from SoftQuad. Tel: 1-800-387-2777 (1 416 239-7105). For more on SoftQuad's SGML products, see their WWW home page, and the SGML World Tour Features: A World of SGML Resources on CD-ROM [was/check: description of the SGML World Tour (mirrored here)].
[CR: 19961030]
Soutberg, Jeroen. "SGML and TeX at Elsevier Science Publishers." MAPS (Minutes and Appendices- Nederlandstalige TeX Gebruikersgroep) 5 (November 1990) 85-88.
[Reference is from the PREMIUM Project]
[CR: 19951113]
Southall, Richard. "Presentation Rules and Rules of Composition in the Formatting of Complex Text." Pages 275-290 (with 27 references) in EP [Electronic Publishing] 92: Proceedings of Electronic Publishing, 1992 International Conference on Electronic Publishing, Document Manipulation, and Typography. Swiss Federal Institute of Technology, Lausanne, Switzerland. April 7-10, 1992. Sponsored by the Swiss Federal Institute of Technology and the Swiss National Science Foundation. Edited by Christine Vanoirbeek and Giovanni Coray [EPF, Lausanne, Switzerland]. The Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1992. ISBN: 0-521-43277-4. Author affiliation: Faculty of Design for Manufacture, London College of Art, UK.
"Abstract: The configuration of the actual document produced when a generically marked-up virtual document is formatted depends on rules of composition which govern the action of the formatting system, as well as on the presentation rules associated with the document. Rules of composition are of two kinds: spacing rules and rules of orthography. Statements of such rules in compositors' manuals from the era of metal-type composition are quoted, and their underlying rationales discussed. The application of rules of composition by present-day document formatting systems depends on the explicit delimitation of compositional environments in generically marked-up documents, and on the systems' ability to deal explicitly with visual structure."
[CR: 19950804]
Sperberg-McQueen, C. Michael. "Bare bones TEI: A very very small subset of the TEI Encoding Scheme." Electronic Texts and the Text Encoding Initiative [Special Issue] = TEXT Technology: The Journal of Computer Text Processing 5/3 (Autumn, 1995) 248-265. ISSN: 1053-900X. Author's affiliation: Senior Research Programmer, University of Illinois at Chicago; TEI editor.
"The volume concludes with a simple introduction to the bare bones of the TEI scheme intended to whet the appetite of the reader for a more detailed and thorough exposition. Written by my esteemed colleague and co-editor of the TEI Guidelines, Michael Sperberg-McQueen, it presents the bare essentials of the TEI encoding scheme, in a copiously illustrated and very accessible form, designed specifically for the novice text encoder." [from the issue Introduction, by Lou Burnard]
See the main entry for this special issue of TEXT Technology dedicated to the TEI, edited by Lou Burnard. See also the online version of this particular article.
[CR: 19950716]
Sperberg-McQueen, C. Michael. Bare Bones TEI: A Very Very Small Subset of the TEI Encoding Scheme. TEI Document No. TEI U6. 30 Aug 1994, rev. June 1995. Chicago, IL: University of Illinois at Chicago, June, 1995. Extent: approximately 26 pages. Author's affiliation: Computer Center, University of Illinois at Chicago.
"Bare Bones TEI: A Very Very Small Subset of the TEI Encoding Scheme (document no. TEI U6) describes a very small set of tags for users first learning the TEI encoding scheme. The tag set described is small enough to be non-threatening, but probably not large enough for serious work with real texts --- it's about the same size as the first versions of HTML. Available [July 1995] in three forms."
Availability: SGML form (using the TEI Lite DTD); HTML form in multiple small files (for faster retrieval); or HTML form in a single file (for easier printing).
[CR: 19971227]
Sperberg-McQueen, Michael. "Closing Keynote." Page 19 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [C. M. Sperberg-McQueen]: University of Illinois at Chicago; also Editor in Chief, Text Encoding Initiative, and Co-editor of the XML specification (with Tim Bray); Email: U35395@UICVM.UIC.EDU; WWW: http://www.uic.edu/~cmsmcq/.
Summary: "The major themes of the conference will be recapitulated with observations on the state of the SGML/XML world. Observations on important or telling events at the conference will be interspersed with opinions on their significance." [watch this space for a link to a published summary]
This presentation was delivered as the Closing Keynote Address at the SGML/XML '97 Conference.
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).
Sperberg-McQueen, C. Michael. "Specifying Document Structure: Differences in LaTeX and TEI Markup." TUGboat [Proceedings of the 1991 Annual Meeting] 12/3 (December 1991) 415-421.
The article is available in a related version as a TEI document (TEI EDW22, June 9, 1991).
Sperberg-McQueen, C. Michael. "The Standard Generalized Markup Language (SGML): A Brief Introduction." Proceedings of the American Society for Information Science = Proceedings of the ASIS annual meeting [56th ASIS Annual Meeting Proceedings of the 56th Annual Meeting of the American Society for Information Science October 24-28, 1993 Columbus, OH] 30 (1993) 285. ISSN: 0044-7870.
Sperberg-McQueen, C. Michael. "The Text Encoding Initiative: Electronic Text Markup for Research." Pages 35-56 in Literary Texts in an Electronic Age: Scholarly Implications and Library Services. A Collection of the Papers Presented at the 1994 Clinic on Library Applications of Data Processing at the Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Clinic on Library Applications of Data Processing, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, April 10-12, 1994. Edited by Brett Sutton. University of Illinois, Urbana-Champaign: The Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, 1994. ISBN: 0-87845096-3. ISSN: 0069-4789.
"Abstract: This paper describes the goals and work of the Text Encoding Initiative (TEI), an international cooperative project to develop and disseminate guidelines for the encoding and interchange of electronic text for research purposes. It begins by outlining some basic problems that arise in the attempt to represent textual material in computers and some problems that arise in the attempt to encourage the sharing and reuse of electronic textual resources. These problems provide the necessary background for a brief review of the origins and organization of the Text Encoding Initiative itself. Next, the paper describes the rationale for the decision of the TEI to use the Standard Generalized Markup Language (SGML) as the basis for its work. Finally, the work accomplished by the TEI is described in general terms, and some attempt is made to clarify what the project has and has not accomplished."
Another abstract for the article is available from ETEXTCTR Review #2 (Jerry Caswell).
Sperberg-McQueen, C. Michael. "Text in the Electronic Age: Textual Study and Text Encoding, with Examples from Medieval Texts." Literary and Linguistic Computing 6/1 (1991) 34-46. ISSN: 0268-1145.
Abstract: This paper discusses characteristic problems in designing methods of encoding texts in machine-readable form for textual study. Any electronic representation of a text embodies specific ideas of what is important in that text. A well-developed encoding scheme is thus in some sense a theory of the texts it is intended to mark up. This paper describes, with examples, the theory implicit in the Text Encoding Initiative (TEI), a project to develop guidelines for the encoding of machine-readable texts. Any machine-readable representation of texts must use markup, but no finite vocabulary of markup items can be complete, since neither the set of textual features worth marking nor the set of texts to be studied is finite. Any useful markup scheme must therefore be extensible. Additionally, a markup scheme must allow several discrete views of texts. Texts are both linguistic and physical objects. They have simultaneously a linear, a hierarchical and a directed-graph structure. They refer to objects in real or fictive universes. Texts, finally, are cultural and thus historical objects: a useful encoding scheme must be able to represent textual variation, parallel texts, and the gradual accretion of interpretation and commentary with which human culture adorns venerated texts.
[CR: 19960330]
Sperberg-McQueen, C. Michael. Textual Criticism and the Text Encoding Initiative. Presentation at MLA '94, San Diego, Session sponsored by Emerging Technologies Committee of MLA. Chicago, IL: Computer Center, University of Illinois at Chicago, December 1994. Extent: approximately 22 pages, 70K HTML file. Author's affiliation: [University of Illinois at Chicago, and TEI Editor].
"In this paper I want to discuss some of the more obvious issues raised by efforts to create electronic texts, and in particular electronic versions of scholarly editions. [Walter] Benjamin's essay ['Das Kunstwerk im Zeitalter seiner technischen Reproduzierbarkeit'] is particularly suggestive here, in the context of efforts to make literary (and non-literary) texts reproducible by new technological methods. I begin by making explicit some of my assumptions about the goals and requirements of electronic scholarly editions; in the second section I explain why my list of requirements says nothing about the choice of software for the preparation and use of scholarly editions. The third section will describe the work and results of the Text Encoding Initiative, a cooperative international project to develop and disseminate guidelines for the creation and interchange of electronic texts, and show how they relate to the requirements for electronic scholarly editions. In the concluding section, I will outline some of the implications of the TEI for electronic and printed scholarly editions, and some essential requirements for any future consensus on how to go about creating useful electronic scholarly editions." [from the document Introduction]
The document is available via the Internet in HTML and (TEI) SGML format. URLs: "Textual Criticism and the Text Encoding Initiative" [HTML]; SGML version; [mirror copy, HTML]. See also the host page, "Miscellaneous Talks and Papers," http://www.uic.edu:80/orgs/tei/misc/.
Sperberg-McQueen, C. Michael. Trip report: CETH Summer Seminar 1995. Posting to TEI-L, Text Encoding Initiative public discussion list. 10:04:52 CDT, Tue, 27 June, 1995. Author Affiliation: ACH/ACL/ALLC Text Encoding Initiative.
"The Center for Electronic Texts in the Humanities at Princeton and Rutgers Universities held its fourth summer seminar earlier this month under the title ELECTRONIC TEXTS IN THE HUMANITIES: METHODS AND TOOLS. . ."
See the text of the report in this database or in the TEI-L archives. See also the link to the seminar description.
Sperberg-McQueen, C. Michael. Trip Report, Coalition for Networked Information. Task Force Meeting, Washington, D.C. 10-11 April 1995, CNI/AAUP Joint Initiative Workshop, 11-12 April 1995. Posting submitted to TEI-L Mailing List [TEI-L@UICVM.BITNET], 21-April-1995. April, 1995. approximately 10 pages.
Several presentations in the sessions summarized uses of SGML within the academic/libraries communities. Included are reports on the DLI (Digital Libraries Initiative) Project and the Model Editions Partnership (using TEI-SGML). TEI-SGML and the TEI Header were featured in some of the talks. An online copy of the report is available from this WWW server as well as from the TEI-L archives.
Sperberg-McQueen, C. Michael. Trip Report: MLA '94, San Diego. Posting submitted to TEI-L Mailing List [TEI-L@UICVM.BITNET], 3-January-1995. December, 1994. approximately 8 pages.
The report treats several (TEI-)SGML matters, including mention of vendors marketing SGML-aware software. Topics: Chadwyck-Healey (English Poetry); Piers Plowman SGML edition; DOE Corpus; Canterbury Tales Project; TEI Guidelines. A copy of the report is available on this WWW server, as well as in the TEI-L archives at UICVM.
Sperberg-McQueen, C. Michael. Trip Report: Society for Technical Scholarship, New York City, 6-8 April 1995. Posting to TEI-L (TEI-L@UICVM.BITNET), "Subject: Trip Report: Society for Technical Scholarship." April 17, 1995.
An online copy of the report is available from the TEI-L archives and from this WWW server.
[CR: 19971018]
Sperberg-McQueen, C. M.; Bray, Tim. "Extensible Markup Language (XML)." Pages 160 - 163 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Authors' affiliation: [Sperberg-McQueen]: University of Illinois at Chicago, Email: u35395@uicvm.uic.edu; [Bray]: Textuality, Email: tbray@textuality.com.
[Extract:] "Extensible Markup Language (XML for short) is being designed under the auspices of the World-Wide-Web Consortium (W3C); the larger goal of this effort is 'to enable future Web user agents to receive and process generic SGML in the way that they are now able to receive and process HTML. As in the case of HTML, the implementation of SGML on the Web will require attention not just to structure and content (the domain of SGML per se) but also to link semantics and display semantics.' (See http://www.w3.org/pub/WWW/MarkUp/SGML/Activity for the W3C's description of this activity.) As a subgoal, we are creating an SGML application profile, XML, that is designed to provide many of the benefits of SGML in a lightweight, easy-to-use, easy-to-implement dialect that omits many of the difficult or problematic features of the full standard. This paper is a report on the XML specification; if time allows, some information will also be provided on the progress of the work toward a typology of links and link behaviors. At the time this abstract is prepared, the XML specification has been made public, but is still officially a working draft."
Abstract available online in HTML format: "Extensible Markup Language (XML)", by C. M. Sperberg-McQueen and Tim Bray. Presentation at ACH/ALLC '97. [archive copy]. Further information on the Extensible Markup Language is available in the main XML page.
Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.
[CR: 19950823]
Sperberg-McQueen, C. M.; Burnard, Lou. "The Design of the TEI Encoding Scheme." The Text Encoding Initiative: Background and Contents, Guest Editors Nancy Ide and Jean Véronis = Computers and the Humanities 29/1 (1995) 17-39.
Abstract: "This paper discusses the basic design of the encoding scheme described by the Text Encoding Initiative's Guidelines for Electronic Text Encoding and Interchange (TEI document number TEI P3, hereafter simply P3 or the Guidelines). It first reviews the basic design goals of the TEI project and their development during the course of the project. Next, it outlines some basic notions relevant for the design of any markup language and uses those notions to describe the basic structure of the TEI encoding scheme. It also describes briefly the 'core' tag set defined in chapter 6 of P3, and the 'default text structure' defined in chapter 7 of that work. The final section of the paper attempts an evaluation of P3 in the light of its original design goals, and outlines areas in which further work is still needed."
Sperberg-McQueen, C. M.; Burnard, Lou. "The ODD System of Tag Set Documentation." Pages 221-222 [partial abstract] in Colloque International "Consensus ex Machina?". Abstracts International Joint Conference of the ALLC (Association for Linguistic and Literary Computing) and ACH (Association for Computers and the Humanities), Sorbonne, Paris, 19-23 avril 1994. Paris: Laboratoire "Lexicométrie et textes politiques" (INaLF, CNRS), and Ecole Normale Supérieure de Fontenay - Saint Cloud, 1994. 244 pages. Authors' Affiliation: [Sperberg-McQueen] University of Illinois at Chicago; [Burnard] Oxford University Computing Services.
The paper describes a system for the documentation of 'document type definitions' (DTDs) used in SGML. "ODD" stands for "One Document Does it all". The system was developed through research of the Text Encoding Initiative (TEI) and British National Corpus projects. A single "ODD" file is used to generate DTD files, reference documentation for all defined elements and entities, and full documentation of the tag set in running prose. [adapted from the abstract].
[CR: 19960206]
Sperberg-McQueen, C. Michael; Goldstein, R. F. "HTML to the Max: A Manifesto for Adding SGML Intelligence to the World-Wide Web." Computer Networks and ISDN Systems 28/1-2 (December 1995) 3-11 (with 4 references). ISSN: . Authors' affiliation: Computing Center, Illinois University, Chicago, IL, USA.
"Abstract: HTML demonstrates that SGML markup is useful for networked information. How can it be made even more useful? One way is to extend the tag set from HTML to HTML2, etc. We argue for a more radical approach: full SGML awareness in WWW. We believe the difficulties are small, the cost affordable, and the advantages overwhelming. SGML is a metalanguage for defining markup languages; HTML is just one instance of this infinite family. At present, documents in other SGML document types must be translated into HTML for display by a Mosaic client -- sometimes this imposes unacceptable information loss. World Wide Web (WWW) browsers could handle other SGML document types without translation by launching a general-purpose SGML browser to view them, as they now launch graphics viewers; a better solution overall would be to build SGML display into the WWW browsers themselves. Either way, display of an SGML document would be controlled by a style sheet using a small number of display primitives ("bold", "line break", etc.) to specify the rendition of each element type. For "well-known" document type definitions (DTDs) like HTML, style sheets could be distributed with the browser, or built in. For other DTDs, the browser would fetch a style sheet from the server. Using style sheets, browser software can also make it easy to customize document display. DTDs and style sheets can be designed to accommodate extensions, ensuring that authors can make small extensions to the tag set with no change whatsoever in the target browsers and virtually no performance penalty."
The paper is based upon a presentation delivered at the Second International World-Wide Web Conference: Mosaic and the Web, Chicago, IL, USA, 17-20 Oct. 1994.
Sperberg-McQueen, C. Michael; Goldstein, Robert F. "HTML to the Max: A Manifesto for Adding SGML Intelligence to the World-Wide Web." Presentation at WWW-2 '94. Chicago, IL. September 15, 1994. Authors' addresses: Michael Sperberg-McQueen: cmsmcq@uic.edu; Robert Goldstein: bobg@uic.edu.
"Abstract: HTML demonstrates that SGML markup is useful for networked information. How can it be made even more useful? One way is to extend the tag set from HTML to HTML2, etc. We argue here for a more radical approach: full SGML awareness in WWW. We believe the difficulties are small, the cost affordable, and the advantages overwhelming.
"SGML is a metalanguage for defining markup languages; HTML is just one instance of this infinite family. At present, documents in other SGML document types must be translated into HTML for display by a Mosaic client --- sometimes this imposes unacceptable information loss.
"WWW browsers could handle other SGML document types without translation by launching a general-purpose SGML browser to view them, as they now launch graphics viewers; a better solution overall would be to build SGML display into the WWW browsers themselves. Either way, display of an SGML document would be controlled by a style sheet using a small number of display primitives ('bold', 'line break', etc.) to specify the rendition of each element type. For 'well-known' document type definitions (DTDs) like HTML, style sheets could be distributed with the browser, or built in. For other DTDs, the browser would fetch a style sheet from the server. Using style sheets, browser software can also make it easy to customize document display.
"DTDs and style sheets can be designed to accommodate extensions, ensuring that authors can make small extensions to the tag set with no change whatsoever in the target browsers and virtually no performance penalty."
Link to the authoritative version of the document at UIC, or in the online conference electronic proceedings, or see a mirrored copy here.
[CR: 19990519]
Sperberg-McQueen, C. Michael; Usdin, B. Tommie. "Welcome to Markup Languages: Theory & Practice." Markup Languages: Theory & Practice 1/1 (Winter 1999) 1-6. ISSN: 1099-6622 [MIT Press]. Authors' affiliation: [Sperberg-McQueen:] Senior Research Programmer, University of Illinois at Chicago; Email: cmsmcq@uic.edu; [Usdin:] President, Mulberry Technologies Inc.; Email: btusdin@mulberrytech.com; WWW: http://www.mulberrytech.com.
Abstract: "In this introductory 'Commentary and Opinion' essay, the editors of the journal describe why they and publisher decided to start the journal, and what they hope to accomplish."
'Markup Languages: Theory & Practice is a peer-reviewed technical journal publishing papers on research, development, and practical applications of text markup for computer processing, management, manipulation, and/or display. The scope of the journal includes: 1) design and refinement of systems for text markup and document processing; 2) specific text markup languages; 3) theory of markup design and use; 4) applications of text markup; 5) languages for the manipulation of marked up text.'
"The scope of the journal is wide enough to include current and future markup applications but is designed to limit the subject scope sufficiently to make the journal coherent. As may be seen, the journal is not limited to SGML and XML and their applications, though we believe them to be markup languages of considerable interest. SGML was not the first, and XML is unlikely to be the last, language of their kind; we hope this journal will prove a useful forum for discussions of design and implementation issues relating to markup languages present, past, and future. We hope Markup Languages: Theory & Practice will be equally hospitable to articles on theory and articles on practice. In the field of markup languages, theoretical questions may have immediate and obvious practical implications, and practical problems often raise profound and important theoretical issues. The best theorists continually learn from practical experience; the best implementers realize that there is 'nothing so practical as a good theory'."
"Markup Languages: Theory & Practice will include material of a variety of categories, including: 1) articles: especially on theoretical and practical aspects of markup or markup usage; 2) announcements: describing events or activities, especially future events likely to be of interest to our readers; 3) commentary and opinion: essays, such as this one, consisting primarily of the authors' opinions; 4) practice notes: discussions of common practice, suggestions for improved standard practice, or comparisons of methods for achieving similar goals; 5) project reports: descriptions of a project or application; 6) reviews: discussion and description of books, software, web sites, etc. that may take the form of essays, short narrative reviews, or annotated tables of contents; 7) squibs: short (from one to a few pages) statements of fact, descriptions of problems, or anecdotes; 8) standards reports: discussions of any of the ever growing set of standards relating to markup."
For other articles in this issue of MLTP, see the annotated Table of Contents.
[CR: 19950716]
Spivak, Jeffrey. The SGML Primer, First Edition. Boyd & Fraser, [forthcoming,] 1996. ISBN: 0-7895-0194-5. Author's affiliation: Datalogics, Inc.
Abstract: "An introduction to the SGML standard for document structure definition, this primer guides the user through new terminology and concepts via description and example. Until now, few texts provided information on SGML in an accessible way. Students can and will embrace this beginner's guide to SGML, which explains the difficult concepts behind this popular standard in a basic, easy-to-grasp fashion.
- Easy to comprehend terminology; important terms concisely defined for beginner
- Real-world examples show SGML in use in businesses
- User-friendly SGML 'shortcuts' and their use makes coding less intensive and easier for any student
- 'Good' vs. 'Bad' SGML discussion allows users to avoid common mistakes
- Appendix lists SGML definitions as listed in the ISO standard" [extracted from the publisher's database]
See a fuller description of the book on the Thomson WWW searchable online catalog.
[CR: 19980425]
St. Laurent, Simon. XML: A Primer. Foster City, CA: MIS Press/IDG Books, [February] 1998. Extent: xx + 348 pages. ISBN: 1-5582-8592-X. Author's affiliation: Systems Integration and Support Services Inc., Greensboro, NC.
From the book's back cover: "XML, an important new technology being developed by the World Wide Web Consortium, promises to replace HTML with a stronger, more extensible architecture. A derivative of SGML, XML will give Web designers the power of SGML scripting without the complexity. Developers will be able to manage information with increased power and flexibility not before possible with HTML. This essential guide will show Web developers how to take advantage of this powerful new technology quickly and painlessly. Techniques for integrating XML with new Web technologies such as Dynamic HTML and Cascading Style Sheets are discussed. Readers will learn to create search tools, Document Type Definitions (DTDs), customized tags, and commercial Web solutions. The accompanying Web site (http://www.mispress.com/xml/) includes the latest updates and information to the world of XML, keeping serious developers abreast of evolving technology." See the volume information available from the publisher. Or: the Amazon.Com description, [local archive copy]. Also: "[Book Review of] XML: A Primer." By Dianne Kennedy. In XML Files: The XML Magazine Issue 7 (August 27, 1998).
As of May 1999, the Web site for the book was: http://www.simonstl.com/xmlprim/index.html. The book is also available in a Korean translation (ISBN 898160019-8) from Powerbook Publishing. As of April 1998, the author had set up an update page for XML: A Primer at http://www.simonstl.com/xmlprim/xmlupdate/; it includes, for example, an errata list and an updated section covering the xml:lang and xml:space attributes. See also the author's essay on XML and Filesystems, which supplements some of the information in Chapters 11 and 12 of the book.
Note: Simon St. Laurent is also author of Dynamic HTML: A Primer.
[CR: 19990712]
St. Laurent, Simon; Biggar, Robert. Inside XML DTDs: Scientific and Technical. New York, NY: McGraw-Hill, 1999. Extent: xii + 468 pages, CDROM. ISBN: 0-07-134621-X. Authors' affiliation: [St. Laurent:] Writer and technical reviewer of computer books for IDG Books and McGraw-Hill publishing companies. WWW: http://www.simonstl.com/; [Biggar:] Professional programmer, PhD in physics.
"Although HTML got its start as a tool for distributing scientific papers, scientists, mathematicians, and other members of that original target audience have received fairly little from HTML's more recent development. The Extensible Markup Language (XML) and a number of key supporting standards promise to improve this situation by giving scientists and technologists an even more powerful set of tools, however. XML allows the creation and standardization of domain-specific vocabularies (described in Document Type Definitions, or DTDs), making it easy to develop precisely-defined shared standards for exchanging information. Inside XML DTDs: Scientific and Technical provides a guide to XML with a sharp focus on scientific and technical applications of this new technology. In addition to XML itself, MathML, a core W3C standard that can be used in many fields, receives extended coverage. The second half of Inside XML DTDs: Scientific and Technical explores emerging XML standards and tools in a number of fields, including biology, chemistry, astronomy, library science, and meteorology. The conclusion explains what developers will need to do in order to create their own applications of XML, and provides a guide to integrating XML with current information architectures and practices."
[July 1999] Simon St. Laurent posted an announcement concerning the recent publication of Inside XML DTDs: Scientific and Technical.
See: http://www.simonstl.com/scitech/index.html.
[CR: 19990603]
St. Laurent, Simon; Cerami, Ethan. Building XML Applications. New York, NY: McGraw-Hill, [May] 1999. Extent: 512 pages, 150 illustrations. ISBN: 0-07-134116-1. Authors' affiliation: [St. Laurent:] Writer and technical reviewer of computer books for IDG Books and McGraw-Hill publishing companies. WWW: http://www.simonstl.com/; Email: simonstl@simonstl.com; [Cerami:] New York University and Riptide Communications. WWW: http://cs.nyu.edu/ms_students/cera7013/index.html, Email: cerami@cs.nyu.edu.
"The book focuses on Java XML parsers, including Aelfred, SAX (Simple API for XML), and Microsoft MS-XML. Other topics include XML/database integration and dynamically generated XML via Java Servlets."
[Authors' description:] "XML promises to revolutionize the Web and the nature of distributed computing. XML holds enormous promise as the file format of choice for Web development, document interchange, and data interchange, and presents a new world of opportunities and challenges to programmers. What Java is doing for programming, XML may do for data. Combining the two, as is done throughout this book, makes it possible to build exciting (and useful!) applications and architectures. Building XML Applications provides developers with a solid introduction to XML and key programming tools for building robust, scalable XML applications in Java. After a thorough introduction to XML's place in the developer's toolkit and its syntax, Building XML Applications presents detailed coverage of parsers, a key tool for developers. Focusing on Java development, the sample applications use the Simple API for XML (SAX) to create parser-independent solutions that can fit in a wide variety of situations. Other XML tools, like style sheets, namespaces, linking, and the Document Object Model (DOM) are also explored, giving developers a friendly but approachable introduction to these revolutionary technologies." See the information page on St. Laurent's Web site; [local archive copy].
[July 26, 1999] [Simon says:] 'Minor updates to Building XML Applications.' "I've posted a new version of the prefs.java file from Chapter 20 of Building XML Applications that works with Technology Release 2 of Sun's ProjectX XML parsers. (The version in the book uses Early-Access 1.) This is a very simple class for managing preference files built with XML using the DOM. The constructor has changed slightly to accommodate changed methods for loading XML documents. Otherwise, it isn't a dramatic shift. Also, I've added pointers to some work I've done based on the examples in Chapter 19 that led to my XLinkFilter work. When a new draft of XLink appears, I'll be updating XLinkFilter and those examples yet again. These materials are available at: http://www.simonstl.com/buildxml/index.html#update"
[CR: 1995]
Stabler, Hugh R. Experiences with High-Volume, High-Accuracy Document Capture. Rank Xerox Technical Report. Mitcheldean, United Kingdom: Rank Xerox, 1995. Extent: approximately 10 pages. Author's affiliation: Rank Xerox, Document Technology Centre, Mitcheldean, United Kingdom; Email: Hugh@dtc.rankxerox.co.uk.
Abstract: "Rank Xerox have implemented an in-house high-volume data capture operation enabling 100% accurate capture of patent documents as SGML-encoded text plus embedded images. We describe our experiences with setting up and running this operation over the last 4 years."
The document is available online in HTML format: http://www.dtc.rankxerox.co.uk/Hrs_pape.html; [mirror copy]. The paper was presented earlier as part of the International Association for Pattern Recognition Workshop on "Document Analysis Systems" in October 1994, in Kaiserslautern, Germany. For other information on the conversion of EPO documents into SGML format, see: Paul Brewin, "SGML and Patent Document Processing. WIPO standard ST.32."
[CR: 19971125]
Stadler, Thomas. "Publishers Wanted, Authors Needed! The New Information Age is Waiting for Your Works." Page(s) 115-118 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: STEP Stürtz Electronic Publishing GmbH, Germany; Email: ths@step.de.
Abstract: "The new paradigm of information objects has recently emerged that replaces the old one of documents. The new view on information concentrates on smaller bits of information which may be connected in different contexts and that are linked and webbed together under multiple perspectives."
"This paper focuses on the techniques and applications that are available already to produce and maintain information webs. We discuss the fact that many authors and publishers are writing books as they have been doing for the last 500 years. Partly it seems to us to be the publishers and authors turn now to redefine their methods, their products and their markets. What are the new opportunities, what abilities and skills are needed, and what are the problems in the shift to a new way of writing and publishing?"
Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.
Stenerson, Jon. "A LaTeX Style File Generator and Editor." TUGboat: The Communication of the TeX Users Group [Proceedings of the 1994 Annual Meeting] 15/3 (September 1994) 247-254. 7 references. Author's affiliation: TCI Software Research, Las Cruces, New Mexico; Email: Jon_Stenerson@tcisoft.com.
"This article presents a program that facilitates the creation of customized LaTeX style files. The user provides a style specification and the style editor writes all the macros. Editing takes place in a graphical user interface composed of windows, menus, and dialog boxes. While the editor may be used in any LaTeX environment, it is intended primarily for use with TCI Software Research's word processor Scientific Word. The current style editor runs under any Windows 3.1 system. The performance is acceptable on a 386-based machine and naturally improves on 486's and Pentiums. As Scientific Word is ported to other systems so will the style editor be ported."
[CR: 19971205]
Sterken, James. "<Q> & <A>: James Sterken." <TAG> 10/11 (November 1997) 7-8. ISSN: 1067-9197. Author's affiliation: President, ArborText.
The article provides the text of an interview with James Sterken, co-founder and current President of ArborText. ArborText was created in 1982. Sterken sketches the historical interests and activities of the company, its current endeavors, and its plans to support XML.
[CR: 19950716]
Stern, D. "SGML Documents: A Better System for Communicating Knowledge." Special Libraries 86/2 (Spring 1995) 117-124 (with 4 references). Author's affiliation: Science Library & Information Services, Yale University, New Haven, CT, USA.
"Abstract: The use of SGML (Standard Generalized Markup Language) based documents and databases can provide enhanced access and display capabilities when compared to the files and indexes now available through most local or remote databases. These options are increased tremendously due to the structured nature of the SGML files. This article outlines some of the basic features of SGML and discusses their implications when compared to the utilities of other document and database types. It also identifies the areas needing further development in order to allow these SGML knowledge information systems to improve researchers' searching, display and manipulation of electronically stored data. Particular emphasis is placed upon possible enhancements to the currently limited print display imitation of most current electronic journals."
See a related article by the same author "Expert Systems: HTML, the WWW, and the librarian," Computers in Libraries 15/4 (April 1995) 56-58.
[CR: 19951229 MD: 19980606]
Stinchfield, Don. Using Catalogs and MIME to Exchange SGML Documents. MIMESGML Working Group, INTERNET-DRAFT. Providence, RI: EBT and MIMESGML Working Group, IETF, December 1, 1995. Author's affiliation: EBT, Inc. [Electronic Book Technologies, Inc.; One Richmond Square; Providence, RI 02906; (401) 421-9550 x280; Email: des@ebt.com].
"This draft proposes a standard for exchanging SGML documents over the World Wide Web using catalogs and MIME. This draft extends SGML Open's definition of catalogs [10] by adding to it new keywords and storage object identifier (SOI) types. The new keywords identify SGML document objects (such as document type declarations and document entities), non-SGML document objects (such as stylesheets), and management information (such as base URL, character encoding, and character repertoire). The new SOI types include URIs and MIME Content-IDs. This document also describes a new MIME content type called Application/SGML-Catalog which identifies a MIME body part as a catalog."
Available online: The latest [December 1995] working copy can be fetched in text format: ftp://ftp.ebt.com/pub/nv/mimesgml/catalog2.txt [mirror copy, December 1995], or in Postscript format: [mirror copy, December 1995]. Don Stinchfield says "...look to have a new version in mid-january [1996]."
Older version(s): ftp://ds.internic.net/internet-drafts/draft-ietf-mimesgml-exch-00.txt [or mirror copy]. Also in Postscript format: ftp://ds.internic.net/internet-drafts/draft-ietf-mimesgml-exch-00.ps [mirror copy].
See now: XML Media/MIME Types.
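The packaging idea in the Stinchfield draft above (a catalog travelling as its own MIME body part, with entries that point at sibling parts by Content-ID) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the draft's syntax: the draft's new catalog keywords are not reproduced here, the catalog uses only the existing SGML Open `PUBLIC` entry with a `cid:` URL as its system identifier, and the public identifier, the `example.invalid` Content-IDs, and the function name `build_sgml_package` are all hypothetical.

```python
from email.message import EmailMessage

# Hypothetical sample objects; none of these identifiers come from the draft.
DTD = b"<!ELEMENT memo (#PCDATA)>\n"
DOC = b'<!DOCTYPE memo PUBLIC "-//Example//DTD Memo//EN">\n<memo>Hello</memo>\n'
# Assumption: a plain SGML Open PUBLIC entry whose system identifier is a cid: URL.
CATALOG = b'PUBLIC "-//Example//DTD Memo//EN" "cid:memo-dtd@example.invalid"\n'


def build_sgml_package() -> EmailMessage:
    """Bundle a catalog, a DTD, and a document instance as multipart/related."""
    msg = EmailMessage()
    msg["Subject"] = "SGML document with catalog"
    # The catalog part carries the content type the draft proposes.
    msg.add_related(CATALOG, maintype="application", subtype="sgml-catalog",
                    cid="<catalog@example.invalid>")
    # The catalog entry above refers to this part by its Content-ID.
    msg.add_related(DTD, maintype="application", subtype="sgml",
                    cid="<memo-dtd@example.invalid>")
    msg.add_related(DOC, maintype="application", subtype="sgml",
                    cid="<memo-doc@example.invalid>")
    return msg


if __name__ == "__main__":
    package = build_sgml_package()
    for part in package.iter_parts():
        print(part.get_content_type(), part["Content-ID"])
```

A receiver would read the catalog part first and then resolve each `cid:` reference against the Content-ID headers of the other parts, so no part needs to be fetched from outside the message.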
[CR: 19950828]
Strehlow, Richard A.; Tallant, Thomas O.; Mason, James D.; Kienlen, Philip L.; Barry, Karen T. "Use of SGML for Retrieval of Chemical Data." Pages 138-145 (with 8 references) in Proceedings of the Symposium on Computerized Chemical Data Standards: Databases, Data Interchange, and Information Systems. Symposium on Computerized Chemical Data Standards: Databases, Data Interchange, and Information Systems, Atlanta, GA, USA. ASTM, Committee E-49 on Computerization of Material and Chemical Property Data. Edited by R. Lysakowski and C. E. Gragg. ASTM [American Society for Testing and Materials] Special Technical Publication 1214. Philadelphia, PA: American Society for Testing and Materials, October 1994. ISBN: 0803118767. ISSN: 0066-0558. Authors' affiliation: [?] TERMCO, Inc, Knoxville, TN, USA.
"The encoding of information within a document using Standard Generalized Markup Language (SGML) permits a novel approach to direct retrieval of data from documents. Although SGML is designed primarily for electronic interchange of texts, its features have been found to be useful in the management of data contained within a document. Encoding can include scientific and technical information, as well as associated and ancillary data, management data, and other metadata. This paper describes and gives examples of the use of the technique with special reference to chemical data. Examples of tags used in documents are shown. Retrieval of contained information is conventionally done by means of searches to retrieve a set of documents that have a probability of containing the desired information. The method described here uses a radically different approach to the information retrieval problem."
[CR: 19971227]
Streich, Robert. "Documents Are Software. A Focus on Reuse." Pages 391-400 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Robert Streich]: Researcher and Project Engineer, Schlumberger Austin Research, 8311 N. FM 620, Austin, TX USA; Email: streich@slb.com.
Abstract: "There are many advantages to breaking up complete documents into small, relatively discrete chunks or 'text modules': multiple authors can more easily work on the same document, the text modules could be served up individually as part of an on-line help or performance support system, and the modules can be reused in other documents. But how can we reuse modules between different documents with some assurances that they fit the new context? How will we track the dependencies between modules? In short, how will we address the increased complexity of managing a library of text modules? In the spirit of reuse, this paper explores two fields of research in the software engineering community that might be able to provide some answers to these questions: module interconnection languages and faceted classification."
This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).
[CR: 19970331]
Stribling, Dee; Hunter, Tim; Olszewski, Len; Corrigan, Anne; Mullis, Randy; Allen, Lloyd. "A Real World Conversion to SGML." Pages 75-86 in Conference Proceedings, SIGDOC '96. The 14th Annual International Conference on Computer Documentation. ["Marshalling New Technological Forces: Building a Corporate, Academic, and User-Oriented Triangle"]. SIGDOC '96: 14th Annual International Conference. Research Triangle Park, North Carolina, US. October 20-23, 1996. Sponsored by the Association for Computing Machinery Special Interest Group on Documentation (SIGDOC). New York, NY: Association for Computing Machinery, 1996. ISBN: 0-89791-799-5. Authors' affiliation: Publications Division, SAS Institute Inc., SAS Campus Drive, Cary, NC, 27513-2414 USA; Email: sasdes@unx.sas.com.
Abstract: "In 1994, our Publications Division at the SAS Institute began converting our in-house publishing system. The conversion involved evaluating, selecting and implementing a new publishing system that would take advantage of the SGML paradigm for content markup. Components of the system include an SGML-based editor, routines for one-time conversions of legacy text to SGML, filters for dynamic conversions of SGML text and of graphics to various output formats, a document management system, and customizations that tailor third-party components to fit our environment. Along with new tools, we had to implement the new processes which we designed as we analyzed our documents and workflow for the new system. This paper explores our experiences from the time we began deciding to implement a new publishing system to now, when we have successfully implemented a significant portion of the new SGML-based system with working tools and prototyped processes."
Several other articles in this proceedings volume are germane to SGML: Tom Banfalvi, et al., "Manufacturing Documentation in the Virtual Warehouse"; Betsy Brown, et al., "From Hardcopy to Online: Changes to the Editor's Role and Processes"; Paul Beam and Peter Goldsworthy, "Technical Writing on the Web-Distributed SGML-Based Learning"; Stephanie Copp, "Working with Academe"; Cindy Roposh, et al., "Developing Single-Source Documentation for Multiple Formats"; Paul Prescod, "Multiple Media Publishing in SGML"; Lin-Ju Yeh, et al., "SSQL: a Semi-Structured Query Language for SGML Document Retrievals".
[CR: 19970518]
Sullivan, Eamonn. "Designing Web Sites for Non-Human Audiences." PC Week 14/17 (April 28, 1997) 38-.
Abstract: "Web pages can be used not only as a direct end-user interface but to link one application with another. Future Web sites will be browsed by intelligent software agents, which provide automatic information retrieval, as much or more as by human beings. Such electronic conduits are sensible when there is a lot of information to retrieve or it changes frequently because Web pages can be generated on the fly and impose few compatibility issues. The inherent limitations of HTML, which can only represent certain types of data, are problematic, and overcoming the fact that HTML focuses almost exclusively on visual information is the focus of numerous development efforts. There are already several products that bring sophisticated parsing engines to the Web and can find and automatically recognize data in fast-changing pages. The upcoming Extensible Markup Language (XML) standard lets content providers make their intentions far more explicit."
[CR: 19970828]
Sullivan, Eamonn. "Developing a Card Catalog for the Expansive Web [Intranet Builder. Intersights]." PC Week 14/36 (August 25 1997) 34. ISSN: 0740-1604. Author's affiliation: [PC Week Staff].
"The emergence of XML in a more or less solid form earlier this year has provided a more comprehensive framework for metadata, prompting several organizations to propose solutions based on XML. The main proposals have been XML-Data from Microsoft (which is available at www.microsoft.com/standards/xml/xmldata.htm) and MCF (Meta Content Format) from Netscape (available at www.w3.org/TR/NOTE-MCF-XML/). Both proposals provide for a sophisticated method to describe the structure of information, such as properties about authorship and relationships between objects. This week [August 25, 1997], a working group under the auspices of the W3C organization will meet in Redmond, Wash., to begin hammering out a specification that will take the best parts of XML-Data, MCF and PICS. The resulting RDF [Resource Description Framework] specification, if used widely, will enable more efficient searches and exchanges of information between organizations." [Extract]
See more on the Resource Description Framework in the dedicated section. The article is available online: http://www.zdnet.com/pcweek/opinion/0825/25isigh.html; [archive copy].
[CR: 19970518]
Sullivan, Eamonn. "XML Will Take the Web to the Next Level. Labs Explore Enabling Technologies of Next-generation Markup Language." PC Week 14/17 (April 28, 1997) (pages: ).
Summary: "Many companies have jumped wholeheartedly into the Web, only to find that deploying a large Web site is as complex as developing a large application--and that HTML is not up to the task. It's akin to trying to develop an operating system in BASIC. The Extensible Markup Language, or XML, is the World Wide Web Consortium's answer to the limitations of HTML. It is an extremely flexible language that will enable organizations to deploy more sophisticated documents and exchange complex data via the Web. The XML specification was released at the Sixth International World Wide Web Conference in Santa Clara, Calif., earlier this month (see the story). Several software vendors, including Microsoft Corporation and Netscape Communications Corporation, have already endorsed it."
The article is available online in HTML format from ZDNET: see http://www8.zdnet.com/pcweek/reviews/0428/28xml.html; archive copy, text only.
[CR: 19960826]
Sullow, Klaus. "[AMPHORE-a movie documentation workbench] (Article in German)." Nachrichten für Dokumentation 47/2 (March-April 1996) 67-74 (with 13 references).
"Abstract: AMPHORE is a client server system for the documentation of moving image material. The server mainly is formed by a full text database with SGML capabilities while the clients are PC workstations equipped with software for documentation and retrieval of movies and/or movie parts. In AMPHORE, the complete film material is provided in digital form and thus can be used for content-oriented documentation and retrieval in a convenient way. This enables the documentor to build very detailed indexes allowing access by sequence or even by shot. The film descriptions are based upon a syntactical, thesaurus-controlled indexing which reflects the films' diverse action strings and levels."
See: GMD - IPSI, Darmstadt, Germany (SGML & Digitales Video in der Medienarchivierung); Email: suellow@darmstadt.gmd.de. Compare: "Hypermedia Browsing and the Online-Publishing Process", Proceedings of DAGS 95, online.
Sutton, Brett (editor). Literary Texts in an Electronic Age: Scholarly Implications and Library Services. A Collection of the Papers Presented at the 1994 Clinic on Library Applications of Data Processing at the Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Clinic on Library Applications of Data Processing, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, April 10-12, 1994. University of Illinois, Urbana-Champaign: The Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, 1994. ISBN: 0-87845-096-3. ISSN: 0069-4789.
A number of articles in this collection address the use of SGML for information structuring within the library science and wider academic community. See, for example, papers by Susan Hockey, C. Michael Sperberg-McQueen, John Price-Wilkin, Mark Day, and Rebecca Guenther. Publisher's address: Publications Office, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, 501 E. Daniel Street, Champaign, IL 61820; FAX: 217.244.7329; Tel: 217.333.5218.
[CR: 19971227]
Svenberg, Stefan. "Intention-Based Input Specifications for Automated Document Generation." Pages 417-426 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Stefan Svenberg]: ABB Corporate Research, Department R, Västerås 722 22 Sweden; Email: stsv@crckl8.secrc.abb.se; Phone: +46 21 323247; FAX: +46 21 142190, 323090.
Abstract: "We explore a new structure of input specifications for document generators based on the micro-document approach. The structure is based on the intentional properties of texts. We focus on the writers' intentions and readers' need to be informed, besides the actual content of the document. The generator processes the specification, and decides on the appropriate actions needed to create a document in accordance to the plan. The intentional properties can be marked up using SGML. Some examples are provided."
"[Conclusion]: We believe that the main benefits of using the intentional approach for document structuring in generation, consist in giving an increased awareness of the underlying nature of documentation. In any authoring activity, these matters are very important. If you are careless you will not get the message across, and the documentation will not be used. We have also made a point about distinguishing generic information from product specific information. It allows for a generalization of the generation problem and better opportunities for re-use."
This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).
[CR: 19961226]
Swank, Renée. "Case Study: Maintaining and Developing a Dynamic SGML Environment at Ericsson." Pages 619-622 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Applications Engineer, Isogen International Corp., 2200 N. Lamar Suite 230, Dallas, Texas 75202.
"In 1991, Ericsson Inc. began implementing Standard Generalized Markup Language (SGML) in their Customer Documentation Department in Richardson, Texas. An SGML working environment for procedural documentation was created first. The second SGML working environment was developed internally for descriptive documents and was based on the first. A user's guide working environment was developed in 1994 which was different than anything done in the past. A system was also put in place for maintaining these SGML environments. Customer Documentation's SGML expertise has enabled it to be in the forefront for SGML implementation in other company groups and also to sell its services in SGML document production."
Document available online from the ISOGEN server: "Case Study: Maintaining and Developing a Dynamic SGML Environment at Ericsson", SGML '96 presentation by Renée Swank.
Note: The above presentation was part of the "SGML Case Studies" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
[CR: 19980205]
Swank, Renee; Pratt, Don. "Delivering Documentation to Customers in SGML: How It Works in the Telecommunications Industry." Pages [not abstracted] in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Authors' affiliation: [Swank]: ISOGEN International Corporation; [Pratt]: Bellcore.
Abstract: "Many companies are required to deliver documentation to customers electronically. As a significant step in solving Electronic Document Delivery (EDD) issues, the telecommunications industry has developed an interchange DTD and a packaging guideline that provide a common 'language' for expressing document content and logical structure. Documents created on any system may be translated to this 'language' by document producers, and from this 'language' to any display or production system by document recipients. Although the interchange DTD and packaging guideline were designed by telecommunications industry, they are general enough to be directly used or slightly modified to meet EDD requirements in other industries as well."
Other information on TIM (Telecommunications [or Technical] Interchange Markup) and TEDD (Telecommunications Electronic Document Delivery Package Guideline) is available in the main database entry: TCIF/IPI (Telecommunications Industry Forum Information Products Interchange).
This presentation was delivered as part of the "Introductory Tutorials" track in the SGML/XML '97 Conference. The extended description is available online: "Delivering Documentation to Customers in SGML: How It Works in the Telecommunications Industry." By Renee Swank (ISOGEN) and Don Pratt (Bellcore); [local archive copy].
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).
Swiss Federal Institutes of Technology [Lausanne and Zürich]. International Conference on Research and Trends in Document Preparation Systems. Abstracts of the Presented Papers. Conference on Research and Trends in Document Preparation Systems, Lausanne, Switzerland, February 27-28, 1981. Supported and organized by the [Swiss] Conseil des Ecoles Polytechniques Fédérales. J. D. Nicoud, Program Chair. Lausanne/Zürich: Swiss Federal Institutes of Technology, 1981. v + 130 pages.
This 1981 conference sponsored by EPFL, together with the ACM SIGPLAN/SIGOA Symposium on Text Manipulation, was one of the early influential conferences bringing together advocates of descriptive markup principles, creating a broader forum for discussion of the fundamental insights of formal markup languages. See, in this Lausanne Conference volume, important articles by Brian K. Reid and by Charles F. Goldfarb.
[CR: 19960202]
Szillat, Horst. "SGML and LaTeX." Baskerville [The Annals of the UK TeX Users' Group] 5/2 (March 1995). ISSN: 1354-5930. Author's affiliation: Email: szillat@berlin.snafu.de.
This issue of Baskerville makes available a number of papers presented at a joint meeting of the UK TeX Users' Group and BCS Electronic Publishing Specialist Group (January 19, 1995) [mirror copy]. See the link to Baskerville, or email: baskerville@tex.ac.uk. Issue 5/2 of Baskerville has other articles on SGML: "Portable Documents: Why use SGML?" (David Barron); "Formatting SGML Documents" (Jonathan Fine); "HTML & TeX: Making them sweat" (Peter Flynn); "The Inside Story of Life at Wiley with SGML, LaTeX and Acrobat" (Geeti Granger); "SGML and LaTeX" (Horst Szillat). See the special bibliography page for other articles on SGML and (La)TeX.
Szillat, Horst. SGML - Eine praktische Einführung. Bonn, Germany: International Thomson Publishing GmbH, 1995. 226 pages. ISBN: 3-929821-75-3. Author's address: szillat@berlin.snafu.de.
Abstract [supplied by the author] [English] This German SGML book gives an introduction to SGML. The material is discussed through examples. In the second part of the book the author explains his ideas of what formatting an SGML document means and shows that these ideas can be realized with LaTeX. [German] Dieses SGML-Buch gibt eine Einführung in SGML. Das Material wird an Hand von Beispielen diskutiert. Im zweiten Teil des Buches erklärt der Autor seine Idee, was Formatierung eines SGML-Dokumentes bedeutet und zeigt, daß diese Ideen mit LaTeX realisiert werden können.
Further description of the book is available on the following URL: Horst Szillat: Mein SGML-Buch. Email: szillat@berlin.snafu.de. Home Page: http://www.snafu.de/~szillat/.
[CR: 19980203]
[<TAG> Staff Writer]. "(SGML | XML!) at Slash '97. GCA Holds its Annual SGML Event." <TAG>: The SGML Newsletter 11/1 (January 1998) 4-7. ISSN: 1067-9197.
This article provides a summary of vendor news and other initiatives from the SGML/XML '97 Conference, "SGML is Alive, Growing, Evolving!" (December 7 - 12, 1997, The Washington Sheraton, Washington, D.C.). Brief product updates or news summaries are given for Adobe (XML in its FrameMaker product line); Microstar (XML support in Near & Far Designer 3.0); Poet Software's SGML/XML Repository; Microsoft and Xmlu.com (XML Xposed); Progressive Information Technologies (Target 2000); Enigma (Insight 4.0); AIS/Balise (new XML support in Balise; and the Balise HTML package); International Language Engineering (OpenTag version 1.0); OmniMark (Banff Internet application server tools).
[CR: 19961226]
Takahashi, Toru; Higashino, Jun'ichi; Hoshi, Yukio. "And Yet Another Approach for SGML Translation." Pages 381-388 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Authors' affiliation: [Takahashi]: Senior Researcher, Hitachi, Ltd., Information Systems R&D Division; Email: t-takaha@isrd.hitachi.co.jp; [Higashino:] Hitachi, Ltd., Information Systems R&D Division; [Hoshi]: Hitachi, Ltd., Software Development Center.
Abstract: "In realizing an SGML-based document processing system, it is required to transform the document structure and/or the data representation, from a source document written in SGML, to data in the format required by the application. In real-world, there is a problem that this transformation often becomes very complex. To solve this problem of complexity, we designed a programming language for SGML transformation (down translation) and implemented its processor. (This language is currently called "Æsop.")
The Æsop processor works on a parsed tree structure (ESIS structure), which is the output of an SGML parser. The processor automatically traverses the ESIS tree structure in depth-first order, selects and executes a script for each node.
To realize the complex transformation with a simple and straightforward program, we designed Æsop as a language which has following features: (1) Ability to select a script for a node, according to any complex condition satisfied by the node. (2) A rich set of built-in functions which enables to modify the document structure itself. (3) Ability to construct a 'process pipeline.' A 'process' is a set of scripts applied to the document tree structure through one traversal action. With Æsop, programmers can divide a complex transformation program to a series of simple processes. A typical Æsop program consists of one or more tree conversion processes and one data output process.
With a prototype processor of Æsop, we succeeded to transform a complex SGML document (written according to a DTD which is very similar to the ISO/IEC TR 9573-11 DTD) to LaTeX. Through this work, we had confirmed the effectiveness of Æsop for transformation from SGML documents containing complex math expressions and tables."
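The processing model in the abstract above (one depth-first traversal per "process", a script selected for each node by a condition, and processes chained into a pipeline) can be illustrated with a minimal Python sketch. This is not Æsop syntax, which the abstract does not show; the names `run_process` and `transform`, the sample element names, and the use of Python's standard `xml.etree.ElementTree` in place of an SGML parser's ESIS output are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def run_process(root, rules):
    """Run one 'process': a single depth-first traversal of the tree.

    rules is a list of (condition, script) pairs. For each node the first
    rule whose condition holds is selected and its script is executed.
    """
    def visit(node, ancestors):
        for condition, script in rules:
            if condition(node, ancestors):
                script(node)
                break
        # Copy the child list so that scripts may safely modify the tree.
        for child in list(node):
            visit(child, ancestors + [node])

    visit(root, [])


def transform(source):
    """Hypothetical two-stage pipeline: one tree conversion, one output."""
    root = ET.fromstring(source)
    out = []

    def retag(node):
        node.tag = "heading"

    # Process 1 (tree conversion): only titles directly inside a <sec>
    # become headings; the condition inspects the node's ancestors.
    convert = [
        (lambda n, anc: n.tag == "title" and bool(anc) and anc[-1].tag == "sec",
         retag),
    ]

    # Process 2 (data output): emit LaTeX-like strings for selected nodes.
    emit = [
        (lambda n, anc: n.tag == "heading",
         lambda n: out.append("\\section{" + (n.text or "") + "}")),
        (lambda n, anc: n.tag == "p",
         lambda n: out.append(n.text or "")),
    ]

    for process in (convert, emit):
        run_process(root, process)
    return out
```

Splitting the work into a conversion pass and an output pass keeps each rule set small, which is the point the abstract makes about dividing a complex transformation into a series of simple processes.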
Note: The above presentation was part of the "SGML Expert" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
[CR: 19950804]
Tauber, James K. "Abandon all hope, ye who enter. A TEI novice recounts his experiences marking up [Dante's] La Divina Commedia and the [UBS] Greek New Testament." Electronic Texts and the Text Encoding Initiative [Special Issue] = TEXT Technology: The Journal of Computer Text Processing 5/3 (Autumn, 1995) 225-234. ISSN: 1053-900X. Author's affiliation: Centre for Linguistics, University of Western Australia.
See the main entry for this special issue of TEXT Technology dedicated to the TEI, edited by Lou Burnard.
[CR: 19961111]
Taylor, Conrad. "What Has WYSIWYG Done to Us? [WYSIWYG Desktop Publishing Has Duped Us]." The Seybold Report on Publishing Systems 26/2 (September 30, 1996) [1], 3-12. ISSN: 0736-7260. Author's affiliation: Information Design Association, email: conrad@ideograf.demon.co.uk.
Abstract: "I argue that vendors of desktop publishing software are selling us short on quality typography; we have been duped by the mere illusion of typographic control. . . This is a paper which I wrote to support a lecture given at a conference in February 1996. It points out that WYSIWYG was only one of five approaches to computerised typesetting under development in the 1980's, but has come to dominate the world of typesetting today. But is it perhaps time to re-examine the virtues of TEX (with its superior H&J algorithms) and SGML (with its ability to carry generic mark-up into different environments)? What would this mean for divisions of labour and responsibility in typesetting? And is there any way of getting the vendors of DTP software to improve the typography and H&J algorithms of their products?" The author also concludes (among other things): "Generic markup needs a comeback." [SGML is discussed along the way]
Online version: http://www.datatext.co.uk/ideography/library/seybold/WYSIWYG.html, [mirror copy, text only version]. Or: the PDF version of the document. The French TeX user group GUT (Groupe des Utilisateurs de TeX) also intends to translate it for publication in Les Cahiers GUTenberg. See also Conrad Taylor's Ideography page for related documents.
[CR: 19971018]
Tetreault, Ronald. "Electrifying Wordsworth -- A Progress Report." Pages 164 - 167 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Author's affiliation: Dalhousie University, Email: tetro@is.dal.ca.
[Extract:] ". . . our copy-texts will be taken from the original editions themselves as held in libraries around the world, though of course our procedures will be informed by the findings of previous scholars, especially the editors of the Cornell Wordsworth series. Fourth, our e-texts will be "marked-up" or tagged using SGML (Standard Generalized Markup Language) in conformity with the principles of the Text Encoding Initiative (TEI). Fifth, we plan to link our transcribed e-texts to scanned images of the original printed editions in order to give the reader some sense of the look of the poems upon the page. Finally, this scholarly hypertext edition will be issued on CD-ROM in the first instance, with the intention of proceeding to network distribution as soon as it becomes practical."
Full abstract available online in HTML format: "Electrifying Wordsworth -- A Progress Report", by Ronald Tetreault; [archive copy]. See also a related description, "Electrifying Wordsworth".
Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.
[CR: 19971227 MD: 19971229]
Thompson, Henry S. "Element Type Hierarchies for Transparent Document Structure Definition." Pages 341-343 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Henry S. Thompson]: University of Edinburgh, HCRC Language Technology Group, 2 Buccleuch Place, Edinburgh EH8 9LW Scotland; Email: ht@cogsci.ed.ac.uk; WWW: http://www.ltg.ed.ac.uk/~ht/ .
Abstract: "Two recent proposals for meta-applications of XML (XML-Data and MCF) have included DTD fragments for describing document structure, sometimes called 'schemata'. In this paper I describe the XML-Data schemata proposal, concentrating on the motivation for and nature of the provision of an element-type hierarchy, in which element types can inherit attribute declarations and positions in content models from ancestors in the hierarchy. I argue that this represents a major improvement over the use of parameter entities to structure and maintain DTDs."
"Complex document types require rich and complex structural markup. SGML provides powerful mechanisms for defining the grammar of such markup, with element type and attribute declarations in the document type definition (DTD). The structure of the DTD itself, however, finds no explicit expression in SGML. The fact that element types are related in a structured fashion can only be represented implicitly, e.g., through the use of parameter entities. There is a real need, for ease of understanding and ease of maintenance, to address this issue. [...] The only coherent development policy in my view is to introduce things into the schema DTD which we know how to translate into vanilla XML. Not only does this guarantee inter-operability in the limit, but the translation serves to define the semantics of each part of the schema DTD in a concrete and unequivocal way."
This paper was delivered as part of the "Expert" track in the SGML/XML '97 Conference.
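The idea in the abstract above, that element types inherit attribute declarations from ancestors in a hierarchy and that the result can be translated into plain declarations, can be sketched as follows. The sketch does not use XML-Data syntax, which is not reproduced in this entry; the dictionary layout, the function names `inherited_attributes` and `attlist`, and the sample element types are assumptions for illustration only.

```python
def inherited_attributes(hierarchy, name):
    """Collect attribute declarations for an element type, ancestors first.

    hierarchy maps an element type name to a dict with an optional
    "parent" name and an optional "attributes" dict (name -> declaration).
    """
    chain = []
    seen = set()
    while name is not None:
        if name in seen:
            raise ValueError("cycle in element type hierarchy")
        seen.add(name)
        chain.append(name)
        name = hierarchy[name].get("parent")

    merged = {}
    # Walk from the most distant ancestor down, so that a more specific
    # type overrides a declaration inherited from an ancestor.
    for ancestor in reversed(chain):
        merged.update(hierarchy[ancestor].get("attributes", {}))
    return merged


def attlist(hierarchy, name):
    """Flatten the inherited attributes into one ATTLIST declaration."""
    attrs = inherited_attributes(hierarchy, name)
    body = " ".join(f"{attr} {decl}" for attr, decl in attrs.items())
    return f"<!ATTLIST {name} {body}>"


# Hypothetical hierarchy: "para" is a kind of "block".
example = {
    "block": {"parent": None, "attributes": {"id": "ID #IMPLIED"}},
    "para": {"parent": "block", "attributes": {"lang": "CDATA #IMPLIED"}},
}
```

Here `attlist(example, "para")` yields a single declaration carrying both the inherited `id` attribute and the locally declared `lang` attribute, the kind of flattening that would otherwise be maintained by hand with parameter entities.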
Henry S. Thompson is co-author of a paper "proposing a number of extensions to the XML document type declaration model, called XML-Data." Apropos of which: an early draft version of this SGML/XML '97 paper is available online in HTML format: "Why I demand Schemata: Element Type Hierarchies for Transparent Document Structure Definition." Dated: Oct 15 1997. [local archive copy].
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).
[CR: 19951220]
Thompson, Henry S.; Finch, Steve; McKelvie, David. The Normalized SGML Library (NSL). HCRC Technical Report, Ref. No. HCRC/TR-74. [LRE Project 62-050 Multext Workpackage 2 Milestone C D NSL: SGML Tools]. Edinburgh, Scotland: Human Communication Research Centre, November 14 1995. Extent: 38 pages, 2 references. Authors' affiliation: Human Communication Research Centre, University of Edinburgh, 2 Buccleuch Place, Edinburgh, Scotland. Email: eucorp@cogsci.ed.ac.uk.
Abstract: "This document describes the Normalised SGML Library (NSL), which consists of a set of C programs for manipulating SGML files and a C application program interface (API) designed to ease the writing of C programs which manipulate SGML documents."
Summary: "In pursuit of a development environment for SGML-based corpus and document processing, with support for multiple versions and multiple levels of annotation, LTG have developed an integrated set of SGML tools and a developers tool-kit, including a C-based API. This software described here contains everything required to process a very wide range of conformant SGML documents. Its initial parsing module incorporates v0.4 of James Clark's SP software, arguably the broadest coverage SGML parser available anywhere, commercial or not.
"The basic architecture is one in which an arbitrary SGML document is processed on the way in, as it were, yielding two results: 1) An optimised representation of the information contained in the document's DOCTYPE; 2) A normalised version of the document instance, which can be piped through any tools built using our API for augmentation, extraction, etc. The use of the cached DOCTYPE together with the normalisation of the SGML to nSGML means that applications processing nSGML streams can be very efficient.
"This document assumes that the reader is familiar with SGML [Goldfarb 90] and the C programming language [Kernighan 88]. The structure of this document is as follows. The next section introduces the NSL system. The third and fourth sections describe the user-callable utility programs provided in the NSL system. We then give an overview of the data structures used to represent SGML structure in the API, followed by an annotated example of the use of the NSL API in a complete program. In section 7 we give a description of the NSL query language which provides a convenient way of referring to elements of an SGML document, followed by an annotated program showing the use of the query language. The final three sections give a detailed description of nSGML and the data structures and functions defined in the NSL API." [from the document Introduction]
ftp://scott.cogsci.ed.ac.uk/pub/HCRC-papers/tr-74.ps.gz, or mirror copy, December 1995.
[CR: 19980430]
Thompson, Henry S.; McKelvie, David. "Hyperlink Semantics for Standoff Markup of Read-Only Documents." Page(s) 227-229 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Authors' affiliation: [Thompson]: Reader, Department of Artificial Intelligence and the Centre for Cognitive Science, Language Technology Group, University of Edinburgh, Scotland; Email: ht@cogsci.ed.ac.uk; WWW: http://www.cogsci.ed.ac.uk/~ht/ also, WWW: http://www.ltg.ed.ac.uk/software/; [David McKelvie]: Research Fellow, Language Technology Group, Human Communication Research Centre, University of Edinburgh, Scotland; Email: David.McKelvie@cogsci.ed.ac.uk; WWW: http://www.cogsci.ed.ac.uk/~dmck/.
Abstract: "There are at least three reasons why separating markup from the material marked up ('standoff annotation') may be an attractive proposition: 1) The base material may be read-only and/or very large, so copying it to introduce markup may be unacceptable; 2) The markup may involve multiple overlapping hierarchies; 3) Distribution of the base document may be controlled, but the markup is intended to be freely available.
"In this paper, two kinds of semantics for hyperlinks are addressed to facilitate this type of annotation, and describe the LT NSL toolset that supports these semantics. The two kinds of hyperlink semantics that are described are (a) inclusion, where one includes a sequence of SGML elements from the base file; and (b) replacement, where one provides a replacement for material in the base file, incorporating everything else. The speakers address the issue of different kinds of (HyTime and TEI) addressing schemes by means of SGML identifiers, URLs, and character offsets into non-SGML data. We also address the issues of indexing large files to improve the speed of accessing SGML elements in the base files."
A version of this document is available online in HTML format: http://www.ltg.ed.ac.uk/~ht/sgmleu97.html; [local archive copy]. Alternately, abstract in GCA-paper markup: http://www.ltg.hcrc.ed.ac.uk/~dmck/sgml-europe-97.html; [local archive copy]. On the use of a hierarchical database to model (non-) hierarchical structures, see "SGML/XML and (Non-) Hierarchy."
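The two hyperlink semantics named in the abstract above, inclusion and replacement, can be illustrated with a small Python sketch that addresses a read-only base text by character offsets (one of the addressing schemes the abstract mentions). This is not the LT NSL toolset or its API; the function names, the tuple layouts, and the sample text are assumptions made for illustration.

```python
def resolve_inclusion(base_text, links):
    """Inclusion: pull only the addressed spans out of the base text.

    links is a list of (label, start, end) character offsets; the base
    text itself is never modified.
    """
    return [(label, base_text[start:end]) for label, start, end in links]


def resolve_replacement(base_text, replacements):
    """Replacement: substitute the addressed spans, keep everything else.

    replacements is a list of (start, end, new_text) tuples with
    non-overlapping character offsets into the base text.
    """
    pieces = []
    cursor = 0
    for start, end, new_text in sorted(replacements):
        if start < cursor:
            raise ValueError("overlapping replacement spans")
        pieces.append(base_text[cursor:start])  # untouched base material
        pieces.append(new_text)                 # material from the standoff file
        cursor = end
    pieces.append(base_text[cursor:])
    return "".join(pieces)


# Hypothetical read-only base text.
BASE = "the cat sat on the mat"
```

With this base, `resolve_inclusion(BASE, [("noun", 4, 7)])` returns only the addressed span, while `resolve_replacement(BASE, [(4, 7, "<w>cat</w>")])` returns the whole base text with the annotated span substituted. In both cases the annotation lives apart from the base file, which is what makes the approach suitable for read-only or access-controlled material.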
Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.
[CR: 19961226]
Thompson, Marcy. "An Element is not a Tag - (and why you should care)." Pages 65-70 in SGML '96 Conference Proceedings. Celebrating a Decade of SGML. SGML '96 Conference, Boston, MA, November 18-21, 1996. Sponsored by The Graphic Communications Association (GCA). [Edited by] Conference Co-Chairs: B. Tommie Usdin and Deborah A. Lapeyre. Alexandria, VA: GCA, 1996. Extent: 711 pages. Author's affiliation: Passage Systems, Email: marcy@squirrel.com; WWW: http://www.squirrel.com.
Abstract: "Too many people say 'tag' when they mean 'element'. While this might seem to be just semantic quibbling, the difference is actually important. The power of SGML-based processing lies precisely in the fact that an element is more than a tag. By examining three systems that exploit the power of SGML to allow sophisticated actions on content, this talk shows that understanding an element as more than just the tags that delimit it is a critical part of exploiting the full power of SGML."
Note: The above presentation was part of the "SGML Newcomer" track at SGML '96. The SGML '96 Conference Proceedings volume containing the full text of the paper may be obtained from GCA.
[CR: 19971227]
Thompson, Marcy. "How to Make an Industry Standard DTD Work for You (without losing your mind, your marriage or your job)." Pages 71-76 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Marcy Thompson]: CRI Inc., 3245 146th Place SE Suite 270, Bellevue, WA 98007; Phone: +1 425 643-7443 x3027; Email: marcy@squirrel.com.
Abstract: "Implementing SGML is a big task, and one of the obstacles to be overcome is the development of an appropriate DTD or suite of DTDs. In many industries, there are high-profile 'industry standard' DTDs (developed by an industry consortium or a formalized standards activity) which hold out the promise of DTD nirvana: all gain with no pain. To what extent can an industry standard DTD help you achieve your implementation goals? What pitfalls must you avoid in order to prevent this nirvana from becoming just another failed SGML implementation?"
This paper was delivered as part of the "Newcomer" track in the SGML/XML '97 Conference.
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).
[CR: 19971227]
Tidwell, Doug. "TaskGuides(tm): An XML-Based System for Creating Wizard-Style Helps." Pages 663-668 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Doug Tidwell]: Advisory Programmer, IBM Corporation E20D/500, P.O. Box 12195, Research Triangle Park, NC USA; Phone: +1 (919) 254-5128; FAX: +1 (919) 543-4118; Email: dtidwell@us.ibm.com.
Abstract: "IBM's TaskGuide technology gives Technical Writers and Human Factors professionals the ability to create wizards. Based on the premise that task analysis is the most difficult part of creating an effective wizard, our tools let you focus on design, not writing code.
"This paper discusses the basics of wizard technology, followed by a brief introduction to the XML-based system we have created. We cover some of the key design decisions we had to make, and introduce some of the unique features of our product. Finally, we demonstrate a recursive document, a wizard that creates another wizard."
This paper was delivered as part of the "Case Studies" track in the SGML/XML '97 Conference.
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).
[CR: 19980924]
Tittel, Ed; Mikula, Norbert; Chandak, Ramesh. XML for Dummies. Foreword by Dan Connolly. [Series: For Dummies]. Foster City, CA: IDG Books Worldwide, Inc., 1998. Extent: xxviii + 367 pages, CDROM. ISBN: 0-7645-0360-X. Authors' affiliation: [Tittel]: Tivoli Systems, etittel@lanw.com; [Mikula]: Senior Software Engineer, Datachannel, Inc, norbert@datachannel.com; [Chandak]: rksoftware@worldnet.att.net.
Summary: "XML For Dummies takes you through a basic overview of XML -- its capabilities, syntax, and technologies -- before moving into useable information and step-by-step methods for designing, building, and using XML's extensible features. XML's special 'dialects' support advanced tools for using push technology, building dynamic interfaces, and managing or transmitting data across the Web. And freeware and trial software versions of XML software packages, tips for finding online XML resources, a cross-linked glossary, code examples from the book, and other cool features are included on the bonus CD-ROM that comes with this indispensable guidebook." [from the publisher]
A review of the book was published by Dianne Kennedy. ". . . Overall, I found XML for Dummies to be a good addition to my reference library. It clearly will have more value to those who are using HTML rather than SGML as their starting point. SGML folk will likely find many of the SGML-oriented discussions too simplistic. In addition, they may find Chapter 6, which is based on using XML schemas in place of DTDs, rather confusing. But the good discussion of how to read the XML specification and the excellent XML application DTDs makes this a book worth buying, no matter what your background is."
An overview of the book is presented on the LANWrights, Inc. Web site. See also the dedicated web site for the book, with detailed chapter summaries, URL collections, examples, and other resources.
[CR: 19971125]
Toche, Olivier; Melese, Bertrand. "Access to Cultural Heritage through an On-line Multimedia Data Service: Application to the Archive Folders of France's General Inventory of Monuments and Art Treasures." Page(s) 277-282 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Authors' affiliation: [Olivier Toche]: French Ministry of Culture, Heritage Management; WWW: http://aquarelle.inria.fr; [Bertrand Melese]: President and Founder, GRIF SA, France.
Abstract: "This document presents the European Aquarelle project and the missions and the documentation system of the General Inventory. It then examines one of the first applications of this research project with SGML tagging of a digital version of Inventory archive folders dealing with France's monuments and art treasures."
"The technical and documentary specifications and standards selected are TCP/IP for internal and external networks, HTML for pages of text, SGML (Standard Generalized Markup Language, ISO 8879) for digitised content folders and the Z39.50 request protocol for access to data bases, standards ISO 2788 and 5964 for drawing up monolingual and multilingual thesauri, and the CIMI (Consortium for the Computer Interchange of Museum Information) DTD and the Inventory DTD for applications respectively relating to museums/art galleries and monuments."
Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.
[CR: 19971018]
Tompa, Frank. "Capitalizing on Text Structures. [Keynote Address]." Pages 170 - 171 in ACH-ALLC '97. The 1997 Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing. Conference Abstracts. ACH-ALLC '97. Queen's University at Kingston, Ontario, Canada. June 3 - 7, 1997. Compiled by Greg Lessard and Michael Levison. Ontario, Canada: Queen's University, 1997. ISBN: 0-88911-760-8. Author's affiliation: Department of Computer Science, University of Waterloo, Waterloo, Ontario; Email: fwtompa@uwaterloo.ca; WWW: Frank Wm. Tompa's Home Page.
[Extract:] "Scholarship increasingly depends on electronic document repositories and the growth of digital libraries. As in physical libraries, the documents to be housed in scholarly collections include historical documents, literary works, reference texts, and government publications. Even more apparent in computer-readable form are collections of business documents (from annual reports and customer literature to procedures manuals and internal communications) and linguistic corpora (collections of spoken and written communication assembled to reflect the uses of language). Gray literature, including technical reports, personal communications, and online help information, also constitutes a growing text resource. SGML provides a method to describe the structure of a complex document in which components, layout, or other chosen features of the text are indicated through markup. The TEI Guidelines use SGML to define a set of comprehensive conventions for representing documents, and thus they establish a basis for scholarly communications. HTML defines another set of tags to delineate text structures. Beyond text representation, however, communications support also requires mechanisms for querying and manipulating structured documents."
Abstract available online in HTML format: "Capitalizing on Text Structures. [Keynote address]", by Frank Tompa; [archive copy]
Additional information on the ACH-ALLC '97 Conference is available in the SGML/XML Web Page main conference entry, or [August 1997] via the Queen's University WWW server.
[CR: 19951113]
Tompa, Frank Wm. Experiences with the OED. University of Waterloo Centre for the New OED and Text Research, Technical Report. Waterloo, Ontario: University of Waterloo Centre for the New OED and Text Research, 1991. Extent: 9 pages, 11 references. Author's affiliation: University of Waterloo, Waterloo, Ontario, Canada N2L 3G1; Email: fwtompa@uwaterloo.ca; Tel: (519) 888-4675; FAX: (519) 885-1208.
"Abstract: According to the Oxford English Dictionary, a dictionary can be either 'a book dealing with the individual words of a language...' or 'a repository of knowledge, convenient for consultation.' An effective dictionary database must serve both roles simultaneously; that is, it must be capable of answering precise questions about the written dictionary text as well as the language described by that text.
An effective representation for the OED has been based on the recent text structuring technique known as 'descriptive markup,' which introduces tags into a text stream. Thus, dictionary components are explicitly identified and delimited, so that, for example, an entry is marked by <E>...</E>, an etymology by <ET>...</ET> , a usage label by <LB>...</LB> , and a cited work by <W>...</W>.
The most visibly successful aspect of our research is embodied in the flexible and efficient search and display software. LECTOR (TM) is a general purpose browser that takes as input a stream of tagged text and formats it to the screen using typography to illustrate its structure. It uses a specially-designed formatting, or display-specification, language to accomplish this, through which the choice of typographical strategies is user-selectable. As a complementary software component, efficient retrieval is provided by the PAT (TM) text search engine. Each entry in the search index designates a 'semi-infinite' string that starts at a critical point in the text (e.g., at a word start) and continues uninterruptedly to the end of the text. Text regions (e.g., those representing individual dictionary components) can be specified to limit the scope of material being searched or displayed. Used together, PAT and LECTOR form a powerful query facility for text databases.
Examples drawn from our experiences with researchers and casual visitors illustrate the application of these tools to exploring the OED."
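The index described in the abstract above, in which each entry designates a "semi-infinite" string starting at a word start and running to the end of the text, can be approximated by a sorted list of start positions. The sketch below is not the PAT engine; the abstract does not describe its internal data structure, so the sorted-list representation, the function names `build_index` and `search`, and the `region` parameter are assumptions made for illustration.

```python
import bisect
import re


def build_index(text):
    """Return word-start positions ordered by the string that begins there.

    Each position stands for the 'semi-infinite' string running from that
    word start to the end of the text.
    """
    starts = [match.start() for match in re.finditer(r"\w+", text)]
    return sorted(starts, key=lambda pos: text[pos:])


def search(text, index, prefix, region=None):
    """Find all indexed positions whose string begins with prefix.

    region, if given, is a (start, end) pair of character offsets that
    limits the scope of the search, e.g. to one dictionary component.
    """
    # Truncating the sorted strings to len(prefix) keeps them in order,
    # so binary search applies. A real engine would compare lazily
    # instead of building this list for every query.
    keys = [text[pos:pos + len(prefix)] for pos in index]
    lo = bisect.bisect_left(keys, prefix)
    hi = bisect.bisect_right(keys, prefix)
    hits = sorted(index[lo:hi])
    if region is not None:
        start, end = region
        hits = [pos for pos in hits if start <= pos < end]
    return hits
```

Because every indexed string continues to the end of the text, a query may span several words: searching for a multi-word phrase is the same prefix lookup as searching for a single word.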
The document is available in Postscript format on the Internet: http://daisy.uwaterloo.ca/~fwtompa/.papers/hist.dict.ps [mirrored copy, November 1995]. The document was also (?) published under the title "An Overview of Waterloo's Database Software for the OED", as pages 123-143 in Proceedings of the Symposium on Historical Dictionary Databases and Data Retrieval Requirements, Toronto, October, 1991 [= CCH [Toronto Centre for Computing in the Humanities] Working Papers 2, 1992].
[CR: 19951113]
Tompa, Frank Wm. "Not Just Another Database Project: Developments at UW [University of Waterloo Centre for the NOED]." Pages 82-89 in Reflections on the Future of Text. Proceedings of the Tenth Annual Conference of University of Waterloo Centre for the New OED and Text Research. University of Waterloo NOED Conference, Waterloo, Ontario. October 20-21, 1994. Waterloo, Ontario: University of Waterloo Centre for the NOED and Text Research, 1994. Author's affiliation: University of Waterloo Centre for the New OED and Text Research, Ontario.
Available in Postscript format on the Internet: http://daisy.uwaterloo.ca/~fwtompa/.papers/oed94.ps, mirrored copy.
[CR: 19951110]
Tompa, Frank W. "What is (Tagged) Text?" [Volume] 2:81-93 in Dictionaries in the Electronic Age: Proceedings of the Fifth Annual Conference of the UW Centre for the New Oxford English Dictionary (St. Catherine's College, Oxford, 18-19 September 1989.) Waterloo, Ontario: UW Centre for the New OED, 1989.
"Abstract: In working on the New OED project, we, like many other researchers, have wrestled with large, intricate bodies of text. Based on this exposure, we have begun to investigate the similarities and differences between managing conventional business data and managing reference text data.
The paper begins with the claim that text can support complex models of the real world that cannot be captured more formally. Thus important information resources must be held as text, but the very absence of a formal model makes it difficult to identify the structures present in a text.
A common text structuring technique is descriptive markup, which introduces tags into a text stream. We present three views of tagged text: one based on tags as text, one on arbitrarily interleaved tags with text, and one on constrained tag placement in the text. Throughout the discussion, examples are drawn from our experience with the OED."
Available on the Internet in Postscript format: [mirror copy]. For further details on the work of the Waterloo Centre for the New OED and Text Research, including SGML research, see extended overview for a publication by Gaston Gonnet.
[CR: 19951113]
Tompa, Frank Wm.; Raymond, Darrell R. "Database Design for a Dynamic Dictionary." Pages 257-272 (with 12 references) in Research in Humanities Computing I: Selected Papers from the 1989 ALLC/ACH Conference, Toronto. Association for Literary and Linguistic Computing, 16th International Conference; International Conference on Computers in the Humanities, 9th. Toronto, Ontario. June, 1989. Sponsored by ACH/ALLC. Guest edited by Ian Lancashire; Series editors: Susan Hockey and Nancy Ide. Oxford: Clarendon [Oxford University] Press, 1991. Authors' affiliation: University of Waterloo Centre for the New OED and Text Research, Ontario.
The article supplies an overview of the NOED project in broad scope, including some discussion of SGML's limitations with respect to the data modeling goals of the UWaterloo researchers.
Available via the Internet in Postscript format: http://daisy.uwaterloo.ca/~fwtompa/.papers/dynamic.ps [mirrored copy, November 1995].
[CR: 19960312]
Travis, Deni C. "Marmalade [Tribute to Yuri Rubinsky]." <TAG> 9/2 (February 1996) 3. ISSN: 1067-9197.
This tribute is printed in a special issue of <TAG> dedicated to the memory of Yuri Rubinsky. See also the main eulogy collection.
[CR: 19951220]
Travis, Deni. "Rocky Mountain SGML UG." <TAG> 8/12 (December 1995) 12. ISSN: 1067-9197.
Travis reports on the Rocky Mountain SGML Users' Group meeting, November 1995. Richard Pasewark (Adobe) gave a presentation on FrameMaker+SGML, and Eric Severson (Interleaf) presented a paper "How SGML and HTML Really Fit Together." Contact for the UG: Beth Hayes, bethh@lexisys.com.
[CR: 19960310]
Travis, Brian E. Activate OmniMark. Net-Virtual Location in Cyberspace [probably Denver, Colorado or Rochester, New York]: The SGML University Press, [forthcoming, second-quarter, 1996]. ISBN: 0-9649602-1-4. Author's affiliation: The SGML University.
Abstract: "This book, due second-quarter, 1996, is for the real-world SGML programmer who is using Exoterica's OmniMark translation utility. The author has been using OmniMark since it first came out. This is a collection of tips and techniques for using OmniMark in actual implementations to convert data into SGML, and to translate SGML documents into something else for delivery. A "quick-start" chapter is included to get you up-to-speed right away. Activate OmniMark includes up-to-date information on the newest version of OmniMark."
See further information via the SGML University Press WWW Page.
[CR: 19960325]
Travis, Brian E. "Cal Poly Offers University-level 'Electronic Publishing' Focus." <TAG>: The SGML Newsletter 9/3 (March 1996) 1, 5. ISSN: 1067-9197. Author's affiliation: The SGML University.
Note on a decision by Cal Poly to offer courses with a concentration in "Electronic Publishing and Imaging." The first course will be taught in July-August, 1996.
[CR: 19961029]
Travis, Brian E. "My Summer at Cal Poly [Editorial]." <TAG> 9/9 (September 1996) 1, 10. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc., and Managing Editor of <TAG>.
The author summarizes a summer teaching experience at Cal Poly, San Luis Obispo, California. The school has always had a strong program in printing (industry) arts, and is now developing a concentration in electronic publishing. Travis relates his story about the introduction of SGML into the training classes.
[CR: 19960828]
Travis, Brian E. "Classification of SGML Industry Professionals [Editorial]." <TAG> 9/8 (August 1996) 1, 4. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc.
The author notes that certification of professionals offering SGML services has made little progress, but some steps are being taken to form classification labels.
[CR: 19980508]
Travis, Brian E. "It's Conference Season." <TAG>: The SGML Newsletter 11/4 (April 1998) 4. ISSN: 1067-9197. Author's affiliation: President, Information Architects; Managing Editor, <TAG>.
The author provides an overview of several recent conferences on XML, SGML, and document management. Spring 1998: XML: The Conference 1998; Documation '98 West; and Seybold Seminars New York / Publishing '98, including the first XML Xposed conference. For the XML Xposed conference, some transcripts are available online; see the database entry.
[CR: 19960716]
Travis, Brian E. "Documation '96: SGML Hangs In There." <TAG>: The SGML Newsletter 9/4 (April 1996) 8-11. ISSN: 1067-9197. Author's affiliation: President, Information Architects, Inc.
The author reports on the Documation '96 conference, with highlights on SGML developments. Featured in the summary are: XSoft (Astoria, object-oriented SGML database); Exoterica (OmniMark Version 3), Folio-InContext (The SGML Journal Publisher), and Texcel (Texcel Information Manager).
[CR: 19951208]
Travis, Brian E. "Don't Deliver SGML [Editorial]." <TAG>: The SGML Newsletter 8/11 (November 1995) 1, 8. ISSN: 1067-9197. Author's affiliation: President, Information Architects, Inc.
[The author concludes:] "The moral of this story is that you don't need to assume that, since your data is in SGML, you need to use an 'SGML-smart' delivery platform. Doing so will limit your choices, and could have an adverse impact on your data, your propriety, and even on the operation of your company."
[CR: 19980612]
Travis, Brian E. "Don't Use .xml File Extensions." <TAG>: The SGML Newsletter 11/5 (May 1998) 1, 12. ISSN: 1067-9197. Author's affiliation: President, Information Architects, and Managing Editor of <TAG>.
Having observed the increased frequency of Net files with the filename extension .xml, the author reminds <TAG> readers that XML is a metalanguage and not a language, and says: "using '.xml' as an extension doesn't tell us what kind of a file it is . . ."
See the database section XML Media/MIME Types for 'File extension(s): .xml' and related discussion.
[CR: 19971205]
Travis, Brian E. "Flux [Editorial]." <TAG> 10/11 (November 1997) 1, 6. ISSN: 1067-9197. Author's affiliation: President, Information Architects.
Reflections on XML, its rapid rise in popularity, and its relative instability - giving it a lot of promise in the midst of flux, but making it "dangerous" for a developer to tie a project to the emerging specification in terms of details.
[CR: 19950716]
Travis, Brian E. "HTML is Not SGML [Editorial]." <TAG> 8/6 (June 1995) 1, 6. ISSN: 1067-9197.
The article expresses skepticism about the effort to make HTML a substitute for SGML: "HTML is just another output format. The IETF needs to treat it as such and stop pretending it is SGML."
[CR: 19980719]
Travis, Brian E. "Latin Ergo SGML? [Editorial]" <TAG>: The SGML Newsletter 11/7 (July 1998) 1, 3. ISSN: 1067-9197. Author's affiliation: President, Information Architects.
Travis compares SGML to Latin, insofar as Latin, a "dead language," is also healthy as a legacy language. "To someone who asks me if they should use SGML or XML for their document management system, I find it difficult to recommend SGML, except in some very distinct cases: 1) They need to interface with someone else's SGML; 2) They need a particular tool that is not now XML-enabled; 3) Their company is already using SGML in another implementation. [. . . SGML ] is still a great technology for describing the structure of your information, and there are many companies that use SGML to do so. Just don't teach it to your kids."
[CR: 19980413]
Travis, Brian E. "Leaders and Followers [Editorial]." <TAG>: The SGML Newsletter 11/3 (March 1998) 1, 3. ISSN: 1067-9197. Author's affiliation: President, Information Architects.
The author comments on the "self-appointed keepers of the purity of SGML" in the context of broader discussion of XML's need for balance (academic and/versus commercial influence) and openness (not having the standards process "co-opted").
[CR: 19981007]
Travis, Brian. "The New <TAG>." <TAG>: The SGML Newsletter 11/9 (September 1998) 1, 3. ISSN: 1067-9197. Author's affiliation: Architag.
As publisher and editor of <TAG>, Travis outlines a new plan for the newsletter publication, beginning in Fall 1998. Issue 11/8 will not be published. New services will be offered to subscribers, and the newsletter will be available electronically. See the new URL, http://www.tagnewsletter.com.
[CR: 19970826]
Travis, Brian E. OmniMark At Work Volume 1: Getting Started. Englewood, CO: SGML University Press, 1997. Extent: xiii + 503 pages, CD-ROM disc. ISBN: 0-9649602-1-4. Author's affiliation: Information Architects; The SGML University.
Summary: "The book is targeted at programmers who are new to the language, and to experienced programmers who are new to version 3. OmniMark At Work starts with a chapter called "OmniMark for the Impatient". This chapter is designed for the person who needs to understand the concepts of OmniMark, and who wants to get a feeling for the language without spending time drudging through reference manuals to get started. This chapter has several programs that are highly documented, to show you what is happening at every step. There are plenty of tips throughout the book, and lots of code that you can integrate into your OmniMark programs. There are routines for converting RTF to SGML, translating SGML to HTML, examples of internal and external functions, transforming one SGML structure to another, creating "well-formed documents" for XML, and many more. . ." [from the author]
See the online description of the book from the SGML University Press; [archive copy].
[CR: 19970207]
Travis, Brian E. "Predictions for 1997 [Editorial]." <TAG>: The SGML Newsletter 10/1 (January 1997) 1, 5-6. ISSN: 1067-9197. Author's affiliation: President, Information Architects, Inc.
The author reviews predictions relating to SGML for 1996, and offers new predictions for 1997. For 1997: XML [slow acceptance], the Web, DSSSL ['high-profile applications'], HyTime [lethargic acceptance], and 'Niche Books'.
[CR: 19980203]
Travis, Brian E. "Predictions for 1998 [Editorial]." <TAG>: The SGML Newsletter 11/1 (January 1998) 1, 8-9. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc.; Managing Editor, <TAG>.
Brian Travis reviews his predictions for the calendar year 1997 (published in <TAG>), and offers further predictions for 1998. His top picks: XML big-time, XSL advances, SGML remains stable, name changes away from S-G-M-L, and more books on XML.
[CR: 19970824]
Travis, Brian E. "The Role of the Application in Book Production." <TAG>: The SGML Newsletter 10/8 (August 1997) 1-8. ISSN: 1067-9197. Author's affiliation: President, Information Architects; Managing Editor, <TAG>.
The article is an extract from the author's book OmniMark at Work, Volume 1: Getting Started [SGML University Press, 1997]. The article discusses the techniques "used to create, edit, and print the book [OmniMark at Work], along with some code samples from the book processing."
[CR: 19961113]
Travis, Brian E. "SGML Asia/Pacific '96." <TAG> 9/10 (October 1996) 10-12. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc.
A summary of the closing keynote address given by Brian Travis at the SGML Asia/Pacific '96 Conference, which attracted more than 160 participants. Observations on the important trends: SGML-based databases (Chrystal Software - Astoria, Texcel, XyVision, OmniMark); document management; virtual documents; mainstream SGML and W3C (XML) SGML; SGML and the Web.
See the conference entry for other information.
[CR: 19971229]
Travis, Brian E. "SGML (Alone) is Not the Answer." <TAG> 10/11 (November 1997) 1-6. ISSN: 1067-9197. Author's affiliation: President, Information Architects.
An earlier draft title "SGML is Not the Solution" gave way to the present title, developed along the following lines: "SGML works best when it is applied properly to certain solutions, along with other tools and technologies." Travis discusses the use of SGML in conjunction with databases. He provides references to case studies from the aircraft industry, legal publishing, and newspaper publishing.
Also presented at the SGML/XML '97 Conference.
[CR: 19971227 MD: 19971229]
Travis, Brian. "SGML (Alone) is Not the Solution." Pages 519-526 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Brian Travis]: President, Information Architects, Inc., 6989 S. Jordan Road, Suite 5, Englewood, CO 80112; Phone: +1 303-766-1336; FAX: +1 303-699-8331; Email: btravis@sgml.com; WWW: http://www.sgml.com.
Abstract: "SGML is a great technology. It has attracted the attention of some pretty influential companies, which have found that they can save money, get to market faster, and increase the accuracy of their documentation by using SGML.
"However, SGML by itself is not the answer. SGML can only work if it is part of an intelligent document management environment that utilizes other appropriate technologies.
"This talk is about the mixing of SGML and other technologies, like relational and object-oriented databases, internet and intranet servers, email, voice mail, and external protocol servers, and other new and old technologies. It ends with a methodology, called 'microdocument architectures', that can pull all of these technologies together to create an intelligent document management environment.
"You will leave this session with a better understanding of where SGML can fit, and where it might not necessarily be the best solution. You will have the ammunition to convince your company that SGML should be part of an intelligent document management system and how you might go about integrating SGML with other technologies."
This paper was delivered as part of the "Business Management" track in the SGML/XML '97 Conference.
A version of this presentation is available in the November 1997 issue of <TAG>; see the bibliographic entry.
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).
[CR: 19971227]
[Travis, Brian]. "SGML and the Desktop. SGML Tools on Low-end Publishing Systems Explored at Seybold Conference." <TAG> 6/5 (May 1993) 1, 4. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc.
The author supplies a report on the Seybold Publishing Conference held in April 1993, and particularly, on a special panel session chaired by Yuri Rubinsky (SoftQuad). The panel speakers addressed the role of SGML in desktop publishing systems.
[CR: 19961113]
Travis, Brian E. "SGML for the Masses? [Editorial]." <TAG> 9/10 (October 1996) 1, 13-14. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc.
The author addresses the question of whether SGML is "too hard" to implement, and the vendor-initiated concept of "Mainstream SGML." These vendors "do not want SGML reduced to a weekend project . . .[but] want to convince mainstream users that SGML is not really that difficult to adopt in an organization." The author also discusses the current XML effort sponsored by the W3C, and expresses some doubts about the viability of the endeavor: "Mainstream SGML and XML both address technical issues that have already been solved, and do nothing to enlighten the publishing community as to the real advantages of the SGML philosophy."
For "Mainstream SGML," see the Microstar WWW server, or Microstar White Paper; [mirror copy], or a short description of the effort "to make it simple for authors to create and maintain SGML-savvy documents". For the XML activity, see the main XML entry in this database.
[CR: 19960716]
Travis, Brian E. "SGML and Metadata [Editorial]." <TAG>: The SGML Newsletter 9/6 (June 1996) 1, 6. ISSN: 1067-9197. Author's affiliation: President of Information Architects, Inc.
The author muses on the important notion of "meta-data," by which he means "the ability to respect or ignore certain [SGML] elements based upon their descriptive markup."
[CR: 19970620]
Travis, Brian E. "SGML in the Pacific Rim." <TAG>: The SGML Newsletter 10/5 (May 1997) 14. ISSN: 1067-9197. Author's affiliation: President of Information Architects, Inc.
Notes on a series of conferences in Sydney and Tokyo. Hot topics were CALS, XML, and multi-byte character set support in SGML software tools.
[CR: 19971106]
Travis, Brian E. "[Editorial] 'SGML: The Philosophy' Just Got Another Name." <TAG>: The SGML Newsletter 10/10 (October 1997) 1, 3. ISSN: 1067-9197. Authors' affiliation: .
The author reflects upon the rise of XML (as evidenced by the Seybold SF '97 Conference) and what it means for SGML.
[CR: 19960716]
Travis, Brian E. "SGML in the Summertime [Editorial]." <TAG>: The SGML Newsletter 9/6 (July 1996) 1, 12. ISSN: 1067-9197. Author's affiliation: President of Information Architects, Inc.
A note offering some suggestions for appropriate "summertime" SGML projects.
[CR: 19960206]
Travis, Brian E. "SGML Predictions [1995 and 1996; editorial]." <TAG>: The SGML Newsletter 9/1 (January 1996) 1, 7-8. ISSN: 1067-9197. Author's affiliation: Brian Travis is President of Information Architects, Inc, and Managing Editor of <TAG>.
Travis provides an overview of SGML's progress in 1995 and nominates several topics as "hot" for 1996: DSSSL implementations; increased use of SGML in Asia; specialized applications integrating SGML components into their import and export facilities; more books on SGML.
[CR: 19950716]
Travis, Brian E. "Tables in SGML. A proposal for intelligent handling of tabular data." <TAG> 6/6 (June 1993) 1-5. ISSN: 1067-9197.
Part I of a two-part article. Describes "how the SGML NOTATION function can be used to process tables".
[CR: 19950716]
Travis, Brian E. "Tables in SGML. A proposal for intelligent handling of tabular data, Part II." <TAG> 6/7 (July 1993) 1-5. ISSN: 1067-9197.
Part II of a two-part article. Describes how the theoretical model (Part I) would be implemented in a "real life" system.
[CR: 19970620]
Travis, Brian E. "Ten Years of <TAG>: The SGML Newsletter [Editorial]." <TAG>: The SGML Newsletter 10/5 (May 1997) 1, 3. ISSN: 1067-9197. Author's affiliation: President of Information Architects, Inc.
A retrospective on ten years of publishing <TAG>: The SGML Newsletter, which was founded by Sharon Adler, William Davis, and Dale Waldt. The publishers now offer online copies of articles 18 months and older: http://tag.sgml.com/.
[CR: 19971230]
Travis, Brian E. "Use Care in Selecting Your Consultant [Editorial]." <TAG>: The SGML Newsletter 10/12 (December 1997) 1, 3-4. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc.
Based upon personal experience and years of observation, the author constructs principles to guide clients in the process of selecting a consultant, or a consulting team.
[CR: 19950716]
Travis, Brian. "Using SGMLS and Awk as an Inexpensive Translator [SGML Tips & Techniques]." <TAG> 7/7 (July 1995) 10-11. ISSN: 1067-9197.
The author shows how a simple AWK script can be used to transform SGMLS output into a more useful notation. AWK, however, is slow and subject to line-length limitations.
[CR: 19960325]
Travis, Brian E. "The World's Cheapest SGML Database Management System." <TAG>: The SGML Newsletter 9/3 (March 1996) 1-5. ISSN: 1067-9197. Author's affiliation: The SGML University.
"Information Architects and SGML University proudly announce the availability of the World's Cheapest SGML Database Management System (TWCSDBMS). The product, version 1.0d1, is available for download now. This is a developmental release, and might not ever be updated from its current state. The download file is 3.4MBytes because of the overhead that Visual Basic requires. It runs on Windows 95 or Windows NT. The product is designed as a learning tool to be used to understand the nature of SGML database management. Because of its hierarchical nature, SGML requires a hierarchical database schema in order to express the relationships between the elements." [from the server] The tool is available at this URL: see the description.
The <TAG> article is available online in HTML format from the SGML University WWW server: http://www.sgml.com/tag/9030101.htm [mirror copy, text only].
[CR: 19970207]
Travis, Brian E. "XML: Evil or Necessary?" <TAG>: The SGML Newsletter 9/12 (December 1996) 1, 10. ISSN: 1067-9197. Author's affiliation: President, Information Architects, Inc.
The author discusses the cautious misgivings expressed in an earlier editorial, and concludes that the XML (Extensible Markup Language) is a healthy necessity as part of the SGML revision process.
[CR: 19971014]
Travis, Brian E. "XML: SGML Without the Installed Base [Editorial]." <TAG>: The SGML Newsletter 10/9 (September 1997) 1, 5-6. ISSN: 1067-9197. Author's affiliation: .
The author's editorial discusses the trade-offs he sees in the XML effort (vis-à-vis SGML), concluding that the two are not in competition. The HTML community (for whom XML is designed) may be seen as the "installed base" for the new markup language.
[CR: 19950716]
Travis, Brian E; Travis, Deni C. "SGML Europe '95: Back to Gmunden." <TAG> 8/6 (June 1995) 1-5. ISSN: 1067-9197.
A report on the SGML Europe '95 conference held in Gmunden, Austria. Product announcements/demonstrations discussed are: Panorama Free, Synex Viewport, GRIF HyTime engine, EBT DynaText support for DSSSL, DRUID-Author, Stilo (SGML-smart editor), etc. See further the database entry for this conference.
[CR: 19980612]
Travis, Brian E.; Hahn, Michael. "HTML, SGML, PDF, XML: What's the Difference?" <TAG>: The SGML Newsletter 11/5 (May 1998) 1-4. ISSN: 1067-9197. Authors' affiliation: [Travis]: President, Information Architects, and Managing Editor of <TAG>; [Hahn]: Senior Consultant, Information Architects.
An introduction to four related document computing technologies (HTML, SGML, PDF, XML), published in the form of a white paper. The goals, strengths, and weaknesses of each are presented.
Travis, Brian E.; Waldt, Dale C. "Case Study: How Our Book Was Produced." <TAG>: The SGML Newsletter 8/5 (May 1995) 1-5. ISSN: 1067-9197. Authors' affiliation: Brian Travis is President of Information Architects, Inc, and Managing Editor of <TAG>; Dale Waldt is the co-founder and publisher of <TAG>, and Data Development Manager with the Research Institute of America.
"This article is excerpted from The SGML Implementation Guide, by Brian Travis and Dale Waldt. The authors included this case study, among the other case studies in the book, as an example of the kind of document database that could be built using a small amount of money and a little knowledge about the tools that are available to the SGML implementor. The book was produced using the concepts discussed within the book, and this case study outlines some of the methods that were used."
See the full bibliographic entry for further details about the book. The book's table of contents and sample chapter are available online from the authors' WWW site, or (in part) via mirror copy here.
[CR: 19960312]
Travis, Brian; Waldt, Dale. "In Memorium: Yuri Rubinsky, 1952-1996 [Remembering Yuri]." <TAG> 9/2 (February 1996) 1-2. ISSN: 1067-9197.
This tribute is printed in a special issue of <TAG> dedicated to the memory of Yuri Rubinsky. See also the main eulogy collection.
Travis, Brian E. "Using the Application to Render Tables." <TAG>: The SGML Newsletter 8/2 (February 1995) 6-9. ISSN: 1067-9197. Author affiliation: Brian Travis is President of Information Architects, Inc, and Managing Editor of <TAG>.
Travis, Brian E. "Tables, Tables, Tables." <TAG>: The SGML Newsletter 8/2 (February 1995) 1, 8. ISSN: 1067-9197. Author affiliation: Brian Travis is President of Information Architects, Inc, and Managing Editor of <TAG>.
Travis, Brian E. "The SGML Environment." <TAG>: The SGML Newsletter 8/1 (January 1995) 1-8. ISSN: 1067-9197. Author affiliation: Brian Travis is President of Information Architects, Inc, and Managing Editor of <TAG>.
Summary: "The SGML standard defines several pieces, each of which has a very well-defined purpose and structure. Some of these pieces are optional for an implementation, some are absolutely mandatory, and some are just nice to have. This article reviews the parts defined in the standard, and provides guidelines for selecting tools for your implementation."
[CR: 19980508]
Travis, Brian E. "XML: Enabling Technology or Silver Bullet? [Editorial]." <TAG>: The SGML Newsletter 11/4 (April 1998) 1, 3. ISSN: 1067-9197. Author's affiliation: President, Information Architects; Managing Editor, <TAG>.
The author tells an interesting story about some students who came to learn about XML at a recent Documation Conference (bringing with them certain assumptions about what XML was): they stayed for the first hour of a tutorial, which ended with an introduction to the DTD and its role in XML validity; some students disappeared during the break and didn't return. This episode, Travis says, "brings up an interesting issue, and shows that XML might have some of the same barriers to adoption that SGML has had. The main barrier is that darn DTD." The lesson, according to Travis: "Don't oversell. There has been a lot of hype about XML. Along with the hype comes the tendency to overpromise. . . Referring to XML as 'SGML without the tyranny of the DTD' can lead to problems for people who know what SGML is. For people who don't know what SGML. . . requires people to further investigate the real cost of implementing XML in their environment."
[CR: 19950716]
Travis, Brian E; Waldt, Dale; Laplante, Mary. "It's [SGML] Conference Season." <TAG>: The SGML Newsletter 8/4 (April 1995) 1-8, 11. ISSN: 1067-9197.
The multi-part article reports on highlights of three 1995 conferences in terms of SGML news: Documation '95, Folio's Infobase '95, and Seybold Boston.
[CR: 19960716]
Travis, Brian E; Waldt, Dale C. "SGML Europe '96." <TAG>: The SGML Newsletter 9/6 (June 1996) 9-11. ISSN: 1067-9197. Authors' affiliation: [Travis] President of Information Architects, Inc.; [Waldt] Vice President of Product Systems for the Research Institute of America Group.
A report on the SGML Europe '96 conference held in Munich, attended by some 700 people. High points, according to the authors: the concept of "micro-documents", coined by John McFadden of Exoterica, in reference to "a small document that is not a fragment of a larger document or DTD, but rather is a self-contained unit of information that can be managed independently...in a database..."; Synex announced support for multi-byte Japanese characters; Exoterica revealed more about OmniMark Version 3 (NOX); InContext and Folio demonstrated SGML Journal Publisher; SoftQuad demonstrated HoTMetaL 3.0, which has several new features.
Travis, Brian E. "The SGML Implementation Guide Released [Editorial]." <TAG>: The SGML Newsletter 8/5 (May 1995) 1, 10. ISSN: 1067-9197.
The note discusses the purpose of the book review (The SGML Implementation Guide) presented in <TAG> 8/5; see the bibliographic entry.
[CR: 19980229]
Travis, Brian E. "Why Do We Need SGML? [Editorial]" <TAG>: The SGML Newsletter 11/2 (February 1998) 1, 3-4. ISSN: 1067-9197. Author's affiliation: President, Information Architects Inc., and Managing Editor of <TAG>.
Brian Travis discusses the "XML" and "SGML" names in terms of politics and marketplace, referencing companies that are warm/cool to the name "SGML" at the present time.
[CR: 19951015]
Travis, Brian E.; Waldt, Dale C. The SGML Implementation Guide: A Blueprint for SGML Migration. Berlin/New York: Springer-Verlag, 1995. Extent: Approximately 350 pages. ISBN: 0-387-57730-0; 3-540-57730-0.
Author's abstract: This is the book the authors needed when they were first implementing SGML. At that time, and up until now, there has not been a complete source of information for the SGML implementor. We had to perform major research at every single phase of our implementation process using time-honored systems analysis techniques. While this approach worked, we would have gladly embraced any help we could have found.
The philosophy behind this book is to provide a pragmatic working knowledge of SGML and related disciplines and techniques needed to actually achieve a successful implementation.
The book is not a review of products, but it does contain mention of some products as an example of what is available. It is not an executive briefing offering a high-level view of the advantages of implementing a structured approach to data, nor is it a nuts-and-bolts description of how to write SGML applications. Rather, it strikes a ground between those two extremes, offering to the people who must make the decision to implement, then the implementors, enough information to get well down the road to SGML.
See the [provisional] Table of Contents for further overview, and more authoritatively in an updated announcement. The full outline and sample chapters (Chapter 1 and Appendix 6) are accessible via the Web: point your HTTP client at http://www.sgml.com/SGMLImplementationGuide/. Further, see two articles introducing The SGML Implementation Guide: the book as a case study and an editorial note in <TAG>. Dianne Kennedy reviewed the book in <TAG> 8/10 (October 1995) 5-6.
[CR: 19950804]
Triggs, Jeffery. "Varieties of electronic experience: what should an electronic text be like?" Electronic Texts and the Text Encoding Initiative [Special Issue] = TEXT Technology: The Journal of Computer Text Processing 5/3 (Autumn, 1995) 179-189. ISSN: 1053-900X. Author's affiliation: Director of the North American Reading Program, Oxford English Dictionary.
"Triggs shows by a series of examples how easily an electronic text can fall short of realizing the full potential offered by its new medium. He calls to our attention the ways in which preconceived ideas of electronic text as a substitute for printed page can obstruct the goal of multi-purpose plasticity which so attracted us to texts in electronic form in the first place. He also warns us of the dangers of locking away the results of our hard editorial endeavours within a proprietary format, thus limiting its use to particular software systems." [from the issue Introduction, by Lou Burnard]
See the main entry for this special issue of TEXT Technology dedicated to the TEI, edited by Lou Burnard.
[CR: 19950716]
Tritt, Graham. "Starting a[n SGML] User Group." SGML Users' Group Newsletter 29 (November 1994) 5-6. ISSN: 0952-8008. Author's affiliation: Swiss Federal Office of Information Technology; Information and Documentation Center, Steigerhubelstr. 3, 3003 Berne, Switzerland. Tel: +41 31 325 9836; fax: -9767. Email: Graham.Tritt@ste3.bfi.admin.ch.
Based upon experiences with the Swiss SGML Users' Group since 1989, Tritt offers advice to others who may wish to learn from the group's accumulated wisdom.
Tucker, Hugh A.; Bogh, Torkil. SGML & ODA. Standards for Document Processing and Interchange. DS/INF 14, [1989]. Dansk Standardiseringsrad, 1989.
The publication is a book form of the technical report SGML/ODA: Standards for Document Processing and Interchange. See a summary and review of the work in "New Book on SGML and ODA Published. SGML & ODA. Standards for Document Processing and Interchange. DS/INF 14, 1989," <TAG> 12 (December 1989) 17-18.
[CR: 19971227 MD: 19971230]
Tucker, Hugh; Harvey, Betty. "SGML Documentation Objects within the STEP Environment." Pages 205-211 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Hugh Tucker]: Documenta, ApS, Hellerup, Denmark; Phone: +45 39 46 19 05; FAX: +45 39 46 19 08; Email: hugh@documenta.dk; [Betty Harvey]: Electronic Commerce Connection, Inc., Germantown, Maryland USA 20874; Phone: +1 (301) 540-8251; FAX: +1 (301) 540-4268; Email: harvey@eccnet.com; WWW: http://www.eccnet.com.
Abstract: "ISO 10303, Standard Exchange for Product Data (STEP), is being developed by a broad range of industries to provide extensive support for modelling, automated storage schema generation, life-cycle support, plus many more data management facilities. ISO 8879, Standard Generalized Markup Language (SGML), and the SGML family of standards, including HyTime and DSSSL, is used for modelling and encoding the documentation of industrial products, many of which are produced using STEP.
"There are technical differences between the STEP and SGML as well as differences in their application and spheres of enterprise. For example, STEP is used during the early stages of product development, e.g., design, testing, whereas SGML is more commonly applied during the latter processes of a product's life cycle.
"This paper discusses the technical differences and problems between the two technologies and outlines some of the identified requirements needed to harmonize the two types of data. An approach based on information objects is presented showing how SGML product documentation information can be incorporated and stored together with STEP information. Using an information object methodology could allow textual data such as designer's and testing notes, method annotations, comments, etc. produced during the beginning of the product development cycle to be associated and archived with the actual design models.
"The definition of an information object is discussed and the distinction is drawn between a perceptual documentation object type and the conceptual information object type needed in modelling STEP data. Implementation suggestions are made along with the practical requirements needed to make information objects effective and useful.
"The STEP standard task group, Product Documentation (ISO 184/SC4/WG3/T14) is currently tasked with the responsibility for creating a methodology for the cooperation of the STEP and SGML standards. Information will be provided about how current corporate initiatives could impact and provide pertinent input in the T14 Working Group."
This paper was delivered as part of the "User" track in the SGML/XML '97 Conference.
For more information on STEP, see the dedicated database entry SGML and STEP (ISO 10303 Standard for the Exchange of Product Data), and the STEP/SGML reference page from ECCNet.
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).
[CR: 19971125]
Tucker, Hugh; Harvey, Betty. "STEP/SGML Standards Working Together." Page(s) 39-42 in SGML '97 Conference Proceedings. SGML Europe '97. "The Next Decade - Pushing the Envelope." Princesa Sofia Intercontinental, Barcelona, Spain. 11-15 May, 1997. Sponsored by Graphic Communications Association (GCA) and SGML Open. Conference Chair: Pamela L. Gennusa (Director, Database Publishing Systems Ltd). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 342 pages, CDROM. Author's affiliation: [Hugh Tucker]: Director, Documenta ApS, Hellerup, Denmark; Email: hugh@documenta.dk; [Betty Harvey]: President, Electronic Commerce Connection, Inc., USA; Email: harvey@eccnet.com; WWW: http://www.eccnet.com.
Abstract: "ISO 10303, Standard Exchange for Product Data (STEP), is being developed by a broad range of industries to provide extensive support for modeling, automated storage schema generation, life-cycle support, plus many more data management facilities. ISO 8879, Standard Generalized Markup Language (SGML), and the SGML family of standards, including HyTime (Hypermedia-Time-based Structuring Language, ISO 10744) and DSSSL (Document Style Semantics and Specification Language), is used for the documentation of products. These two standards, STEP and SGML, are used in the same industries and companies. STEP is used during product development and manufacturing, whereas SGML products are usually created during the final processes of product development.
"This paper will discuss current initiatives in industry and government organizations for incorporating SGML product information during the beginning of the product development cycle. Several different initiatives from various corporations will be discussed. The benefits of each of the different methodologies will be discussed and analyzed.
"The STEP standard task group, Product Documentation (ISO 184/SC4/WG3/T14) is currently tasked with the responsibility for creating a methodology for the cooperation of the STEP and SGML standards. Information will be provided about how current corporate initiatives could impact and provide pertinent input in the T14 Working Group."
Note: The electronic conference proceedings in hypertext were produced by Inso Corporation (DynaText) and by High Text (EnLIGHTeN). Information about the SGML Europe '97 Conference may be found in the main database entry.
[CR: 19980906]
Tuong Dao. "An Indexing Model for Structured Documents to Support Queries on Content, Structure and Attributes." Pages 88-97 (with 18 references) in Proceedings of the IEEE International Forum on Research and Technology Advances in Digital Libraries - ADL 1998. [Fifth] Forum on Research and Technology Advances in Digital Libraries - ADL'98. Santa Barbara, CA. April 22-24, 1998. Sponsored by IEEE Computer Society Technical Committee on Digital Libraries, Library of Congress, Alexandria Digital Library, NASA Goddard Space Flight Center, National Library of Medicine, IBM, etc. Los Alamitos, California: IEEE Computer Society Press, 1998. ISBN: 0818684666. Author's affiliation: Department of Computer Science, Royal Melbourne Institute of Technology (RMIT), Melbourne, Australia; Email: tuong@kbs.citri.edu.au.
Abstract: "The complex internal structure of documents can be described and captured by documentation representation standards such as SGML and SGML related standards like HTML and XML. The hierarchical structure of documents and the attributes of documents as well as attributes of document components at all levels of the document hierarchy can be encoded with markup tags. In traditional text database systems, only queries on content are supported. The rich structural information contained in documents and the attributes of document components are not captured in these systems, and queries on structure and attributes are not supported. We propose a text model, a query language and an indexing scheme which can support queries on content, structure, and attributes of documents as well as attributes of text elements within documents. This model is schema-independent, and query evaluation time is at worst linear. We show that our indexing scheme can efficiently support a wide range of queries in a database for highly heterogeneous collections of structured documents. We provide query examples to show how all the information encoded in documents marked up according to the TEI Guidelines, an encoding standard adopted by the humanities disciplines, can be indexed and queried in our indexing model."
IEEE Computer Society Press Order Number PR08464.
Related papers: "Indexing Structured Text for Queries on Containment Relationships", ACSC '96, Nineteenth Australasian Computer Science Conference, Melbourne, January/February 1996. Or: "Indexing Documents for Queries on Structure, Content, and Attributes," by Ron Sacks-Davis, Tuong Dao, James Thom and Justin Zobel (RMIT); Friday, November 28, 1997, International Symposium on Digital Media Information Base (DMIB'97).
[CR: 19960402]
Tuong Dao; Sacks-Davis, Ron; Thom, James. "Indexing Structured Text for Queries on Containment Relationships." Australian Computer Science Communications 18/2 (1996) 82-91 (with 12 references). ISSN: [?]. Author's affiliation: Department of Computer Science, RMIT.
"Abstract: Documents consist of logical components such as titles and paragraphs. The complexity of the structure of documents is captured by document representation standards such as SGML. The GCL (Generalized Concordance Lists) query language has been proposed for collections of structured documents such as SGML documents. It uses containment relationships to provide a simple and effective way to formulate traditional boolean queries as well as queries specifying document structure and provides the flexibility to access, within the same database, documents which conform to multiple hierarchical structures and have different markup schemas. GCL also allows the retrieval of structural elements at any level of the document structure. The flexibility allowed by the language and its implementation comes with a significant restriction: no recursive structures are allowed. However such structures are present in many SGML documents where components are defined recursively. The paper proposes to extend GCL to allow recursive structures. An implementation framework based on an interval indexing scheme is provided to demonstrate that only small extensions are required to support recursive structures."
Based upon a paper presented at ADC '96, Seventh Australasian Database Conference, Melbourne, Victoria, Australia, 29-30 January 1996.
See alternate entry.
[CR: 19960125]
Tuong Dao; Sacks-Davis, Ron; Thom, James. Indexing Structured Text for Queries on Containment Relationships. Paper to be presented at the 7th Australasian Database Conference in January [29-30] 1996. Melbourne, Australia: Department of Computer Science, RMIT, 1996. Extent: 10 pages, 12 references. Authors' affiliation: Department of Computer Science, RMIT.
"Abstract: Documents consist of logical components such as titles and paragraphs. The complexity of the structure of documents is captured by document representation standards such as SGML. The GCL (Generalized Concordance Lists) query language has been proposed for collections of structured documents such as SGML documents. It uses containment relationships to provide a simple and effective way to formulate traditional boolean queries as well as queries specifying document structure and provides the flexibility to access, within the same database, documents which conform to multiple hierarchical structures and have different markup schemas. GCL also allows the retrieval of structural elements at any level of the document structure."
"The flexibility allowed by the language and its implementation comes with a significant restriction: no recursive structures are allowed. However such structures are present in many SGML documents where components are defined recursively. This paper proposes to extend GCL to allow recursive structures. An implementation framework, based on an interval indexing scheme, is provided to demonstrate that only small extensions are required to support recursive structures." [abstract supplied by the authors]
Available in Postscript format: ftp://phobos.kbs.citri.edu.au/pub/tuong/adcpaper.ps.Z [mirror copy]. The paper will appear in the Proceedings of the 7th Australasian Database Conference, Melbourne, Australia. See alternate entry.
[CR: 19951220]
Turner, Linda. How People are Approaching Business Cases for SGML. Avalanche White Paper. Boulder, CO: Avalanche Development Company/Interleaf Inc., 1993. Extent: approximately 5 pages.
"The implementation of SGML is a strategic move that gives companies a competitive advantage, because they are able to take control of their critical information resources. Considering this factor alone, companies seem to feel that the long-term, intangible benefits outweigh the dollars spent on its implementation in the front-end. Studies of SGML implementors have taught us this: Organizations who are already implementing SGML today feel that they are ahead of the game, and that other organizations who value their information resources will sooner or later turn to SGML, unless they want to fall out of the competition." [extracted]
Available online in HTML format from the Avalanche WWW server; [mirror copy].
Turner, Ron; Douglass, Tim; Turner, Audrey. README FIRST: SGML for Writers and Editors. Charles F. Goldfarb Series On Open Information Management. Englewood Cliffs, NJ: PTR Prentice Hall, [forthcoming May] 1995. ISBN: 0-13-432717-9.
Summary: "This is a non-technical introduction to SGML for writers and editors who need to work in an SGML environment. The focus is not on the technical details of the standard but rather on how writers and editors can benefit from and work effectively with SGML. Included with the book is a diskette that contains SGMLAB, a DOS-based SGML application that includes a parser and browser and numerous sample SGML documents. Using SGMLAB, readers can view on-line both the structure and output of SGML documents, and validate those documents". [publisher's pre-publication description]
See a review of the book in <TAG> by Simon Wickes. See also the review in Seybold Report on Publishing Systems 25/9 (January 29, 1996) 42, and a review by Lynne Price. Also by Ron Burk. A fuller description from the publisher is also online. See also the "Prentice-Hall SGML Series" web page.
[CR: 19961112]
[Seybold Staff.] "Read It If You Must; Avoid It If You Can. [Seybold Report Review of] Turner, Ronald, README FIRST: SGML for Writers and Editors." Seybold Report on Publishing Systems 25/9 (January 29, 1996) 42. ISSN: 0736-7260.
The review of Turner's book README FIRST: SGML for Writers and Editors is generally unfavorable, at least with respect to prospective SRPS readership: ". . . this disappointing collaborative effort might be more appropriately subtitled 'Teaching Yourself to Read SGML'." See the review article on the Seybold WWW server.
United States Department of Energy. Office of Administration and Management. Office of Information Resources Management. Office of Scientific and Technical Information. Electronic Exchange of Scientific and Technical Information (STI) Strategic Plan. DOE/OSTI Report. Oak Ridge, TN: OSTI, Scientific and Technical Information Services Division, January 1993. Extent: approximately 16 pages.
"On August 28, 1991, a memo from R. S. Barrow, Director of the Office of IRM Policy, Plans, and Oversight (AD-24), announced that the Standard Generalized Markup Language (SGML) as defined in Federal Information Processing Standard (FIPS) 152 is adopted as the DOE standard for accomplishing this electronic exchange. The Office of Scientific and Technical Information (OSTI) (AD-21) was given the overall responsibility for managing the adoption and transition to the use of SGML for scientific and technical documents. SGML, along with other standards such as the Government Open Systems Interconnection Profile (GOSIP), will provide a common standard for electronic document processing and exchange. . . This initiative will emphasize and enhance full life-cycle management of STI. Electronic exchange of STI will benefit users, generators, and managers of information. The full realization of the implementation of SGML will facilitate interchange among the members of the scientific and technical community by providing increased versatility of information (new ways to use information), encouraging multiple uses of information, stimulating increased use of STI, and providing more flexible access to many types of information. The potential for global electronic interchange of STI and the focus on content rather than format is expected to revolutionize the use of information." [from the document Introduction]
Available from the DOE/OSTI WWW server as ELECTRONIC EXCHANGE OF SCIENTIFIC AND TECHNICAL INFORMATION (STI) [or in mirror copy here].
United States Department of Energy. Office of Scientific and Technical Information. Guide for Transmitting Standard Generalized Markup Language (SGML) Encoded Bibliographic Records. DOE/OSTI 11865. Oak Ridge, TN: DOE/OSTI, March 1995.
"This document provides the guidance necessary to transmit to the Department of Energy Office of Scientific and Technical Information (OSTI) an encoded bibliographic record that conforms to International Standard ISO 8879, Information Processing - Text and office systems - Standard Generalized Markup Language (SGML). Included in this document are element and attribute tag definitions, sample bibliographic records, the bibliographic document type definition, and instructions on how to transmit a bibliographic record electronically to OSTI." [from the Introduction]
Apparently still under development [June 1995]. The DTD and documentation are online. Available as a series of HTML documents from the DOE/OSTI WWW Server. See also the main bibliography page.
[CR: 19971227]
Usdin, B. Tommie. "View from the Chair." Pages 7-10 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [B. Tommie Usdin]: President, Mulberry Technologies, Inc., 17 West Jefferson Street, Suite 207, Rockville, Maryland 20850 USA; Phone: +1 301/315-9634; FAX: +1 301/315-8285; Email: btusdin@mulberrytech.com; WWW: http://www.mulberrytech.com/
Summary: Usdin's presentation was in the form of a "Welcome to SGML/XML'97." Usdin reflected on the previous SGML conferences, where both HTML and XML have been important: "We have been discussing HTML at SGML conferences since 1994. XML was publicly introduced at SGML'96, and is in the name of the conference (SGML/XML '97) in 1997."
[Excerpted:] "I'm not sure how we as a community will feel about XML by the end of this week, but coming into SGML/XML '97 I detect an attitude as different from our 1994 attitude on HTML as day is from night. We don't despise XML, we worship it. We aren't worried about the threat XML poses to SGML, we worry about the threat SGML poses to XML. We remove the word SGML from our marketing materials, our web sites, and our products. The dirty little secret is that underneath our new XML toys lies an SGML core. Shhhh. Don't tell anyone.
"There are vendors who want to be associated with XML but not SGML. They want to sponsor XML events but to avoid having their names sullied by an association with SGML. The revisionist historians are claiming five-year old SGML projects as XML experience. We've gone completely nuts over XML. At least, some of the noisiest of us have. Well, we certainly are a moody group, aren't we. It seems to me that the 1994 hysteria was unreasonable; HTML did not destroy SGML. It seems to me that the 1997 hysteria is equally unreasonable; SGML will not destroy XML.
"The relationship between SGML and XML is deep and complex. XML is SGML. And it can be because SGML has grown quickly and significantly in order to accommodate the requirements of XML. In the process, SGML has been improved for all applications, not just for XML. SGML has given XML a rational structure and discipline within which to grow and an installed base of users and tools; XML has given SGML momentum and visibility."
This paper was delivered as part of the "Introductions" track in the SGML/XML '97 Conference.
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).
[CR: 19950716]
Usdin, Tommie; Rubinsky, Yuri. "The SGML Year in Review - 1993." <TAG> 7/1 (January 1994) 6-13. ISSN: 1067-9197.
Detailed report on SGML events in 1993. See the pointers to online copies of the report and print copies in other publications.
[CR: 19950716]
Usdin, Tommie; Rubinsky, Yuri. "The SGML Year in Review 1994." SGML Users' Group Newsletter 29 (November 1994) 3-5. ISSN: 0952-8008.
Detailed report on SGML events in 1994. See the pointers to online copies of the report and print copies in other publications.
[CR: 19970909]
Vacca, Dick. "Planning for Document Management - How to Get Started." The Gilbane Report on Open Information & Document Systems 4/1 (March - April 1996) 1-23. ISSN: 1067-8719. Author's affiliation: University of Wisconsin.
A comprehensive article on issues and concerns in planning a document management implementation. SGML is discussed at several points, including DTD design (or its equivalent) and document conversion.
[CR: 19971202]
van Dam, Andy. "Looking Back Thirty Years and Forward Three: Critical Themes in the Development of the Electronic Book [Opening Keynote Address]." Pages [?] in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Author's affiliation: Brown University.
See the main database entry for additional information about the conference, or the Brown University web site.
van Dam, L.; van Loenen, E. A Programmer's Interface to SGML. Technical Report. Geneva: CERN, 1989.
[CR: 19971202]
van den Hout, Erik. "Independent Links - A Maintenance Advantage?" Pages 79-83 in TEI 10: A Conference in Celebration of the Tenth Anniversary of the Text Encoding Initiative. Abstracts. TEI 10: Text Encoding Initiative, Tenth Anniversary User Conference, Brown University, Providence, Rhode Island. November 14-16, 1997. Sponsored by Martin Hensel Corporation, Kluwer Academic Publishers, and MIT Press. Hosted by Brown University Library, and Computing and Information Services. Providence, RI: Brown University, 1997. Author's affiliation: Groningen University; Email: E.H.M.van.den.Hout@Let.RuG.NL; WWW: http://thok.let.rug.nl/evdh/.
Summary: "This paper will focus on so-called independent links (ilinks). These are links whose definitions are located separately from the document in which their link-ends reside. Use of independent links is attractive, because they might provide solutions to current maintenance problems. Additionally, some of their problems will be described. And finally, a solution for maintaining reliable links in complex, evolving documents might be found in independent links. The crucial issues related to maintenance difficulties will be the focus. The central thesis of this paper is that independent links can improve the maintenance of a hypertext, and therefore its reliability over time."
The extended abstract for the document is available online: http://www.stg.brown.edu/webs/tei10/tei10.papers/vandenhout.html; [local archive copy]. See the main database entry for additional information about the conference, or the Brown University web site.
[CR: 19950716]
van Kirk, Doug. "Getting a Grip on Unstructured Data." InfoWorld 17/21 (May 22, 1995) 51, 54.
The author discusses the role of SGML in structuring data as the notion of "corporate document" (as a central metaphor for information locus) becomes stronger within the business community.
Vanoirbeek, Christine. "Formatting Structured Tables." Pages 291-309 (with 21 references) in EP [Electronic Publishing] 92: Proceedings of Electronic Publishing, 1992 International Conference on Electronic Publishing, Document Manipulation, and Typography. Swiss Federal Institute of Technology, Lausanne, Switzerland. April 7-10, 1992. Sponsored by the Swiss Federal Institute of Technology and the Swiss National Science Foundation. Edited by Christine Vanoirbeek and Giovanni Coray [EPF, Lausanne, Switzerland]. The Cambridge Series on Electronic Publishing. Cambridge: Cambridge University Press, 1992. ISBN: 0-521-43277-4. Author's affiliation: Swiss Federal Institute of Technology, Lausanne, Switzerland.
Abstract: The objective of this paper is to analyse the problem of integrating tables with structured documents. After specific problems related to both editing and formatting activities have been analysed, an overview of different existing approaches is given. The paper emphasizes some shortcomings of the usual table representations. It describes a new approach based on the distinction of logical and physical structure and argues for a multi-dimensional representation that properly integrates tables in a logical document structure. It concludes with the description of a prototype that implements these ideas.
[CR: 19970826]
Veen, Jeffrey. "XML: Metadata for the Rest of Us [Part 1]; XML: Roll Your Own Markup [Part 2]." Wired News [Technology] [?]/[?] (July 8 and July 14, 1997).
A two-part article on XML consisting of an interview with Tim Bray, one of the chief architects of the XML (Extensible Markup Language) standard.
Summary: [Part 1]: "What if you could merge the simplicity of HTML with the flexibility of Standard Generalized Markup Language (SGML)?" [Part 2]: "This week, we talk about some of the underlying workings of XML and take a look at some practical applications."
Available in HTML format: Part 1 online, and Part 2 online. [part 1 archive copy], [part 2 archive copy]
[CR: 19971227]
Vercio, LtCol Carl F. "Implementing SGML in the Office of the Secretary of Defense (OSD)." Pages 581-584 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [LtCol Carl F. Vercio]: Directorate Information Technology Officer, Directorate for Correspondence and Directives, Washington Headquarters Services, Pentagon, Washington, DC 20310; Phone: +1 (703) 697-9285; DSN 227-9285; FAX: +1 (703) 695-1219; DSN 225-1219; Email: vercio@osd.pentagon.mil.
Abstract: "The Office of the Secretary of Defense (OSD) recently adopted SGML as the standard to create a non-proprietary publishing database to produce policy and procedure documents for dissemination on the World Wide Web (WWW).
"From April 1995 to November 1996, a complete SGML subsystem was developed from a detailed library analysis through the development of DTDs and style sheets, to active production of new and revised documents. After conducting a market survey in June 1996, SoftQuad's Panorama PRO was selected to post the documents to the WWW because no conversion to HTML was necessary. Style sheets were developed and in November 1996 the first SGML-tagged DoD issuances were placed on the WWW.
"A multi-site production process has grown from these humble beginnings and is being enhanced to include electronic coordination with digital signatures and integration with the DoD electronic forms library. Along the way, many lessons were learned, that can be shared with newcomers to SGML, to make the transition to SGML easier for those who contemplate starting an SGML project."
This paper was delivered as part of the "Case Studies" track in the SGML/XML '97 Conference.
Note: The SGML/XML '97 conference proceedings volume is available from the Graphic Communications Association, 100 Daingerfield Road, Alexandria, VA 22314-2888; Tel: +1 (703) 519-8160; FAX: +1 (703) 548-2867. Complete information about the conference (e.g., program listing, tutorials, show guide, DTDs, conference reports) is provided in the dedicated conference section of the SGML/XML Web Page and via the GCA Web server. The electronic proceedings on CDROM was produced courtesy of Jouve Data Management (Jouve PubUser).
[CR: 19980931]
Vercoustre, Anne-Marie; Paradis, François. A Descriptive Language for Information Object Reuse through Virtual Documents. Paper presented at the 4th International Conference on Object-Oriented Information Systems (OOIS '97). Victoria, Australia, [November] 1997. Authors' affiliation: Commonwealth Scientific and Industrial Research Organisation (CSIRO) Mathematical and Information Systems, Australia.
Abstract: "The importance of reuse is well recognised for electronic document writing. However, it is rarely achieved satisfactorily because of the complexity of the task: integrating different formats, handling updates of information, addressing document author's need for intuitiveness and simplicity, etc. In this paper, we present a language for information reuse that allows users to write virtual documents, where dynamic information objects can be retrieved from various sources, transformed, and included along with static information in SGML documents. The language uses a treelike structure for the representation of information objects, and allows querying without a complete knowledge of the structure or the types of information. The data structures and the syntax of the language are presented through an example application. A major strength of our approach is to treat the document as a non-monolithic set of reusable information objects."
The document is online: "A Descriptive Language for Information Object Reuse through Virtual Documents." [check re: archive copy]
See: "Reuse of Linked Documents through Virtual Document Prescriptions." By Anne-Marie Vercoustre and François Paradis [INRIA (France) and CSIRO (Australia)]. Pages 499-512 in Electronic Publishing, Artistic Imaging, and Digital Typography. Proceedings of the 7th International Conference on Electronic Publishing (EP '98), Held Jointly with the 4th International Conference on Raster Imaging and Digital Typography (RIDT '98). Saint Malo, France, March 30 - April 3, 1998. Edited by Roger D. Hersch, Jacques André, and Heather Brown. New York/Berlin/Heidelberg: Springer-Verlag, 1998. See also "A Virtual Document Approach for Reusing SGML/XML Information Objects," by François Paradis, Anne-Marie Vercoustre, and Brendan Hills; Paper presented at the SGML/XML Asia Pacific, Sydney, Australia, 22-24 September, 1997. See also: Publications related to RIO - "Reuse of Information Objects through virtual documents", and the RIO Home Page.
[CR: 19980907]
Vercoustre, Anne-Marie; Paradis, François. "Reuse of Linked Documents through Virtual Document Prescriptions." Pages 499-512 (with 24 references) in Electronic Publishing, Artistic Imaging, and Digital Typography. Proceedings of the 7th International Conference on Electronic Publishing (EP '98), Held Jointly with the 4th International Conference on Raster Imaging and Digital Typography (RIDT '98). EP '98 and RIDT '98, Saint Malo, France. March 30 - April 3, 1998. Edited by Roger D. Hersch, Jacques André, and Heather Brown. Lecture Notes in Computer Science Series, Number 1375. New York/Berlin/Heidelberg: Springer-Verlag, 1998. ISBN: 3-540-64298-6. Authors' affiliation: Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia and Institut national de recherche en informatique et en automatique (INRIA), Le Chesnay, France.
Abstract: "As the WWW becomes a major source of information, a lot of interest has arisen, not only for searching for information, but for reusing this information in new pages, or directly within applications. Unfortunately HTML tags do not provide a significant level of structure for identifying and extracting information, since they are mostly used for presentation issues. Moreover the simple link mechanism of the Web does not support the controlled traversal of links to related pages. Particularly promising is the proposal for a new standard, XML, which could bring the power of SGML to the Web while keeping the simplicity of HTML. We present a system and a language that allow reusing of information from various sources, including databases and SGML-like documents, by combining it dynamically to produce a virtual document. The language uses a tree like structure for the representation of the information objects as well as link objects. The paper focuses on the selection and the traversal of XML links to extract information from linked pages. The strength of our approach is to be an SGML-compliant solution, which makes it ready to take full advantage of XML for reusing information from the Web as soon as it is widely used."
"Reusing information contained in electronic documents is becoming a major issue, whether it is proprietary information or information available from the Internet. In this paper we have presented a language for reusing information objects from heterogeneous sources, including SGML, XML, and HTML documents. Our approach is to use a middleware format to integrate the results of queries from the various sources and to map them into a new (virtual) document. We have demonstrated more specifically how to use the language for following XML links and to control the traversal of links using their types and properties. Since our solution is generic and fully SGML compatible we are ready to benefit from the intelligence that XML, or any HTML extension, will bring to the Web for supporting the extraction and reuse of information. The approach is currently being implemented using Java for the interpreter and the database server; our prototypal application is a virtual document prescription for generating activity reports that reuse information from our Intranet, an SQL database of staff and an OO database of documents. Other potential applications are: flexible and manageable generation of large documentation, or configuration of Intranet servers. Further extensions to the language would include control instructions to make the virtual document more adaptable to the actual results of queries, and explicit instructions for building a set of related pages."
Send email to François Paradis to request a paper version or electronic copy. See also: Slides, "Reuse of Linked Documents through Virtual Document Prescriptions." Or: the online presentation abstract, and the full text in PDF; [local archive copy]
[CR: 19961012]
Vercoustre, Anne-Marie; Quint, Vincent; Paoli, Jean; Vatton, Irène. Turning an Authoring Tool Wired to the Web into a Browser. Paper presented at the AUUG '95 & Asia-Pacific WWW '95 Conference, September 18-21, 1995 [Proceedings, 95-104]. INRIA Rocquencourt: [copyright by] AUUG95 and APWWW95, October 1995. Extent: approximately 13 pages, 15 references. ISBN: 1-875781-43-9. Authors' affiliation: Grif/INRIA. WWW: Anne-Marie Vercoustre Home Page.
Abstract: "The success of the WWW came with browsers such as NCSA Mosaic that provide a user-friendly graphic interface for accessing information on the Internet. For displaying documents, all browsers parse the HTML files they receive from servers and interpret the HTML tags they contain. Accessing documents is done through link activation or by typing URL's. These features are at the core of any browser. Symposia is an SGML-based WYSIWYG authoring system that has been the first editor wired to the Web. We study in this paper how Symposia has been turned into a browser by taking advantage of its generic structured approach and its extensibility through the GATE API. More advanced features for browsing are also discussed and suggested."
Available on the Internet: http://www.csu.edu.au/special/conference/apwww95/papers95/avercous/avercous.html; [mirror copy].
[CR: 19960907]
Vercoustre, Anne-Marie; Lindley, Craig A. Information Retrieval and Link Authoring in an SGML-Based Editor. INRIA Report RR-2591. Rocquencourt: INRIA, June 1995. Extent: approximately 14 pages. ISSN: 0249-6399. Authors' affiliation: INRIA, Domaine de Voluceau-Rocquencourt, B.P.105 78153 Le Chesnay Cedex. Email: Anne-Marie.Vercoustre@inria.fr. WWW: Anne-Marie Vercoustre Home Page.
Abstract: "This document describes the integration of Grif, an SGML editor developed at INRIA and marketed by Grif SA, with Sigma, a text retrieval tool developed by CSIRO. The integration provides Grif with flexible search and dynamic hypertext linking functions, and enhances the Sigma system to support search and display of SGML documents using a structured editor. The integration also clarifies the requirements for more generic facilities for document search, linking, and indexing for the respective systems as modular components of an open systems environment."
Available in PostScript format via the Internet: http://pauillac.inria.fr/~vercous/DOCS/Grif-Sigma.ps; [mirror copy].
Vignaud, Dominique. L'édition structurée des documents: SGML application à l'édition française. Paris: Éditions du Cercle de la Librairie, 1989. ISBN: 2-7654-0420-8.
This volume was prepared to assist French publishers in applying the SGML standard. It supplies a basic DTD, and additional materials (including electronic files) are available for extending the DTD. The book is said to be the first volume in a series, L'édition structurée des documents, published by Éditions du Cercle de la Librairie. For availability, contact the Syndicat national de l'édition (SNE) or: Éditions du Cercle de la Librairie, 35 rue Grégoire-de-Tours, 75006 Paris, France. Additional details: see "SGML: application à l'édition française," SGML Users' Group Newsletter 13 (August 1989) 9; Yuri Rubinsky's brief review, "Can Imaginative Objects Have Intentions?" <TAG> 10 (July 1989) 11; or "French Book DTD Available," <TAG> 9 (March/April 1989) 15. The book is similar in purpose to the American (EPSIG/AAP) volume "Standard for Electronic Manuscript Preparation and Markup" published by NISO, and to the British volumes written by Joan Smith: Smith and Smith. Whereas the EPSIG/AAP standard for electronic publishing defined some 220 tags, Vignaud's DTD deliberately defines only 60 tags.
[CR: 19971227 MD: 19971229]
Vijghen, Philippe. "Experience of EDI for Documents: The Role of SGML." Pages 213-218 in SGML/XML '97 Conference Proceedings. SGML/XML '97. "SGML is Alive, Growing, Evolving!" The Washington Sheraton Hotel, Washington, D.C., USA. December 7 - 12, 1997. Sponsored by the Graphic Communications Association (GCA) and Co-sponsored by SGML Open. Conference Chairs: Tommie Usdin (Chair, Mulberry Technologies), Debbie Lapeyre (Co-Chair, Mulberry Technologies); Michael Sperberg-McQueen (Co-Chair, University of Illinois). Alexandria, VA: Graphic Communications Association (GCA), 1997. Extent: 691 pages, CDROM; print volume contains author and title indexes, keyword and acronym lists. Author's affiliation: [Philippe Vijghen]: Project Manager, SGML Technologies Group, ACSE sa/nv, Boulevard Général Wahis, 29, B-1030 Brussels, Belgium; Phone: +32 (2) 705.70.21; FAX: +32 (2) 705.81.01; Email: phv@acse.be WWW: http://www.sgmltech.com.
Abstract: "This paper describes the use of SGML in the EDIDOC project for the European Space Agency. The project involved the creation of a flexible framework for exchanging different types of documents, being a gateway for workflow, document conversions, security, and communication. It is used for calls for tenders, working documents, and press releases, and also covers WWW publication.
"SGML was used for m
|