The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Last modified: March 18, 2002
OpenText.org Papyrus Encoding Markup

[March 18, 2002] OpenText.org is "a web-based initiative to provide annotated Greek texts and tools for their analysis. The project aims both to serve, and to collaborate with, the scholarly community. Building upon the legacy of earlier projects that have developed grammatically tagged texts of the Greek New Testament and tools for their analysis, OpenText.org seeks to extend the levels and detail of linguistic and literary features available for searching and analysis. The project utilizes XML (the Extensible Markup Language) and related standards for the markup and processing of textual resources. The Internet provides the medium for both the development and utilization of these resources, because it allows the rapid development of new tools and facilitates the collaborative effort of project participants."

Annotation Specifications: "Following the model utilized by the W3C (World Wide Web Consortium), OpenText.org is developing a series of specifications for the annotation of Hellenistic Greek texts. Each specification covers a specific type of annotation, such as the annotation of word groups, clauses and paragraphs or the marking of textual variants. The specification documents follow a common format, providing an introduction to the scope of the annotation, definitions of linguistic and technical terms, an explanation of the features marked and details of the XML elements and attributes suggested for this purpose. The XML Document Type Definition for each kind of annotation is included at the end of the specification."

Abstract from the document on 'Character Level (Diplomatic) Papyrus Encoding': "There are a series of levels of annotation involved in the encoding of papyrus documents, from character level details through to editorial decisions regarding variant interpretations and readings. These different levels are associated with different text editions, diplomatic (character or base level), reconstructed (word divisions, accents and expansions) and reading (variant readings and interpretations). This document outlines the base diplomatic level for this annotation, which takes place at the character level. It also describes the XML elements and attributes used for this annotation..."

"Developing an XML Encoding Specification for Papyrological Analysis." By Matthew Brook O'Donnell. February 20, 2001. ['This article discusses the advantages of a machine-readable encoded edition of a papyrus manuscript over a traditional printed edition. The key issue in both mediums is the question of representation. The OpenText.org approach to this question is explained, and the initial development of encoding scheme in XML.'] "... The advantage of a machine-readable edition, and particularly one encoded in a markup language such as XML, is that it allows for the separation of encoding and display/rendering. In a printed edition the visual display format confines the encoding of data from and about the manuscript. Sub/super linear markings and notes are the main ways of including additional information to the basic character and word data. A change in either the encoding or the display requires the construction of a new edition. With an electronic text it is possible to display the encoded data in many formats, that is, one encoding resulting in many different views... The potential views of an encoded text are limited only by the quality and amount of information encoded in the base text. For example, line divisions can only be displayed if the character positions at which the breaks occur are noted in the base text. The issue of how textual and manuscript data should be represented in an encoding scheme is, therefore, of prime importance. The work of the Text Encoding Initiative (TEI) represents a considerable and highly significant advance in this area. The TEI guidelines include a number and variety of tags and suggestions for dealing with primary texts, missing characters and words, abbreviations, lacunae, the physical characteristics of a manuscript, and the like. This flexibility and the recognized status of the TEI make it an attractive candidate for encoding papyrus manuscripts, such as P.Oxy. 119. 
However, as an initial proposal OpenText.org has opted to develop a domain-specific XML encoding scheme, making use of recent XML linking technologies (XLink and XPointer). A printed edition of a manuscript encodes all levels of annotation in one text--character marking for clear, uncertain, illegible and missing letters, the reconstruction of uncertain and missing letters, word divisions, spelling correction and standardization, the addition of accents, and so on. In contrast, the OpenText.org papyrus annotation model aims to separate the annotation of manuscript data and editorial comment and amendment into distinct levels. The current proposal suggests three levels associated with the three editions..."
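
The idea of "one encoding resulting in many different views", together with the separation of manuscript data from editorial word division, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the element names (`base`, `line`, `c`, `words`, `w`), the `status` values, and the `from`/`to` attributes are hypothetical and are not taken from the OpenText.org DTDs, and the plain ID references stand in for the XLink/XPointer links the proposal mentions. The Greek characters are an invented sample, not a transcription of any papyrus.

```python
import xml.etree.ElementTree as ET

# Hypothetical base (diplomatic) level: one element per character position,
# grouped by manuscript line. No word divisions are recorded here.
BASE_XML = """
<base>
  <line n="1">
    <c id="c1" status="clear">θ</c>
    <c id="c2" status="clear">ε</c>
    <c id="c3" status="uncertain">ω</c>
    <c id="c4" status="clear">ν</c>
    <c id="c5" status="clear">θ</c>
    <c id="c6" status="clear">ε</c>
  </line>
  <line n="2">
    <c id="c7" status="clear">ω</c>
    <c id="c8" status="missing"/>
    <c id="c9" status="clear">ι</c>
  </line>
</base>
"""

# Hypothetical editorial level kept apart from the base text: each word
# points at a range of character ids instead of copying the characters.
WORDS_XML = """
<words>
  <w id="w1" from="c1" to="c4"/>
  <w id="w2" from="c5" to="c9"/>
</words>
"""


def render_char(c, mark_uncertain):
    """Render one character element as display text."""
    status = c.get("status")
    if status == "missing":
        return "[.]"  # placeholder for a lost letter
    text = c.text or ""
    if status == "uncertain" and mark_uncertain:
        return text + "\u0323"  # combining dot below
    return text


def diplomatic_view(base):
    """One string per manuscript line, with uncertain letters dotted."""
    return [
        "".join(render_char(c, True) for c in line.findall("c"))
        for line in base.findall("line")
    ]


def word_view(base, words):
    """One string per word, resolved from the character ranges."""
    chars = list(base.iter("c"))  # document order across all lines
    position = {c.get("id"): i for i, c in enumerate(chars)}
    result = []
    for w in words.findall("w"):
        start, end = position[w.get("from")], position[w.get("to")]
        result.append(
            "".join(render_char(c, False) for c in chars[start:end + 1])
        )
    return result


base = ET.fromstring(BASE_XML)
words = ET.fromstring(WORDS_XML)
print(diplomatic_view(base))
print(word_view(base, words))
```

The line-by-line view is only possible because the base text records where each line begins and ends, which is the point made in the excerpt about line divisions. The word view comes from a separate layer that can be revised, or replaced by a competing editorial reading, without touching the character data.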

Document URI: http://xml.coverpages.org/openTextORG.html
Robin Cover, Editor: robin@oasis-open.org