Registries and Repositories - XML/SGML Name Registration

[CR: 20030113]

[See also the slightly more maintained document XML Registry and Repository.]

The development of new vocabularies and the design of "namespace" syntaxes have increased public interest in registration authorities and authentication services which could be set up to manage name conflicts. Facilities are needed for support of globally-unique names, persistent links and resources, name (public identifier) resolution, mapping between public and system identifiers, etc. Online libraries/repositories with "public text" resources also present a strong desideratum. Several initiatives for registries and repositories have been announced. A few of the early initiatives which have been publicized are cited below, along with pointers to the SGML resources.

Note/Caution: This document contains information cribbed from all over the Net and thrown together into a crude outline. Just 'unedited notes'. Initially, it served as the basis for a report delivered to the OASIS Technical Committee, Fall 1998. Anyone who reads it might get a reasonable overview of what people were thinking about in terms of [XML] "registries" in about November 1998. This is not a maintained document. It is posted unofficially for use by the OASIS Registry and Repository Technical Committee.




The NIST Identifier Collaboration Service (NICS) project, funded through the Advanced Technology Program, has established an experimental collaborative registry for XML (Extensible Markup Language). XML is designed to enable the use of SGML on the World Wide Web. An XML DTD is the formal definition of a particular type of XML document. The DTD sets out what names can be used for elements, where they may occur, and how they all fit together. Before now, researchers and vendors have faced conflicts from similarly named DTDs and element names. Vendors can now publicly register their names and their download locations, thus avoiding future conflicts while publicizing existence and availability. This project is conducted in the Manufacturing Collaboration Technologies Group of the Manufacturing Systems Integration Division of the Manufacturing Engineering Laboratory.
For further information, contact Don Libes, (301) 975-3535

XML/EDI - XML Repository Working Group

"Repositories provide a standard reference point for understanding data and processes. In the context of XML and EDI repositories provide the means to implement strategic business systems. Repositories also provide the link between business process designs (UML) and their physical implementation (XML). The goal of the [XML Repository Working] Group is to create a proposed standard on global XML repositories for submittal to the World Wide Web Consortium (W3C), Object Management Group (OMG), and UN/EDIFACT."

"The combination of XML and EDI semantic foundations, called XML/EDI, will provide a complete framework where a set of different technologies work together to create a format that is usable by applications as well as humans. These technologies are XML, EDI, Templates, Agents and Repository. . . The repository is a location where shared Internet directories are stored and where users can manually or automatically look up the meaning and definition of XML/EDI Tags. The repository is in fact the semantic foundation for the business transactions."

These components (XML, EDI, Templates, Agents and Repository) work together to create an XML framework for business use.

What makes XML/EDI different from previous initiatives is that it can use the know-how of business processes captured in EDI messages. This is then delivered via the Internet and in a Web environment. The same file can thereby be viewed by a user in a desktop tool or processed by an application component on a server. One core concept this enables is that while traditional EDI is "door to door" between business partners, XML/EDI can flow in through the "door" and be used in multiple locations within organizations.


EEMA EDI/EC Work Group Proposal. "The EEMA EDI Work Group would like to propose to CEFACT the establishment of a Global Repository for the translation of XML tags in UN/EDIFACT and human language on the Internet. The EEMA EDI Working Group is prepared to assist in the set up and operation of such a repository, which could be crucial in the advancement of the use of EDI over the Internet. . . . When the fusion between ANSI X12, EDIFACT and all other EDI standards takes place in a proper way it should be under the auspices of the UN so it is global, public domain, open and available for anyone. Today this is not the case with the ANSI standards and many other EDI standards that are only available at considerable cost. Of course today the EDIFACT standard is already in the public domain and can easily be obtained through the Internet."


XML Exchange

"Welcome to CommerceNet's XML Exchange, the forum for creating and sharing document type definitions. This isn't an online magazine, and it isn't some glass tower where elite experts publish their thoughts. CommerceNet's XML Exchange is a public place for anyone to ask questions and discuss their challenges and successes in making XML work."

XML Exchange Categories: [1998]
Business productivity
Human resources
Real estate
Web Management


eCo Framework Project - Working Group

In response to the growing use of the eXtensible Markup Language (XML) and proliferation of independent XML-based eCommerce protocols, CommerceNet is chartering the eCo Framework Project and Working Group.

The eCo Framework Project is chartered by CommerceNet with sponsorship by Veo Systems Inc. and other companies to be added. The goal of the project is to develop a common framework for interoperability among XML-based application standards and key electronic commerce environments. The project's working group will develop a specification for content names and definitions in electronic commerce documents, and an interoperable transaction framework specification.

The eCo Framework Working Group is chartered to define a common framework from an ever-growing complement of electronic commerce related specifications, including Catalog Information Specification, Channel Definition Format (CDF), Common Business Library (CBL), Electronic Data Interchange (EDI), Internet Content Exchange (ICE), Open Buying on the Internet (OBI), Open Financial Exchange (OFX), Open Trading Protocol (OTP), and XML. The working group, modelled after the successful Davenport and XML working groups, includes industry experts from 3Com, American Power Conversion, Compaq/Tandem, Harbinger, Hewlett-Packard, IBM, Intel, Microsoft, NEC, Netscape, NTT, Sun Microsystems, RosettaNet, and Veo Systems.

Veo Systems Common Business Library (CBL)

CBL (Common Business Library) is an extensible, public collection of DTDs and modules that companies can customize and assemble to develop XML-based commerce applications. Because it re-uses common semantic components, CBL helps speed the development of e-commerce standards and applications and, more importantly, facilitates their interoperation. Toward this end, Veo is sponsoring a project through CommerceNet's eCo Working Group to create a public XML registry.

Veo Systems, Inc.
2440 West El Camino Real, Seventh Floor
Mountain View, CA 94040
Phone: 650.938.8400


Connected to ICE: Content-X - "The Web's ICE information source. Content-X is a news and discussion site for web publishers interested in Vignette Corporation's Information & Content Exchange (ICE) protocol. ICE is an exciting new standard designed to automate the exchange of digital assets between businesses. Content-X provides a community forum for exploring the uses of ICE, and its effect on the content syndication industry. ICE is XML-based, so we also run features on general XML topics. The ICE 1.0 specification is now a W3C NOTE."

Has a 'Syntax Repository'

On November 12, 1998, the DTDLIST.HTML document listed four DTD links:

ICE DTD (ICE.dtd), for news documents, with a link to the entire ICE specification.
NITF DTD (nitf-x.dtd), for news documents, with a link to more information.
FlixML DTD, with a pointer to the site's Technology page for more information about FlixML.
WDDX DTD, with a link explaining what WDDX is.

Syntax Repository from "XML allows the creation of domain-specific markup languages. Markup languages will be created for different segments of the content syndication business, making the automated exchange of content even more efficient. Content-X will be creating a repository for these DTDs, schemas, and logic elements. All DTDs will be located at: Applications will be able to pull DTDs from this URL as needed."

OASIS - Organization for the Advancement of Structured Information Standards

ISO 11179

ISO 11179 might be considered 'for a technical and conceptual framework' in the OASIS RegRep TC work (TA). Note Registration of data elements (part 6) and 11179-2: Classification for Data Elements.

ISO 11179 with X3.285. DTD Element Index. DTD work from ISO 11179 and X3.285 by Terry Allen (Veo Systems); HTML presentation of ISO 11179 with X3.285 facilitated by Norm Walsh's DTDParse. NB: this is a transient URL for the ISO 11179 DTD, so please do not create public bookmarks to it.

Development of ISO/IEC 11179 - Specification and Standardization of Data Elements


ISO 11179 - Expanded name
Specification and Standardization of Data Elements

Area covered
Standard for describing data elements used in databases and documents.

Sponsoring body and standard details

ISO 11179 specifies basic aspects of data element composition, including metadata. The standard applies to the formulation of data element representations and meaning as shared among people and machines; it does not apply to the physical representation of data as bits and bytes at the machine level.

An ISO 11179 data element is composed of three parts: an object class, a property, and a representation.

The combination of an object class and a property is called a data element concept (DEC). A value domain is a set of permissible (or valid) values for a data element.
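
As a rough illustration of the composition described above, the sketch below models a data element in Python. The class names, field names, and the example values ("Person", "country of residence", the country codes) are assumptions made for this sketch and are not taken from ISO 11179. The value domain stands in for the representation part of the data element.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class DataElementConcept:
    """An object class combined with a property (a DEC)."""
    object_class: str   # hypothetical example: "Person"
    property_name: str  # hypothetical example: "country of residence"


@dataclass(frozen=True)
class ValueDomain:
    """A set of permissible values; None means the set is not enumerated."""
    name: str
    permissible_values: Optional[FrozenSet[str]] = None

    def permits(self, value: str) -> bool:
        if self.permissible_values is None:
            return True
        return value in self.permissible_values


@dataclass(frozen=True)
class DataElement:
    """A data element concept paired with a value domain."""
    concept: DataElementConcept
    value_domain: ValueDomain

    def is_valid(self, value: str) -> bool:
        return self.value_domain.permits(value)


country = DataElement(
    concept=DataElementConcept("Person", "country of residence"),
    value_domain=ValueDomain("two-letter country code",
                             frozenset({"US", "CA", "MX"})),
)
print(country.is_valid("US"))  # True
print(country.is_valid("ZZ"))  # False
```

Keeping the concept separate from the value domain means the same concept could be paired with a different set of permissible values without redefining what the element means.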

Part 2 of ISO/IEC 11179 provides procedures and techniques for associating data element concepts and data elements with classification schemes for object classes, properties and representations.

Part 3 specifies attributes of data elements. It is limited to a set of basic attributes for data elements, independent of their usage in application systems, databases, data interchange messages, etc. Part 4 provides guidance on how to develop unambiguous data element definitions.

Part 5 provides guidance for the identification of data elements, including the assignment of numerical identifiers that have no inherent meanings to humans, icons (graphic symbols to which meaning has been assigned), and names with embedded meaning. Part 6 provides instruction on how a registration applicant may register a data element with a central Registration Authority.

Usage (Market segment and penetration)
Parts 3-6 were published by ISO between 1994 and 1997, but the other parts are only available as working documents awaiting approval by ISO.

Further details available from:
ISO or local national standards body.

Metadata and Data Registries - and ISO/IEC JTC1/SC32/WG2 NWI

Some documents on the work of ISO/IEC JTC1 SC32 WG2 ('Metadata Registries') are available from NIST:

ISO/IEC Open Forum on Metadata Registries. The Open Forum is organized by: International Organization for Standardization / International Electrotechnical Commission (ISO/IEC) Joint Technical Committee 1 (JTC 1), Subcommittee 32 - Data Management and Interchange (SC32), Working Group 2 - Metadata (WG2). "The Open Forum will be held February 16-19, 1999 in Washington DC. You are invited to register to attend. You may also participate by contributing relevant papers or web links. Participants from private enterprise, government, academia and standards organizations will discuss the development and operation of metadata registries, particularly those based on ISO/IEC 11179. Practitioners and standards developers will discuss progress in efforts to manage the content (semantics) of data that is shared within and between organizations or disseminated via the World Wide Web. Presentations and discussions will cover tutorials, implementation plans/experiences, proposed new work items for standards, and related topics."

Panel 4 - Using XML to Embed and Exchange 11179 Metadata: Panel organizers - Frank Olken and John McCarthy Lawrence Berkeley National Laboratory

"The Data Registry Community intends to develop a set of XML tags for 11179 metadata, just as other professional communities have developed tagsets such as MathML and ChemML. This work will help make ISO/IEC 11179 a useful part of new XML-enabled technologies. One goal of this effort is to facilitate interoperation among metadata registries. Another goal is to enable XML applications (such as XML-EDI) to directly access metadata registries using a standard syntax and semantics. This panel brings together experts from the ISO and NCITS committees responsible for 11179, World Wide Web Consortium (W3C) committees working on XML extensions, Electronic Data Interchange committees, and the Object Management Group (OMG) to discuss the opportunities and challenges of this 11179 XML tag set development effort."
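
To make the idea of an XML tag set for 11179 metadata more concrete, here is a minimal Python sketch that writes one data element description as XML and reads it back. The tag names (DataElement, ObjectClass, Property, Representation) and the sample values are invented for this illustration. They are not the tag set the panel proposed to develop.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def data_element_to_xml(object_class: str, property_name: str,
                        representation: str) -> str:
    """Serialize one data element using hypothetical tag names."""
    root = ET.Element("DataElement")
    ET.SubElement(root, "ObjectClass").text = object_class
    ET.SubElement(root, "Property").text = property_name
    ET.SubElement(root, "Representation").text = representation
    return ET.tostring(root, encoding="unicode")


def data_element_from_xml(xml_text: str) -> Dict[str, str]:
    """Read the child elements back into a tag -> text mapping."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root}


xml_text = data_element_to_xml("Person", "birth date", "date")
print(xml_text)
print(data_element_from_xml(xml_text))
```

Two registries that agreed on such a tag set could exchange descriptions with any XML parser, which is the interoperation goal the panel description mentions.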

Metadata and Data Registries - Seminar

"Databases on the Web: The Role of Metadata and Data Registries." Frank Olken and John McCarthy, Lawrence Berkeley National Laboratory. SIM Information Access Seminar. "XML (Extensible Markup Language) offers the promise of providing readily parsed structure for web-based information. It is widely expected that it will have a major impact on data exchange formats and web access to databases. . . We will discuss ongoing work in W3C and elsewhere to provide schema languages for XML (and a related metadata language, RDF (Resource Description Framework)). Our discussion will include a brief description of the data models embodied in XML, XML+Xlinks, and RDF. We will then discuss various recent proposals: RDF Schema, XML Data, DCD (Data Content Definition), OMG's XMI (Extensible Metadata Interchange), and our own work on XDT (Extensible Data Types). This work on schema specification languages is intended for use by metadata content standardization organizations such as FGDC (Federal Geographic Data Committee) and the Dublin Core Group. We will not discuss content standards in our talk. We will also discuss related work in NCITS L8 (the U.S. TAG for ISO SC32) (formerly ANSI X3L8) on data registries, e.g., the standards ISO 11179 and ANSI X3.285. Related work in NCITS T2 (formerly X3T2) on ontology standardization will be briefly mentioned."

Initiatives are already underway to agree on standard XML DTDs within specific vertical industries or markets. Many industries, for example electronics, automotive and aerospace, have already established sophisticated SGML DTDs. There will be considerable interest in converting these to XML for use in practical Internet commerce applications. The recent formation of organisations such as RosettaNet and X-ACT (OASIS), and of grass roots communities such as XML Exchange, indicates the depth of interest in the application of XML. CommerceNet, the leading worldwide electronic commerce consortium, has recently announced that it is developing an eCommerce Registry service, which will accelerate the adoption of XML-based electronic commerce. The process will allow anyone to submit XML DTDs with immediate availability to anyone who wants access.

Brown University Scholarly Technology Group (STG)

> If you have a 3-4 sentence para, or 2-4 item list saying what
> you have implemented in prototype/alpha for the "automated"
> FPI registration facility . . .
> [ca November 1998]

STG's goal is to place online, relatively quickly, an XML-
validation/FPI-registration system - and to do it from some
neutral (educational or industry consortium) site.  Right now
the XML community is beginning to fragment.  Most of the XML
we see out on the net isn't valid.  And what little of it is
valid points to DTDs via in-house FPIs or relatively transient
URIs.  To outsiders, this makes XML appear unportable and
unstable, and encourages them to focus merely on well-formedness
rather than on independently verifiable XML code.

By placing a solid, fast, basic "commodity" validator online,
and by coupling it with an FPI <-> URI translation service,
we believe that we can help solidify people's notions of what
XML is and can do, and help provide the XML community with
stable tools to encourage verification and validation.

Thus far we have implemented an original, very fast XML 1.0
validation system.  It appears still to be the best publicly
available XML validator on the Web.  This validator has been
in beta testing for about two months now.  The source code for
this validator is written using stock tools (ANSI C, YACC,
Lex) and will compile in a number of different environments.
Ideally, this code should be released for free nonprofit use,
to encourage its use and improvement.

If our validator ends up being replaced by more sophisticated
commercial code, so much the better:  It will still have served
its purpose during this critical time.

STG has also designed an FPI registration system that:

  0) allows people to register FPI <-> URI associations (and
     that periodically checks the URIs for reachability, and
     sends mail to the FPI registrar if they aren't reachable)

  1) uses certificates where appropriate (e.g., if someone
     wants to register an FPI, they can't do it as, say, Sun
     Microsystems, unless they have a cert to back their claim)

  2) allows a registrar to grant maintenance privileges
     by password or IP address to others for specific FPIs (or
     to transfer "ownership" of an FPI to someone else entirely)

  3) uses simple HTML-based web forms and portable CGI
     scripts to accomplish its work

This FPI registration system is now undergoing alpha testing
here at STG.  I expect that we could have something useful
online in two months.

Given the disunity of the XML community, the relatively small
amount of valid XML that there is out there, and the trouble
we are all having figuring out what to do with XML FPIs, STG
believes it would be beneficial to put up an XML validator,
coupled with an FPI registration system - and to do it from
some neutral (but well-known, authoritative) site like OASIS.

[Richard Goerwitz]
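
The registration scheme outlined in the note above can be sketched in a few lines of Python. This is a toy in-memory model written for these notes, not STG's implementation: the class and method names, the has_certificate and is_reachable hooks, and the example FPI and URIs are all assumptions. Password or IP-based authentication, the mail notification, and the web forms are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass
class FpiEntry:
    uri: str
    owner: str
    maintainers: Set[str] = field(default_factory=set)


class FpiRegistry:
    """Toy in-memory FPI <-> URI registry (illustrative only)."""

    def __init__(self, has_certificate: Callable[[str, str], bool]) -> None:
        # has_certificate(registrant, organization) is a hypothetical hook
        # standing in for real certificate verification.
        self._has_certificate = has_certificate
        self._entries: Dict[str, FpiEntry] = {}

    def register(self, fpi: str, uri: str, registrant: str,
                 organization: Optional[str] = None) -> None:
        # Registering on behalf of an organization requires a certificate.
        if organization and not self._has_certificate(registrant, organization):
            raise PermissionError(f"{registrant} has no certificate for {organization}")
        if fpi in self._entries:
            raise ValueError(f"{fpi} is already registered")
        self._entries[fpi] = FpiEntry(uri=uri, owner=registrant)

    def _owned_by(self, fpi: str, owner: str) -> FpiEntry:
        entry = self._entries[fpi]
        if entry.owner != owner:
            raise PermissionError(f"{owner} does not own {fpi}")
        return entry

    def grant(self, fpi: str, owner: str, maintainer: str) -> None:
        # The owner lets someone else maintain this specific FPI.
        self._owned_by(fpi, owner).maintainers.add(maintainer)

    def transfer(self, fpi: str, owner: str, new_owner: str) -> None:
        # Hand "ownership" of the FPI to someone else entirely.
        self._owned_by(fpi, owner).owner = new_owner

    def update_uri(self, fpi: str, who: str, uri: str) -> None:
        entry = self._entries[fpi]
        if who != entry.owner and who not in entry.maintainers:
            raise PermissionError(f"{who} may not maintain {fpi}")
        entry.uri = uri

    def fpi_to_uri(self, fpi: str) -> str:
        return self._entries[fpi].uri

    def uri_to_fpis(self, uri: str) -> List[str]:
        return sorted(f for f, e in self._entries.items() if e.uri == uri)

    def unreachable(self, is_reachable: Callable[[str], bool]) -> List[Tuple[str, str]]:
        # Returns (owner, fpi) pairs whose URI failed the reachability check;
        # a real service would mail each owner.
        return [(e.owner, f) for f, e in self._entries.items()
                if not is_reachable(e.uri)]


MEMO = "-//Example Corp//DTD Memo//EN"  # hypothetical FPI
registry = FpiRegistry(lambda who, org: (who, org) == ("alice", "Example Corp"))
registry.register(MEMO, "http://example.org/dtd/memo.dtd", "alice",
                  organization="Example Corp")
registry.grant(MEMO, "alice", "bob")
registry.update_uri(MEMO, "bob", "http://example.org/dtd/memo-2.dtd")
print(registry.fpi_to_uri(MEMO))
print(registry.unreachable(lambda uri: False))
```

The reachability check is passed in as a function so that the periodic checker can be swapped out, for example for an HTTP HEAD request, without touching the registry itself.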




As of November 12, 1998, the site had "Categories (9x), Other XML Schema Languages, Entity Sets, and Catalog Entries and Delegation."


General
--begin general
DTDs for the description of general information that defies classification elsewhere on this site

 ibtwsh: Itsy Bitsy Teeny Weeny Simple Hypertext DTD

 "This is an XML DTD which describes a subset of HTML 4.0 for embedded use within other XML
 DTDs ... It is often convenient for XML documents to have a bit of documentation somewhere in
 them. In the absence of a DTD like this one, that documentation winds up being #PCDATA only,
 which is a pity, because rich text adds measurably to the readability of documents. By incorporating
 this DTD by reference (as an external parameter entity) into another DTD, that DTD inherits the
 capabilities of this one. Using HTML-compatible elements and attributes allows the documentation
 to be passed straight through to HTML renderers."

--end general
Web, Internet, Networks
--begin web
DTDs for the description of web pages and web sites and for certain Internet networking

 The XML Bookmark Exchange Language (XBEL)

 "The XML Bookmark Exchange Language, or XBEL, is an Internet "bookmarks" interchange format. It
 was designed by the Python XML Special Interest Group on the group's mailing list.

 "The original intent was to create an interesting, fun project which was both useful and would
 demonstrate the Python XML processing software which was being developed at the time. Mark
 Hammond contributed the original idea, and other members of the SIG chimed in to add support for
 their favorite browser features. After debate which ranged far afield from the original idea,
 compromises were reached which allow XBEL to be a useful language for describing bookmark
 data for a range of browsers, including the major browsers and a number of less widely used

      XBEL page on 

 XCatalog

 "This is a proposal for XCatalogs, a system based on SGML/Open catalogs (Socats) for translating
 XML public identifiers to XML system identifiers, which are Uniform Resource Identifiers."

      XCatalog proposal 
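
 The public-to-system identifier translation that the XCatalog proposal describes can be illustrated with a small Python sketch. It handles only simple PUBLIC entries of the kind found in SGML/Open catalogs, ignores every other catalog feature, and is not the syntax of the XCatalog proposal itself. The catalog entries and example.org URIs are made up.

```python
import re
from typing import Dict, Optional

# Matches entries of the form: PUBLIC "public identifier" "system identifier"
PUBLIC_ENTRY = re.compile(r'PUBLIC\s+"([^"]+)"\s+"([^"]+)"')


def normalize(public_id: str) -> str:
    """Collapse runs of whitespace so equivalent identifiers compare equal."""
    return " ".join(public_id.split())


def parse_catalog(text: str) -> Dict[str, str]:
    """Build a public identifier -> system identifier map from catalog text."""
    return {normalize(pub): sysid for pub, sysid in PUBLIC_ENTRY.findall(text)}


def resolve(catalog: Dict[str, str], public_id: str) -> Optional[str]:
    """Return the system identifier (a URI) for a public identifier, if known."""
    return catalog.get(normalize(public_id))


CATALOG_TEXT = '''
PUBLIC "-//Example Corp//DTD Memo//EN"    "http://example.org/dtd/memo.dtd"
PUBLIC "-//Example Corp//DTD Invoice//EN" "http://example.org/dtd/invoice.dtd"
'''

catalog = parse_catalog(CATALOG_TEXT)
print(resolve(catalog, "-//Example Corp//DTD  Memo//EN"))
print(resolve(catalog, "-//Unknown//DTD Nothing//EN"))
```

 A parser that finds no catalog entry would normally fall back to the system identifier given in the document, which is why resolve returns None instead of raising an error.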

 Extensible Log Format (XLF)

      XLF Initiative Base 

 Process Interchange Format XML (PIF-XML)

 From the PIF-XML page: 
 "The goal of this effort is to provide an XML version of the Process Interchange Format (PIF). The
 goal assumes the stated goals of PIF."

      PIF-XML Home Page 

 WebBroker: Distributed Object Communication on the Web

 From NOTE to W3C 
 "This [NOTE to W3C] discusses XML based mechanisms for distributed object communication on
 the Web." 

      NOTE to W3C on WebBroker 

 XML-RPC

 "XML-RPC is a Remote Procedure Calling protocol that works over the Internet."

      XML-RPC Specification 
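
 To show what an XML-RPC call looks like on the wire, the sketch below uses the xmlrpc.client module from the Python standard library to encode and decode messages. The method name "example.add" is made up for the illustration, and no server is contacted.

```python
import xmlrpc.client

# Encode a call to a hypothetical remote method with two integer arguments.
request_xml = xmlrpc.client.dumps((2, 3), methodname="example.add")
print(request_xml)

# Decoding the request recovers the parameters and the method name.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)

# A response message wraps a single return value.
response_xml = xmlrpc.client.dumps((5,), methodresponse=True)
result, _ = xmlrpc.client.loads(response_xml)
print(result[0])
```

 In real use the request body would be sent to a server in an HTTP POST, which xmlrpc.client.ServerProxy handles.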

 Channel Definition Format (CDF)

 From the abstract of the specification: 
 "The Channel Definition Format is an open specification that permits a web publisher to offer
 frequently updated collections of information, or channels, from any web server for automatic
 delivery to compatible receiver programs on PCs or other information appliances." 
 Developed by Microsoft. 

      Specification (version 1.01; 1998-04-01) 
      Reference information for Channel Definition Format (CDF) elements used with
      Active Channels, Active Desktop items, and Software Update Channels. 

 WebDAV: Distributed Authoring and Versioning on the World
 Wide Web

      WebDAV IETF Working Group Web Page 

 HTTP Duplication and Replication Protocol (DRP)

 From NOTE to W3C 
 "The goal of the DRP protocol is to significantly improve the efficiency and reliability of data
 distribution over HTTP. ...[It] was designed to efficiently replicate a hierarchical set of files to a large
 number of clients." 

      NOTE to W3C on DRP 

 Wireless Markup Language (WML)

 Part of the Wireless Application Protocol (WAP) work, the Wireless Markup Language "is intended
 for use in specifying content and user interface for narrowband devices, including cellular phones
 and pagers." - From specification

      Wireless Application Protocol (WAP) Forum 
      WML Specification (1998-04-30) [PDF] 

 DMTF Common Information Model (CIM)

 "The Desktop Management Task Force (DMTF) is the industry consortium chartered with
 development, support and maintenance of management standards for PC systems and products,
 including DMI and CIM."

 "The DMTF has developed a Common Information Model (CIM) to take advantage of object-based
 management tools and provide a common way to describe and share management information
 enterprise-wide. Using HMMS as an input, the new model can be populated by DMI 2.0 and other
 management data suppliers, including SNMP and CMIP, and implemented in multiple object-based
 execution models such as JMAPI, CORBA and HMM. CIM will enable applications from different
 developers on different platforms to describe and share management data, so users have
 interoperable management tools that span applications, systems and networks, including the

      DMTF CIM XML Home Page 
--end web
Software
--begin software
DTDs for the description of software and related information.

 Open Software Description (OSD)
 From the press release: 
 "The Open Software Description (OSD) specification provides a data format or vocabulary to
 describe software components, their versions, their underlying structure and their relationships to
 other components."

      OSD Specification (1997-08-11) 
      Microsoft Press Release 

 UML eXchange Format
 From web site 
 "This project addresses how UML (Unified Modeling Language) models can be interchanged and
 proposes an application-neutral format called UXF (UML eXchange Format), which is an exchange
 format for UML models based on XML (Extensible Markup Language). It is a format powerful enough
 to express, publish, access and exchange UML models, and a natural extension from the existing
 Internet environment. It serves as a communication vehicle for developers, and as a well-structured
 data format for development tools. With UXF, UML models can be distributed universally."

      UXF Web Site 

 UML-Xchange
 From web site 
 "UML-Xchange is a SGML DTD for exchanging data models between CASE tools that use the UML
 language. All of the six kinds of UML diagrams are supported."

      UML-Xchange Web Site 

 The CDIF XML-based Transfer Format
 CDIF is "a group of tool vendors, users, and system integrators sharing the vision that modelling
 tools, such as CASE tools, should understand each other and interoperate seamlessly."

 They are working on an XML-based transfer format.

      CDIF-XML Page 
--end software

Metadata, Archival, Genealogy
--begin meta, archiv, geneal
DTDs for metadata (data about data, e.g. authorship, keywords, relationship to other data), the
 description of archival information and genealogical information.

 Resource Description Framework (RDF)

 From the W3C RDF Page: 
 "The Resource Description Framework (RDF) is a specification currently under development within
 the W3C Metadata activity. RDF is designed to provide an infrastructure to support metadata across
 many web-based activities. RDF is the result of a number of metadata communities bringing
 together their needs to provide a robust and flexible architecture for supporting metadata on the
 Internet and WWW. Example applications include sitemaps, content ratings, stream channel
 definitions, search engine data collection (web crawling), digital library collections, and distributed

 RDF will allow different application communities to define the metadata property set that best
 serves the needs of each community. RDF will provide a uniform and interoperable means to
 exchange the metadata between programs and across the Web. Furthermore, RDF will provide a
 means for publishing both a human-readable and a machine-understandable definition of the
 property set itself.

 RDF will use XML as the transfer syntax in order to leverage other tools and code bases being built
 around XML."

      RDF at W3C (official) 
      RDF Model and Syntax Working Draft (1998-07-20) 
      RDF Schemas Working Draft (1998-04-09) 
      W3C Metadata activity 
      Frequently asked questions 
      RDF Made (Fairly) Easy 
      RDF-DEV mailing list 
      A Discussion of the Relationship Between RDF-Schema and UML (W3C NOTE) 
      RDF at DSTC 
      Introduction to RDF Metadata 
      Metadata Architecture 
      W3C Data Formats 
      RDF Implementation in Java from IBM's Alphaworks 

 Meta Content Framework (MCF)

 From the abstract of the specification: 
 The MCF specification "provides the specification for a data model for describing information
 organization structures (metadata) for collections of networked information. It also provides a syntax
 for the representation of instances of this data model using XML, the Extensible Markup Language." 
 Developed by Netscape and others. 

      Meta Content Framework Using XML (1997-06-06)
      An MCF Tutorial 

 Web Interface Definition Language (WIDL)
 "... a meta-data syntax implemented in XML that defines Application Programming Interfaces (APIs)
 to web data and services." 
 Developed by webMethods for their automation technology. 

      Description of WIDL 
      WIDL: Application Integration with XML 

 IMS Metadata Specification
 From overview 
 The IMS Meta-data Specification is derived from extensive collaborations, requirements meetings,
 focus groups and research related to the development of meta-data specifications to support online
 learning. Groups included in the requirements process included teachers, instructional designers,
 cognitive psychologists, digital library experts, administrators of educational institutions, software
 developers, content developers, and meta-data experts.

 The basic goals of the specification are to support the following:

      Effective discovery via the Internet of high quality materials for a particular
      educational or training purpose. 
      Management of materials, including intellectual property rights, commerce, and
      customization of learning experiences. 


 Encoded Archival Description (EAD)

 A Library of Congress standard for encoding archival finding aids.

      EAD Page at Library of Congress 

 GedML: Genealogical Data in XML

 A DTD for encoding genealogical data sets in XML. Based on GEDCOM, a widespread
 data format for genealogical data interchange.
      GedML Web Page 
--end meta, archiv, geneal

Multimedia, Graphics, Speech
DTDs for graphics, speech, audio, video, etc., and the integration of these.

 Vector Markup Language (VML)
 "VML is an application of Extensible Markup Language (XML) 1.0 which defines a format for the
 encoding of vector information together with additional markup to describe how that information may
 be displayed and edited."

      NOTE to W3C on VML 

 Synchronized Multimedia Integration Language (SMIL)
 "SMIL (Synchronized Multimedia Integration Language) is an open World Wide Web Consortium
 (W3C) Recommendation for the stylistic layout of multimedia presentations. SMIL defines the
 mechanism that authors can use to compose a multimedia presentation, combining audio, video,
 text, graphics and then precisely synchronize where on the screen and when these media are
 presented to the viewer."

      W3C Audio, Video and Synchronized Multimedia Overview 
      Synchronized Multimedia Activity at W3C 
      Press Release: The World Wide Web Consortium Issues SMIL 1.0 as a W3C Recommendation
      Synchronized Multimedia Integration Language (SMIL) 1.0 Specification 
      SMIL Information at RealNetworks 
      Displaying SMIL Basic Layout with a CSS2 Rendering Engine (W3C NOTE) 
      Internet Draft on application/smil Media Type 

 Precision Graphics Markup Language (PGML)
 "The Precision Graphics Markup Language (PGML) is a 2D scalable graphics language designed
 to meet both the simple vector graphics needs of casual users and the precision needs of graphics
 artists. PGML uses the imaging model common to the PostScript language and Portable Document
 Format (PDF); it also contains additional features to satisfy the needs of Web applications."

      NOTE to W3C (1998-04-10) 

 Java Speech Markup Language (JSML)

 The Java Speech Markup Language (JSML) is used by applications to annotate text input to Java
 Speech API speech synthesizers. The JSML elements provide a speech synthesizer with detailed
 information on how to say the text. JSML includes elements that describe the structure of a
 document, provide pronunciations of words and phrases, and place markers in the text. JSML also
 provides prosodic elements that control phrasing, emphasis, pitch, speaking rate, and other
 important characteristics. Appropriate markup of text improves the quality and naturalness of the
 synthesized voice. JSML uses the Unicode character set, so JSML can be used to mark up text in
 most languages of the world.

      JSML Specification

 VoxML

 "The VoxML markup language for voice applications allows developers to simply and easily add
 speech interfaces to their Web applications or content."
      Motorola's VoxML Site 
--end mult

Commerce, Finance, Business Information
--begin com, fin, bus
DTDs for financial transactions and the interchange of business information.

 Open Financial Exchange

 "Open Financial Exchange is a data format designed to represent financial information exchanged
 between an online financial services server and a client software product. This financial data is sent
 back and forth between the client and server via the Internet using the Open Financial Exchange

 The Open Financial Exchange specification, enables financial institutions and brokerage firms to
 implement online connectivity for both personal financial management (PFM) software products like
 Microsoft Money or Intuit's Quicken and to build dynamic web sites. Open Financial Exchange
 supports transactions for banking, credit, brokerage and mutual fund markets. It is designed to
 support a wide range of financial activities including consumer and small business banking;
 consumer and small business bill payment; investments, including stocks, bonds, mutual funds."

      OFX: Home Page

 XML/EDI

 From the XML/EDI Group Home Page:
 "XML/EDI provides a standard framework/format to describe different types of data -- for example, an
 invoice, healthcare claim, project status -- so that the information be it in a transaction, catalog or a
 document in a workflow can be searched, decoded, manipulated, and displayed consistently and
 correctly by implementing EDI dictionaries. Thus by combining XML and EDI we create a new
 powerful paradigm!" 

      XML/EDI Group 

 Open Trading Protocol (OTP)

      OTP Website

 Information & Content Exchange (ICE)

      FAQ

Scientific, Technical
DTDs for scientific and technical information.

 Mathematical Markup Language (MathML)

      W3C Math Overview
      Press Release: W3C Issues MathML as a Recommendation 
      MathML Recommendation 

 Chemical Markup Language (CML)

 From the CML Home Page: 
 "Chemical Markup Language is a radical new venture in molecular information and provides a
 simple yet powerful way to manage a very wide range of problems with a single language." 
 Developed by Peter Murray-Rust. 

      CML Site
 Bioinformatic Sequence Markup Language (BSML)
      Request for Comment 

 Telecommunication Interchange Markup (TIM)
      Telecommunications Industry Forum - Information Products Interchange (TCIF -
      IPI) Committee Home Page 

Education
DTDs for the description of educational material.

 IMS Metadata Specification
 From overview 
 The IMS Meta-data Specification is derived from extensive collaborations, requirements meetings,
 focus groups and research related to the development of meta-data specifications to support online
 learning. Groups included in the requirements process included teachers, instructional designers,
 cognitive psychologists, digital library experts, administrators of educational institutions, software
 developers, content developers, and meta-data experts.

 The basic goals of the specification are to support the following:

      Effective discovery via the Internet of high quality materials for a particular
      educational or training purpose. 
      Management of materials, including intellectual property rights, commerce, and
      customization of learning experiences. 


 Tutorial Markup Language (TML)
      TML Language Specification 

Language, Knowledge Representation
"DTDs for knowledge representation and the description of linguistic information."

    Translation Memory eXchange (TMX)

         TMX Specification
    Ontology Markup Language (OML)
         OML Web Page
    Conceptual Knowledge Markup Language (CKML)

         CKML Web Page 

         OpenTag Specification 

Other XML Schema Languages

 Alternative XML schema languages to XML DTDs.


 XML-Data

 From the abstract of the specification:
 XML-Data is "a specification ... for exchanging structured and networked data on the Web. This
 specification uses XML, the Extensible Markup Language, for describing data, as well as data about
 data. We expect XML-Data to be useful for a wide range of applications, such as describing
 database transfers, digital signatures, or remotely-located Web resources." 
 Developed by Microsoft and others. 
 Now superseded by DCD (see below).

      Specification for XML-Data (1997-12-11) 


 XSchema

 From the XSchema page
 "The XSchema specification, when complete, will provide a means for XML developers to describe
 their XML document structures using XML document syntax."

      Simon St.Laurent's XSchema page 

 Document Content Description for XML

 From the NOTE to W3C 
 "This document proposes a structural schema facility, Document Content Description (DCD), for
 specifying rules covering the structure and content of XML documents. The DCD proposal
 incorporates a subset of the XML-Data Submission [XML-Data] and expresses it in a way which is
 consistent with the ongoing W3C RDF (Resource Description Framework) [RDF] effort; in particular,
 DCD is an RDF vocabulary. DCD is intended to define document constraints in an XML syntax;
 these constraints may be used in the same fashion as traditional XML DTDs. DCD also provides
 additional properties, such as basic datatypes."

      NOTE to W3C on DCD

 Schema for Object-oriented XML (SOX)

 From the NOTE to W3C 
 "SOX provides an alternative to XML DTDs for modeling markup relationships to enable more
 efficient software development processes for distributed applications. SOX also provides basic
 intrinsic datatypes, an extensible datatyping mechanism, content model and attribute interface
 inheritance, a powerful namespace mechanism, and embedded documentation. As compared to
 XML DTDs, SOX dramatically decreases the complexity of supporting interoperation among
 heterogenous applications by facilitating software mapping of XML data structures, expressing
 domain abstractions and common relationships directly and explicitly, enabling reuse at the
 document design and the application programming levels, and supporting the generation of
 common application components."

      NOTE to W3C on SOX

Entity Sets

    The following are character entity sets in common use:

    ISO Entities

    Courtesy of Rick Jelliffe

         ISO 8879:1986//ENTITIES Added Latin 1//EN//XML 
         ISO 8879:1986//ENTITIES Added Latin 2//EN//XML 
         ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN//XML 
         ISO 8879:1986//ENTITIES Publishing//EN//XML 
         ISO 8879:1986//ENTITIES General Technical//EN//XML 
         ISO 8879:1986//ENTITIES Diacritical Marks//EN//XML 
         ISO 9573-15:1993//ENTITIES Greek Letters//EN//XML 
         ISO 9573-15:1993//ENTITIES Monotoniko Greek//EN//XML 
         ISO 8879:1986//ENTITIES Greek Symbols//EN//XML 
         ISO 8879:1986//ENTITIES Alternative Greek Symbols//EN//XML

Catalog Entries and Delegation

 I have set up an SGML Open Catalog at
 so that people can resolve public identifiers without having to maintain their own catalogs. By
 including a delegation to this catalog or pointing to it directly, processors that support SGML
 Open Catalogs can make use of the entries included.

 As well as including direct entries, the catalog can act as a root for further delegation. If you
 produce public text (e.g. DTDs) and have set up your own online catalog, I am more than willing
 to add a DELEGATE entry in my catalog so that processors using my catalog will
 automatically use yours when necessary. Just contact me with your owner identifier and the
 URL of your catalog.

 NOTE: This is an experimental service. I don't guarantee anything, but please feel free to send
 me your feedback.


Tauber's Current catalog file

-- SGML Open Catalog at SCHEMA.NET        --

-- If you would like local public text or --
-- a delegation added, please email       --
--                    --

-- Local public text --
PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN//XML" ""
PUBLIC "ISO 8879:1986//ENTITIES Added Latin 2//EN//XML" ""
PUBLIC "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN//XML" ""
PUBLIC "ISO 8879:1986//ENTITIES Publishing//EN//XML" ""
PUBLIC "ISO 8879:1986//ENTITIES General Technical//EN//XML" ""
PUBLIC "ISO 8879:1986//ENTITIES Diacritical Marks//EN//XML" ""
PUBLIC "ISO 9573-15:1993//ENTITIES Greek Letters//EN//XML" ""
PUBLIC "ISO 9573-15:1993//ENTITIES Monotoniko Greek//EN//XML" ""
PUBLIC "ISO 8879:1986//ENTITIES Greek Symbols//EN//XML" ""
PUBLIC "ISO 8879:1986//ENTITIES Alternative Greek Symbols//EN//XML" ""

-- Delegation --
DELEGATE "-//W3C" ""
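
A minimal sketch, in Python, of how a processor might use entries like those above. It reads PUBLIC and DELEGATE entries and resolves a public identifier either to a system identifier or to the delegated catalog(s) whose prefix matches. The example.org URLs in SAMPLE are hypothetical placeholders (the system identifiers in these notes are empty), the function names are illustrative, and the rules "first PUBLIC entry wins" and "longest DELEGATE prefix first" are assumptions about catalog processing. Only PUBLIC, DELEGATE and `-- ... --` comments are handled, not the full catalog syntax.

```python
import re

# One scanner for the three things this sketch understands:
# "-- comment --", PUBLIC "pubid" "sysid", and DELEGATE "prefix" "catalog".
TOKEN = re.compile(
    r'--.*?--'
    r'|(PUBLIC|DELEGATE)\s+"([^"]*)"\s+"([^"]*)"',
    re.DOTALL,
)

# Hypothetical catalog text; the URLs are placeholders, not real locations.
SAMPLE = '''
-- Local public text --
PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN//XML" "http://example.org/ent/isolat1.ent"

-- Delegation --
DELEGATE "-//W3C" "http://example.org/w3c/catalog"
'''

def parse_catalog(text):
    """Return (public, delegates) parsed from catalog text."""
    public, delegates = {}, []
    for match in TOKEN.finditer(text):
        keyword, first, second = match.groups()
        if keyword == "PUBLIC":
            # Assumption: the first entry for an identifier takes precedence.
            public.setdefault(first, second)
        elif keyword == "DELEGATE":
            delegates.append((first, second))
        # Comments match with keyword None and are skipped.
    return public, delegates

def resolve(pubid, public, delegates):
    """Resolve pubid to ('system', url), ('delegate', [catalogs]) or None."""
    if pubid in public:
        return ("system", public[pubid])
    matches = [(p, c) for p, c in delegates if pubid.startswith(p)]
    if matches:
        # Assumption: catalogs with the longest matching prefix are tried first.
        matches.sort(key=lambda pc: len(pc[0]), reverse=True)
        return ("delegate", [c for _, c in matches])
    return None

if __name__ == "__main__":
    public, delegates = parse_catalog(SAMPLE)
    print(resolve("-//W3C//DTD HTML 4.0//EN", public, delegates))
```

A caller that receives a "delegate" result would fetch each listed catalog in turn and repeat the lookup there.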

Graphic Communications Association Registry of Owner Identifiers

"This XML repository is the staging/collection area for all XML related technology. Tag sets, document type definitions (DTDs), and Extensible Stylesheet Language (XSL) templates, and other schema needed for effective communication will be collected here and eventually moved to their own sites, i.e. XSL stylesheets to, DTDs to, etc. A special site,, will be the official site for storing mathML, chemML, musicML, electronicML, scienceML, etc."



World Wide

Some Reference Documents

Syntax Issues

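URIs (URNs) allow characters that are not directly allowed in XML, and the XML public identifier (PubidLiteral) allows characters that are not allowed in the SGML formal public identifier (FPI).

The SGML FPI allows eleven "special" characters in minimum data; the XML public identifier allows nineteen punctuation characters, an additional eight: ; ! * # @ $ _ %

Public identifier in XML: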

[12] PubidLiteral ::= 
        '"' PubidChar* '"' | "'" (PubidChar - "'")* "'"
[13] PubidChar ::= 
        #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]
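
As a rough illustration of productions [12] and [13], the following Python sketch checks whether a string could appear as the content of a PubidLiteral. The function name `is_valid_pubid` and its `quote` parameter are hypothetical, introduced only for this example.

```python
import re

# PubidChar per production [13]: space, CR, LF, ASCII letters and digits,
# and the punctuation characters -'()+,./:=?;!*#@$_%
PUBID_RE = re.compile(r"[\x20\x0D\x0Aa-zA-Z0-9\-'()+,./:=?;!*#@$_%]*")

def is_valid_pubid(value, quote='"'):
    """True if value may appear between the given quote characters."""
    # Production [12]: a single-quoted literal cannot contain an apostrophe.
    if quote == "'" and "'" in value:
        return False
    return PUBID_RE.fullmatch(value) is not None

if __name__ == "__main__":
    print(is_valid_pubid("-//W3C//DTD HTML 4.0//EN"))    # allowed characters only
    print(is_valid_pubid("-//Example//DTD <x>//EN"))     # '<' is not a PubidChar
```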

- - - - - -

FPI in 8879:

[77] minimum data = 
      minimum data character [78] *

[78] minimum data character = 
      RS | (10) LF
      RE | (13) CR
      SPACE | (32) space
      LC Letter | a-z
      UC Letter | A-Z
      Digit | 0-9
      Special '( )+,-./:=?

[79] formal public identifier = 
      owner identifier [80] ,
      "//" ,
      text identifier [84]

[80] owner identifier = 
      ISO owner identifier [81] |
      registered owner identifier [82] |
      unregistered owner identifier [83]

[81] ISO owner identifier = 
      minimum data [77] 

[82] registered owner identifier = 
      "+//" ,
      minimum data [77] 

[83] unregistered owner identifier = 
      "-//" ,
      minimum data [77] 

[84] text identifier = 
      public text class [86] ,
      SPACE , (32) space
      unavailable text indicator [85] ? ,
      public text description [87] ,
      "//" ,
      ( public text language [88] |
      public text designating sequence [89] ),
      ( "//" ,
      public text display version [90] )? 

[87] public text description = 
      ISO text description [87.1] |
      minimum data [77] 

[87.1] ISO text description = 
      minimum data [77] 

[88] public text language = 
      name [55] 

[90] public text display version = 
      minimum data [77]
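
The productions above can be sketched in Python as a simplified FPI splitter, together with a check of the character counts mentioned earlier (eleven SGML "special" characters versus nineteen XML punctuation characters). This is a sketch under stated assumptions: `parse_fpi` and its result keys are hypothetical names, the parser splits naively on "//", and it does not handle the unavailable text indicator [85] or the public text designating sequence [89].

```python
import re

SGML_SPECIAL = set("'()+,-./:=?")         # "Special" in production [78]
XML_PUNCT = set("-'()+,./:=?;!*#@$_%")    # punctuation in XML production [13]

# Minimum data: RS, RE, SPACE, letters, digits and the SGML special characters.
MIN_DATA = re.compile(r"[\r\n a-zA-Z0-9'()+,\-./:=?]*")

def parse_fpi(fpi):
    """Split a formal public identifier into its parts (simplified)."""
    if not MIN_DATA.fullmatch(fpi):
        raise ValueError("character outside SGML minimum data")
    parts = fpi.split("//")
    if parts[0] in ("+", "-"):
        # "+//" marks a registered owner, "-//" an unregistered one.
        kind = "registered" if parts[0] == "+" else "unregistered"
        parts = parts[1:]
    else:
        kind = "ISO"
    if len(parts) not in (3, 4):
        raise ValueError("expected owner//text//language[//version]")
    owner, text, language = parts[:3]
    version = parts[3] if len(parts) == 4 else None
    # The text identifier starts with the public text class, then a space.
    text_class, _, description = text.partition(" ")
    return {
        "owner_kind": kind,
        "owner": owner,
        "class": text_class,
        "description": description,
        "language": language,
        "version": version,
    }

if __name__ == "__main__":
    print(len(SGML_SPECIAL), len(XML_PUNCT))
    print(sorted(XML_PUNCT - SGML_SPECIAL))
    print(parse_fpi("ISO 8879:1986//ENTITIES Added Latin 1//EN//XML"))
```

An identifier that is legal in XML but uses one of the extra eight characters (for example an underscore) is rejected here, which is the mismatch the syntax notes point at.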

Re: A call for open source DTDs

Elliotte Rusty Harold
Thu, 15 Oct 1998 11:17:57 -0400 

Let me elaborate a little on the problem. Let us suppose we have a DTD for
plumbing. This DTD is copyright 1998 International Plumbers Association.
Can I legally place the DTD in an XML document of my own creation? The
answer is no. That would be the same as including an entire poem or other
work in my document rather than quoting a part of it.

Can I place the DTD in a separate file on my web server and reference it
like this:

<!DOCTYPE document SYSTEM "">

Again, legally, the answer is no. I cannot legally place the copyrighted
document on my server any more than I can copy a copyrighted HTML file from
another web site onto my own.

I can, however, do this:

<!DOCTYPE document SYSTEM "">

This relies on the International Association of Plumbers not changing the
URL of the plumbing DTD, not changing the DTD itself, and maintaining a web
server that's fast and accessible independently of the state of my web
server. And it completely fails for offline documents. So this isn't a
good solution.

Is open source a solution? Maybe, especially if the DTD is external to the
document. However, standard open source licenses like the GPL are
problematic because they would seem to imply that if the DTD is included
with the document itself, then the entire document must be open source.
They are one file, after all. Not even Richard Stallman tries to make all
programs compiled with gcc, open source. Neither should using an open
source DTD taint the document the DTD validates. So if we want open source
we need a new kind of open source license that's clearer about the
distinction between DTDs and documents, even though they may be present in
the same file.

The simplest solution is to simply declare that the DTD is in the public
domain. This works well with existing systems and allows anyone to use
the DTD any way they need to. The only potential downside I see to this is
that there may be some standardization problems if people are allowed to
change the DTD willy-nilly. Long-term I suspect we'll develop some standard
licensing language that allows unlimited reuse, but only if the name is
changed, perhaps something like Perl's artistic license where you can do
anything you want with it as long as you don't call it Perl.

In any case, the main thing I want to bring up is to make sure DTD authors
think about these things when writing copyright statements. If people are
going to use your DTD, they absolutely must be able to republish it.
Without special permission, standard copyright prevents that.

| Elliotte Rusty Harold | | Writer/Programmer |
| XML: Extensible Markup Language (IDG Books 1998) |
