[Cache from http://www.prismstandard.org/techdev/prismspec1.asp; please use this canonical URL/source if possible.]
PRISM: Publishing Requirements for Industry Standard Metadata

Version 1.0
April 9, 2001

Copyright 2001, PRISM Working Group. All Rights Reserved.

This specification is freely redistributable, and conforming applications may be implemented without fee. Implementations may not add any elements, attributes, or other items to the PRISM namespaces and vocabularies. All additions, amendments, and alterations must be made in other XML namespaces.

For an ongoing list of known errors, workarounds, and issues for future work, please consult the errata page: http://www.prismstandard.org/errata/spec1.0/

Abstract

The Publishing Requirements for Industry Standard Metadata (PRISM) specification defines a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts. PRISM recommends the use of certain existing standards, such as XML, RDF, the Dublin Core, and various ISO specifications for locations, languages, and date/time formats. Beyond those recommendations, it defines a small number of XML namespaces and controlled vocabularies of values, in order to meet the goals listed above.

The PRISM working group, a joint effort of representatives from publishers and vendors in an initiative organized under IDEAlliance, prepared this specification. Comments for the working group may be sent to spec-comments@prismstandard.org.

Status

This is the 1.0 release of the PRISM Metadata Specification. It has been tested in a number of implementations, and has been reviewed by numerous external parties. The working group recommends its implementation and adoption.

Implementers and reviewers of the 1.0 specification are advised to consult http://www.prismstandard.org/errata/spec1.0/ to obtain corrections and updates to this specification.

Acknowledgements

A number of sections were drawn from the XMLNews tutorials and specifications. The working group thanks David Megginson for his permission to use that material.

Working Group Members (Current and former)

Donald Alameda, Sothebys.com (Integrated Automata)

Table Of Contents

Part I: Introduction and Overview
    1.2 Relationship to Other Specifications
    1.5 Structure of this Document
    2.1 Travel Content Syndication Scenario
    2.3 Embedded vs. External Metadata
    2.4 Controlled Vocabularies
    2.6 Resource Type and Category
    2.7 Rights and Permissions
    3 Elements by Functional Group
    3.1 General Purpose Elements
    3.6 Rights and Permissions
    3.7 Controlled Vocabularies
Part II: Normative Specification
    4.1 Requirement Wording Note
    4.2 Behavior of PRISM-compliant Software
    4.3 Identifying PRISM Content
    4.4 Namespace and Vocabulary Identifiers
    4.6 Cardinality and Optionality
    4.7 Automatic Creation of Inverse Relations
    4.8 PRISM Profile of the Resource Description Framework
    5.1 XML Entities Used In Definitions
    5.5 PRISM Inline Markup Namespace
    5.6 PRISM Controlled Vocabulary Namespace
    6.1 Rights and Usage Vocabularies
    6.2 Resource Type Vocabulary (presentation style)
    6.3 Resource Category Vocabulary (intellectual genre)

Part I: Introduction and Overview (non-normative)

Introduction

Purpose and Scope

The Publishing Requirements for Industry Standard Metadata (PRISM) specification defines an XML metadata vocabulary for syndicating, aggregating, post-processing and multi-purposing magazine, news, catalog, book, and mainstream journal content. PRISM provides a framework for the interchange and preservation of content and metadata, a collection of elements to describe that content, and a set of controlled vocabularies listing the values for those elements.

The working group focused on metadata for:

- General-purpose description of resources as a whole
- Specification of a resource’s relationships to other resources
- Definition of intellectual property rights and permissions
- Expressing inline metadata (that is, markup within the resource itself)

The PRISM group’s emphasis on implementable mechanisms is key to many of the choices made in this specification. For example, the elements provided for describing intellectual property rights are not intended to be a complete, general-purpose rights language that will let unknown parties do business with complete confidence and settle their accounts with micro-transactions. Instead, PRISM provides the elements needed for the most common cases encountered when one publisher of information wants to reuse material from another. Its focus is on reducing the cost of compliance with existing contracts that have been negotiated between a publisher and their business partners.

Relationship to Other Specifications

XML

PRISM metadata documents are an application of XML [W3C-XML]. Basic concepts in PRISM are represented using the element/attribute markup model of XML. The PRISM specification makes use of additional XML concepts, such as namespaces [W3C-XML-NS].

Resource Description Framework (RDF)

The Resource Description Framework [W3C-RDF] defines a model and XML syntax to represent and transport metadata. PRISM uses a simplified profile of RDF for its metadata framework. Thus, PRISM-compliant applications will generate metadata that can be processed by RDF processing applications. However, the converse is not necessarily true. The behavior of applications processing input that does not conform to this specification is not defined.

Dublin Core (DC)

The Dublin Core Metadata Initiative [DCMI] established a set of metadata to describe electronic resources in a manner similar to a library card catalog. The Dublin Core includes 15 general elements designed to characterize resources. PRISM uses the Dublin Core and its relation types as the foundation for its metadata. PRISM also recommends practices for using the Dublin Core vocabulary.

NewsML

NewsML [IPTC-NEWSML] is a standard from the International Press Telecommunications Council (IPTC) aimed at the transmission of news stories and the automation of newswire services. PRISM focuses on describing content and how it may be reused. While there is some overlap between the two standards, PRISM and NewsML are largely complementary. PRISM’s controlled vocabularies have been specified in such a way that they can be used in NewsML. The PRISM working group and the IPTC are working together to investigate a common format and metadata vocabulary to satisfy the needs of the members of both organizations.

News Industry Text Format (NITF)

NITF [IPTC-NITF] is another IPTC specification. NITF provides a DTD designed to mark up news stories. PRISM is a metadata vocabulary designed to describe resources and their relationship to other resources. Although NITF has some elements to specify metadata and header information that are duplicated in PRISM, the two standards are largely complementary. Where there is overlap, such as with PRISM’s inline markup, it is noted in the specification.

Information and Content Exchange (ICE)

The Information and Content Exchange protocol manages and automates syndication relationships, data transfer, and results analysis. PRISM complements ICE by providing an industry-standard vocabulary to automate content reuse and syndication processes. To quote from the ICE specification [ICE]:

    Reusing and redistributing information and content from one Web site to another is an ad hoc and expensive process. The expense derives from two different types of problem:

    - Before successfully sharing and reusing information, both ends need a common vocabulary.
    - Before successfully transferring any data and managing the relationship, both ends need a common protocol and management model.

Successful content syndication requires solving both halves of this puzzle. Thus, there is a natural synergy between ICE and PRISM. ICE provides the protocol for syndication processes and PRISM provides a description of the resource being syndicated, which can be used to personalize the delivery of content to tightly-focused target markets. The two working groups have recently defined the means for PRISM to describe ICE items and for ICE to convey PRISM descriptions.

RSS (RDF Site Summary) 1.0

RSS (RDF Site Summary) 1.0 [RSS] is a lightweight format for syndication and descriptive metadata. Like PRISM, RSS is an XML application, conforms to the W3C's RDF Specification, and is extensible via XML-namespace and/or RDF-based modularization. The RSS-WG is currently developing and standardizing new modules. The primary application of RSS is as a very lightweight syndication protocol for distributing headlines and links. It is very easy to implement, but does not offer the rich negotiation and reliable delivery features of ICE.

eXtensible Rights Markup Language (XrML)

XrML [XRML] is a specification developed by ContentGuard, Inc. It specifies the behavior of trusted digital rights management systems and repositories. Unlike XrML, PRISM assumes that the sender and receiver of a PRISM communication already have a business arrangement that is specified in a contract. PRISM’s focus is on lowering the costs of complying with that agreement. Thus, it provides a standard means of expressing common terms and conditions. XrML takes on a much harder problem: controlling the behavior of end-user applications and devices such as printers and tape drives to prevent unauthorized reuse of the content. PRISM specifies as little as possible about the internal behavior of systems. Thus, PRISM’s treatment of derivative use rights is complementary to, but separate from, the rights and uses that are specified in XrML.

XTM (XML Topic Maps)

XTM is an XML representation of ISO Topic Maps [ISO-13250], an approach for representing topics, their occurrences in documents, and the associations between topics. This is very similar to PRISM’s use of controlled vocabularies. XTM documents require that topics use a URI as a unique identifier. PRISM descriptions can directly cite XTM topics when there is a need to use them where PRISM allows values from controlled vocabularies. There is also a simple mapping between the XTM format and the PRISM group’s simple XML format for controlled vocabularies.

Additional Issues

Redundancy

Redundancy is a necessary consequence of re-using existing work. For example, when sending PRISM data in an ICE payload, there will be duplication of PRISM timestamp information and ICE header data. Therefore, in some cases, the same information will be specified in more than one place. This is normally a situation to be avoided. On the other hand, PRISM descriptions need to be able to stand alone, so there is no way to optimize PRISM’s content for a particular protocol. The working group decided that redundancy should neither be encouraged nor avoided.

Exchange Mechanisms

PRISM specifies a file format, and does not define or impose any particular exchange mechanism. There are many ways to exchange the descriptions and the content they describe. Developers of such exchange protocols should consider the following factors:

- Easily separable content: A tool that provides metadata will need to get at this information quickly. If metadata is mixed with content, these tools will always have to scan through the content.
- Reference vs. inline content: Referencing content is visually clean, but presents a challenge with access (security, stale links, etc.). Inline content requires larger data streams and longer updates in the face of changes.
- Encoding: Depending on the choice of format, encoding of the content may be necessary. Extra computation or space will be needed.

Security

The PRISM specification deliberately does not address security issues. The working group decided that the metadata descriptions could be secured by whatever security provisions might be applied to the resource(s) being described. PRISM implementations can achieve necessary security using a variety of methods, including:

- Encryption at the transport level, e.g., via SSL, PGP, or S/MIME.
- Sending digitally signed content as items within the PRISM interchange format, with verification performed at the application level (above PRISM).

Rights Enforcement

The PRISM specification does not address the issue of rights enforcement mechanisms. The working group decided that the most important usage scenarios at this time involved parties with an existing contractual relationship. This implied that the most important functionality required from PRISM’s rights elements was to reduce the costs associated with clearing rights, not to enable secure commerce between unknown parties. Therefore the PRISM specification provides mechanisms to describe the most common rights and permissions associated with content, but does not specify the means to enforce compliance with those descriptions. Essentially, the goal is to make it less expensive for honest parties to remain honest, and to let the courts serve their current enforcement role.

Definitions

The following terms and phrases are used throughout this document in the sense listed below. Readers will most likely not fully understand these definitions without also reading through the specification.

Structure of this Document

The document is organized into two parts, plus an appendix. Part I is non-normative, providing an introduction to, and tutorial overview of, the specification. Despite being non-normative, it contains occasional statements using the key words MUST, SHOULD, MAY, etc. Those statements are repeated in Part II, the normative portion of the specification.

Part I contains three sections. Section 1 provides this general introduction and establishes some of the context for the PRISM specification. Section 2 provides a tutorial for the major features of the spec, using a series of examples around a common scenario. Section 3 provides a quick reference to the elements defined in the specification, organized by functional group. Because elements can be used for multiple functions, they may be repeated in multiple tables.

Part II also contains three sections. Section 4 describes PRISM’s framework for identifiers, its profile (restricted subset) of RDF, and various other normative requirements on instances of the PRISM format. Section 5 gives normative definitions for the XML elements and attributes in the namespaces PRISM defines. Non-normative definitions, along with PRISM-recommended cataloging rules, are provided for the XML elements and attributes from namespaces PRISM recommends, but does not define, such as the Dublin Core. Section 6 defines vocabularies that PRISM uses as controlled values for various properties.

Appendix A provides a bibliography, which is also divided into normative and non-normative sections.

Overview

This section provides a non-normative overview of the PRISM specification and the types of problems that it addresses. It introduces the core concepts and many of the elements present in the PRISM specification by starting with a basic document with Dublin Core metadata, then using PRISM metadata elements to create richer descriptions of the article.

Although the PRISM specification contains a large number of elements and controlled vocabulary terms, most of them are optional. It is not necessary to put forth a large amount of effort to apply metadata to every resource, although it is possible to apply very rich metadata to resources whose potential for reuse justifies such an investment.

Travel Content Syndication Scenario

Wanderlust, a major travel publication, has a business relationship with travelmongo.com, a travel portal. After Wanderlust goes to press, they syndicate all of their articles and sidebars to content partners like travelmongo.com.

Like many other publications, Wanderlust does not have the right to resell all of their images, because some of them have been obtained from stock photo agencies. When Wanderlust creates syndication offers, an automated script searches through the metadata for the issue’s content to ensure that anything that cannot be syndicated is removed from the syndication offer, with alternatives substituted when possible. Since Wanderlust tags their content with rights information in a standard way, this process happens automatically using off-the-shelf software.

Because Wanderlust includes standard descriptive information about people, products, places and rights when they syndicate their content, travelmongo.com can populate their content management system with all the appropriate data so that the articles can be properly classified and indexed. This reduces the cost to travelmongo.com of subscribing to third party content and makes content from Wanderlust even more valuable for them.

Basic Metadata

The elements in the Dublin Core form the basis for PRISM’s metadata vocabulary.
This simple PRISM document uses some Dublin Core elements to describe a photo taken on the island of Corfu:

    <?xml version="1.0" encoding="UTF-8"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
        <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
        <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
        <dc:title>Walking on the Beach in Corfu</dc:title>
        <dc:creator>John Peterson</dc:creator>
        <dc:contributor>Sally Smith, lighting</dc:contributor>
        <dc:format>image/jpeg</dc:format>
      </rdf:Description>
    </rdf:RDF>

PRISM descriptions are XML documents [W3C-XML], thus they begin with the standard XML declaration: <?xml version="1.0"?>. A character encoding may be given if needed.

As indicated by the two attributes beginning with ‘xmlns:’, PRISM documents use the XML Namespace mechanism [W3C-XML-NS]. This allows elements and attributes from different namespaces to be combined. Namespaces are the primary extension mechanism in PRISM. PRISM-compliant applications MUST NOT throw an error if they encounter unknown elements or attributes. They are free to delete or preserve such information, although recommended practice is to retain it and pass it along.

PRISM descriptions are compliant with the RDF constraints on the XML syntax. Thus, they begin with the rdf:RDF element.

PRISM requires that resources have unique identifiers. In the above example, the photo is identified by a URI in the rdf:about attribute of the rdf:Description element. The dc:identifier element can be used for other identifiers, such as ISBNs or system-specific identifiers. In the above example, the dc:identifier element contains an asset ID for Wanderlust’s asset management system.

PRISM follows the case convention adopted in the RDF specification. All elements, attributes and attribute values typically begin with an initial lower case letter, and compound names have the first letter of subsequent words capitalized. Element types may begin with an uppercase letter when they denote Classes in the sense of the RDF Schema [W3C-RDFS]. Only one of the elements in the PRISM namespaces, pcv:Descriptor, does so.

PRISM uses a simple naming convention. We avoid abbreviations, use American English spelling, and make the element names into nouns (or pseudoNounPhrases, because of the case convention) in singular form.
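
As a non-normative illustration of the points above, the following Python sketch reads a description with the standard library's xml.etree.ElementTree module. It checks for the rdf:RDF root element, takes the resource identifier from rdf:about, collects the Dublin Core properties, and sets aside (rather than rejecting) elements from namespaces it does not know. The function name read_descriptions, the ex: namespace, and the ex:mood element are hypothetical and exist only for this example; a production system would more likely use a full RDF parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Trimmed copy of the Corfu example. The ex: namespace and ex:mood element
# are hypothetical, added only to show how unknown elements are handled.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://example.com/unknown#">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
    <ex:mood>serene</ex:mood>
  </rdf:Description>
</rdf:RDF>"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def read_descriptions(xml_bytes):
    """Map each rdf:about URI to (Dublin Core properties, unknown element tags)."""
    root = ET.fromstring(xml_bytes)
    if root.tag != f"{{{RDF}}}RDF":
        raise ValueError("expected rdf:RDF as the root element")
    result = {}
    for description in root.findall(f"{{{RDF}}}Description"):
        about = description.get(f"{{{RDF}}}about")
        known, unknown = {}, []
        for prop in description:
            namespace, local = split_tag(prop.tag)
            if namespace == DC:
                # A URI value sits in rdf:resource; prose is element content.
                value = prop.get(f"{{{RDF}}}resource")
                if value is None:
                    value = " ".join((prop.text or "").split())
                known.setdefault(local, []).append(value)
            else:
                # Unknown elements are kept aside, never treated as an error.
                unknown.append(prop.tag)
        result[about] = (known, unknown)
    return result


if __name__ == "__main__":
    for about, (known, unknown) in read_descriptions(SAMPLE.encode("utf-8")).items():
        print(about, known, unknown)
```

Keeping the unknown elements in a separate list, instead of discarding them, leaves the caller free to pass them along unchanged, which is the practice recommended above.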

PRISM uses the convention of placing property values that are URI references, such as in the dc:identifier element in the example above, in the rdf:resource attribute. Prose or non-URI values are given as element content, as seen in the dc:description element. This allows automated systems to easily determine when a property value is a URI reference.

Embedded vs. External Metadata

For the most part, PRISM assumes that its descriptions are transferred as complete, standalone XML documents that describe other files. But it is also possible to embed PRISM descriptions in a file. The example below shows a simple XML file that contains an embedded PRISM description:

    <?xml version="1.0" encoding="UTF-8"?>
    <doc>
      <p>Fourscore and seven years ago, our fathers brought forth on this
      continent a new nation, conceived in liberty, and dedicated to the
      proposition that all men are created equal.</p>
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
        <rdf:Description rdf:about="">
          <dc:description>Start of the Gettysburg Address</dc:description>
          <dc:creator>Abraham Lincoln</dc:creator>
        </rdf:Description>
      </rdf:RDF>
    </doc>

A Brief Digression on Identifiers

Note that the empty string is given as the value of the rdf:about attribute. This means that the PRISM description is about the current file. The value of the rdf:about attribute is required to be a URI reference, either absolute or relative. By definition, relative URIs are relative to an absolute URI known as the base. By default, that base URI is the URI of the containing document. So, in this case, the relative URI reference is the empty string, meaning that it does not modify the base URI. Therefore, the rdf:about attribute refers to the current document.

It is also possible to use the new xml:base attribute [W3C-XML-BASE] to set the base URI reference. That attribute will be used in several examples in this document.
However, readers are cautioned that the XML Base specification is not yet a full Recommendation of the W3C, although it seems very likely to be passed in its current form. Readers are also cautioned that because it is so new, very few XML implementations will support it at this time. Therefore, creators of PRISM descriptions should be cautious about using it for the near future.

A Brief Digression on Intent

This example illustrates another important point. Note that the name given in the dc:creator element is “Abraham Lincoln”, not the name of the person who actually created the XML file and entered Lincoln’s famous line into it. There are applications, such as workflow, quality assurance, and historical analysis, where it would be important to track the identity of that individual. However, none of those are problems PRISM attempts to solve. PRISM’s purpose is to describe information for exchange and reuse between different systems, but not to say anything about the internal operations of those systems. The PRISM working group decided that workflow was an internal matter. This focus on a particular problem allows PRISM descriptions to avoid some thorny issues that more general specifications must address.

Controlled Vocabularies

Property values in PRISM may be strings, as shown above, or may be terms from controlled vocabularies. Controlled vocabularies are an important extensibility mechanism. They also enable significantly more sophisticated applications of the metadata. As an example, consider the two Descriptions below. The first provides a basic, human-readable value for the dc:creator element, telling us that the Corfu photograph was taken by John Peterson. The second example appears harder to read, because it does not give us John Peterson’s name. Instead, it makes reference to John Peterson’s entry in the employee database for Wanderlust.

    <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
      <dc:creator>John Peterson</dc:creator>
      ...
    </rdf:Description>

    <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
      <dc:creator rdf:resource="http://wanderlust.com/emp3845" />
      ...
    </rdf:Description>

That employee database is an example of a controlled vocabulary:

- It keeps a list of terms (employee names).
- It has a defined and controlled update procedure (only authorized members of the HR department can update the employee database, and all changes are logged).
- It uses a unique identification scheme (employee numbers) to handle the cases where the terms are not unique (Wanderlust might have more than one employee with a name like “John Peterson”).
- It can associate additional information with each entry (salary, division, job title, etc.).

The unique identifier is one of the keys to the power behind the use of controlled vocabularies. If we are given metadata like the first example, we are limited in the types of displays we can generate. We can list Wanderlust’s photographs, sorted by title or by author name. By using the employee database, we can generate those, but also lists organized by department, job title, salary, etc. We also avoid the problems around searching for common names like “John Smith”, dealing with name changes such as those due to marriage and divorce, and searching for items that have been described in other languages. Finally, content items are easier to reuse if they have been coded with widely adopted controlled vocabularies, which increases their resale value.

Defining additional vocabularies for specialized uses is a way to extend descriptive power without resorting to prose explanations. This makes the resulting descriptions far better suited to automatic processing.

PRISM specifies controlled vocabularies of values for some elements. Other elements will use controlled vocabularies created and maintained by third parties, such as the International Organization for Standardization (ISO).
Site-specific controlled vocabularies, such as those from employee or customer databases, may also be used at the risk of limiting interoperability.

As another example, we can denote the location shown in the photograph by using the ISO country codes vocabulary:

    <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
      <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
      ...
      <dc:coverage rdf:resource="http://prismstandard.org/vocabs/ISO-3166/GR" />
    </rdf:Description>

Definition of Controlled Vocabularies

PRISM provides a small namespace of XML elements so that new controlled vocabularies can be defined. For example, Wanderlust might have prepared an exportable version of their employee database that contained entries like:

    <pcv:Descriptor rdf:ID="emp3845">
      <pcv:code>3845</pcv:code>
      <pcv:label>John Peterson</pcv:label>
      <hr:hireDate>1995-02-22</hr:hireDate>
      <hr:division>Photography</hr:division>
      <hr:manager rdf:resource="emp2234"/>
    </pcv:Descriptor>
    <pcv:Descriptor rdf:ID="emp4541">
      <pcv:code>4541</pcv:code>
      <pcv:label>Sally Smith</pcv:label>
      <hr:hireDate>1999-12-02</hr:hireDate>
      <hr:division>Photography</hr:division>
      <hr:manager rdf:resource="emp3845"/>
    </pcv:Descriptor>
    ...

These entries use elements from the PRISM Controlled Vocabulary (PCV) namespace for information important to the controlled vocabulary nature of the entries: the employee name and the employee ID. The PCV namespace also includes other elements so it can represent basic hierarchical taxonomies. The PCV namespace is not intended to be a complete namespace for the development, representation, and maintenance of taxonomies and other forms of controlled vocabularies. Other vocabularies, such as XTM or VocML, may be used for such purposes. As long as URI references can be used to refer to the terms defined in these other markup languages, there is no problem in using them in PRISM descriptions.

The sample descriptions above also mix in elements from a hypothetical Human Resources (hr) namespace. Providing that information enables useful functions, such as sorting the results by division or by manager. The hr namespace is only an example, provided to show how elements from other namespaces may be mixed into PRISM descriptions.

Internal Description of Controlled Vocabularies

Linking to externally-defined controlled vocabularies is a very useful capability, as indicated by the range of additional views described in the earlier example. However, external vocabularies do require lookups in order to fetch that information, which may make common operations too slow. PRISM also allows portions of a vocabulary entry to be provided within a description that uses them, similar to a caching mechanism.
For example, the PRISM description of the Corfu photo can be made more readable, while still allowing all the power that comes from controlled vocabularies, by providing some of the information inline:

    <?xml version="1.0" encoding="UTF-8"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:pcv="http://prismstandard.org/namespaces/pcv/1.0/"
             xmlns:dc="http://purl.org/dc/elements/1.1/"
             xml:base="http://wanderlust.com/">
      <rdf:Description rdf:about="/2000/08/Corfu.jpg">
        <dc:identifier rdf:resource="/content/2357845" />
        <dc:creator>
          <pcv:Descriptor rdf:about="/emp3845">
            <pcv:label>John Peterson</pcv:label>
          </pcv:Descriptor>
        </dc:creator>
        <dc:coverage>
          <pcv:Descriptor rdf:about="http://prismstandard.org/vocabs/ISO-3166/GR">
            <pcv:label xml:lang="en">Greece</pcv:label>
            <pcv:label xml:lang="fr">Grece</pcv:label>
          </pcv:Descriptor>
        </dc:coverage>
      </rdf:Description>
    </rdf:RDF>

This approach uses the pcv:Descriptor element, which is a subclass of rdf:Description, indicating that the resource is a taxon in a controlled vocabulary. Notice that it also uses the rdf:about attribute, instead of the rdf:ID attribute, which means that we are describing the taxon, not defining it. The actual definitions of those terms are maintained elsewhere.

Relations

It is often necessary to describe how a number of resources are related. For example, an image can be part of a magazine article. PRISM defines a number of elements to express relations between resources, so describing that this image is part of a magazine article can be done as follows:

    <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
      <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
      ...
      <prism:isPartOf rdf:resource="http://wanderlust.com/2000/08/CorfuArticle.xml" />
    </rdf:Description>

It is possible, but not mandatory, to add a statement to the description of the Corfu article saying that it contains the image:

    <rdf:Description rdf:about="http://wanderlust.com/2000/08/CorfuArticle.xml">
      ...
      <prism:hasPart rdf:resource="http://wanderlust.com/2000/08/Corfu.jpg" />
    </rdf:Description>

Resource Type and Category

Many different kinds of information are frequently lumped together as information about the 'type' of a resource. The PRISM specification breaks out three components.

First, file formats are indicated through the use of Internet Media Types (aka MIME types [RFC-2046]) in the dc:format element.

Second, information on the stereotypical type of intellectual content, such as obituaries vs. election results, is indicated through the use of the prism:category element and the controlled vocabulary presented in Table 17: Categories (intellectual genre).

Third, the PRISM group found that these two did not cover all the types commonly used. Many ‘types’ commonly used, such as tables, charts, and sidebars, are not intellectual genres; they are stereotypical modes of presentation. As an example, election results could be presented in a table, a map, or many other ways. The type of presentation used in a resource is indicated by the dc:type element and the values listed in Table 16: Controlled Vocabulary of Presentation Styles.

For example, consider three different images: a JPEG photograph of a landscape, a PNG image of a political cartoon, and a PNG image of a graph from a financial statement. Table 1: Sample of Image ‘Types’ shows how those facts would be recorded in PRISM descriptions. Distinguishing these various facets will be helpful in advanced searching applications.

Table 1: Sample of Image ‘Types’

Rights and Permissions

Licensing content for reuse is a major source of revenue for many publishers. Conforming to licensing agreements is a major cost, not only to the licensee of the content but also to the licensor. For these reasons, PRISM provides elements and controlled vocabularies for the purpose of describing the rights and permissions granted to the receiver of content. The PRISM specification provides those elements in two namespaces. Basic, commonly used elements are defined as part of the PRISM namespace. A separate namespace is defined for the elements in the PRISM Rights Language (PRL).

Since the field of Digital Rights Management (DRM) is evolving so quickly, the working group decided it would be premature to select one of the current XML standards for rights information, such as the eXtensible Rights Markup Language [XRML] or Open Digital Rights Language [ODRL]. The working group expects that a rights management language will eventually become an accepted standard. In the meantime, it focused on specifying a small set of elements that would encode the most common rights information, to serve as an interim measure for interoperable exchange of rights information.

To do this, the PRISM rights language makes a couple of simplifying assumptions. It assumes that the sender and receiver of content are engaged in a business relation. It may be a formal contract or an informal provision of freely redistributable content. One of the parties may not know the other. Nevertheless, a relation exists and if needed we could make up an identifier for it. PRL also assumes that its purpose is to reduce the costs of conformance to that relation. The working group explicitly rejected imposing any requirements on enforcing trusted commerce between unknown parties. Instead, the emphasis is on reducing the cost of compliance in common situations.

No Rights Information

In the example below, no rights information is provided for the Corfu photograph.
Does the lack of explicit restrictions mean the sender gives the receiver permission to do everything with the image? Or does the lack of explicitly granted rights imply that they can do nothing? Neither. Instead, we rely on the assumption of an existing business relation. In the absence of specific information, parties in a PRISM transaction assume that the normal rules of their specific business relation apply.
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
Basic Rights Information
While descriptions without any explicit rights information are possible, the working group decided there were some fields that were likely to be very commonly used. Those are provided in the PRISM namespace. The example below provides a copyright statement and contact information for the agency representing Wanderlust if someone wants to license the image for reuse.
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <prism:copyright>Copyright 2001, Wanderlust Publications. 
All rights reserved.</prism:copyright>
    <prism:rightsAgent>Phantasy Photos, Philadelphia</prism:rightsAgent>
  </rdf:Description>
</rdf:RDF>
Specific Rights Information
PRISM also allows more specific information about the rights that the sender is granting to the receiver. This is a very important change in the nature of the metadata being provided. Up to now, all the metadata has been descriptive of the resource, independent of the receiver. Specific rights information, however, can only be given in the context of a particular agreement between the sender and receiver. As an example, the stock photo agency representing Wanderlust may have negotiated a contract with a licensee of the image. They could then send the image, accompanied by a description that specifically identifies that contract:
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:rights rdf:resource="http://PhillyPhantasyPhotos.com/terms/Contract39283.doc"/>
  </rdf:Description>
</rdf:RDF>
This specifically identifies the terms and conditions for reusing the image. That can make the process of manually tracking down rights and permissions a little easier since the contract number is known. It also lets software be written to enforce the terms of particular contracts. The prospect of implementing software to enforce the terms of each contract is not enticing. So, PRISM provides some simple mechanisms to accommodate common cases without specialized software. One common case is when a publisher provides a large amount of material, such as the layouts for an entire magazine issue, to a partner publisher who will republish parts of it. Much of the content in the issue will be the property of the sending publisher, and covered under their business agreement with the receiving publisher. 
However, the issue will also contain stock photos and other materials that are not covered by the agreement. The example below shows how the controlled value #notReusable indicates to the receiver, travelmongo.com, that this item is not covered under their agreement with the sender, Wanderlust. This is, in fact, a benefit to Wanderlust. Travelmongo.com will not ask Wanderlust staff to search for contract terms on images Wanderlust does not own, which is a considerable cost saving. The prism:rightsAgent element is provided so that the receiver of a content item has someone to contact should they wish to obtain the rights to use the non-Wanderlust content. The description below also shows how the descriptions for multiple objects can be packaged into a single PRISM file:
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <prism:copyright>Copyright 2001, Wanderlust Publications. All rights reserved.</prism:copyright>
    <prism:rightsAgent>Phantasy Photos, Philadelphia</prism:rightsAgent>
  </rdf:Description>
  <rdf:Description rdf:about="http://SunsetSnaps.com/20456382927.jpg">
    <dc:description>Sunset over Corfu</dc:description>
    <dc:rights rdf:resource="http://prismstandard.org/vocabularies/1.0/rights.xml#notReusable"/>
    <prism:rightsAgent>Sunset Snaps, New York</prism:rightsAgent>
  </rdf:Description>
</rdf:RDF>
The interpretation of the dc:rights statement is that the image from Sunset Snaps is governed by a specific agreement. The URI reference of that agreement is: http://prismstandard.org/vocabularies/1.0/rights.xml#notReusable. That agreement, which all PRISM-compliant systems MUST recognize, simply means that there is no agreement to reuse the image. 
TravelMongo is, of course, free to work out an agreement with Sunset Snaps if they want to, but they do not need to ask Wanderlust about whether they can reuse the image.
Detailed Rights Information
Of course, content licensing deals are frequently more involved than an all-or-nothing arrangement. It is very common to restrict the uses by time, geography, intended use, and industry sector of use. More specialized restrictions are also possible, such as “may not be used on keychains”, but the PRISM Working Group decided there was no need to define a machine-operable way to encode such specialized restrictions. The example below shows how Wanderlust, or their agent, might restrict the length of time that TravelMongo can use the Corfu photo.
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:rights rdf:parseType="Resource">
      <prism:releaseTime>2001-02-01</prism:releaseTime>
      <prism:expirationTime>2001-02-28</prism:expirationTime>
    </dc:rights>
  </rdf:Description>
</rdf:RDF>
In that example, the dc:rights element contains the elements that describe the rights and permissions. To decide which elements go inside a dc:rights element, consider whether they are likely to change depending on who the content is being licensed to. Copyright statements are not highly variable. Time restrictions are variable. More complex rights agreements, with multiple clauses, can also be conveyed. The description below says that the Corfu image cannot be used in the tobacco industry, can be used in the US anytime from now on, and can be used in Greece before the end of 2003. Those three clauses are captured in the three elements within the rdf:Bag element. 
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/namespaces/basic/1.0/"
         xmlns:prl="http://prismstandard.org/namespaces/prl/1.0/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:rights xml:base="http://prismstandard.org/vocabularies/1.0/usage.xml">
      <rdf:Bag>
        <rdf:li rdf:parseType="Resource">
          <prl:usage rdf:resource="#none"/>
          <prl:industry rdf:resource="http://prismstandard.org/vocabs/SIC/0132"/>
        </rdf:li>
        <rdf:li rdf:parseType="Resource">
          <prl:geography rdf:resource="http://prismstandard.org/vocabs/ISO-3166/US"/>
          <prism:releaseTime>2001-01-01</prism:releaseTime>
        </rdf:li>
        <rdf:li rdf:parseType="Resource">
          <prl:geography rdf:resource="http://prismstandard.org/vocabs/ISO-3166/GR"/>
          <prism:expirationTime>2003-12-31</prism:expirationTime>
        </rdf:li>
      </rdf:Bag>
    </dc:rights>
  </rdf:Description>
</rdf:RDF>
Extending the PRISM Rights Language
As mentioned earlier, PRL is deliberately small. It can be extended by defining new elements and vocabularies to express new restrictions. New usage values could also be developed, but that is expected to be exceedingly rare. As an example, a stock image provider will have some very common usage restrictions, and some very obscure ones, that need to be applied to images they license. The most common restrictions (time, place, industry) are already covered, but two that are not covered in PRL are audience size and manipulations applied to the photograph. Our example image provider, Sunset Snaps, could define two new RDF property types (snap:audienceSize and snap:manipulations) to represent those common restrictions. They would also define vocabularies of values for the elements, such as #flip, #rotate, or #falseColor, for the snap:manipulations element. There are more obscure conditions that require human evaluation. Popular supermodels may have clauses in their contracts that prevent their images being used to advertise discount or close-out merchandise, or on inexpensive promotional items. Sunset Snaps can define a number of clauses expressing these conditions and provide them, either by reference or in-line, as shown below. 
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/namespaces/basic/1.0/"
         xmlns:prl="http://prismstandard.org/namespaces/prl/1.0/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:snap="http://sunsetsnaps.com/rights/">
  <rdf:Description rdf:about="http://sunsetsnaps.com/Zing/asdf0838484">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:rights xml:base="http://sunsetsnaps.com/rights/">
      <rdf:Bag>
        <!-- Prohibit flips and recolorings -->
        <rdf:li rdf:parseType="Resource">
          <prl:usage rdf:resource="#none"/>
          <snap:manipulations rdf:resource="#flip"/>
        </rdf:li>
        <rdf:li rdf:parseType="Resource">
          <prl:usage rdf:resource="#none"/>
          <snap:manipulations rdf:resource="#falseColor"/>
        </rdf:li>
        <!-- Convey unusual conditions -->
        <rdf:li rdf:parseType="Resource">
          <prl:usage>Not to be used with discount merchandise.</prl:usage>
        </rdf:li>
      </rdf:Bag>
    </dc:rights>
  </rdf:Description>
</rdf:RDF>
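The multi-clause descriptions above can be read with ordinary XML tooling. The following is a minimal sketch, not part of the PRISM specification, that lists the clauses found inside dc:rights elements using Python's standard library. The function names (local_name, list_clauses) are hypothetical, the sample is a shortened copy of the Corfu description, and relative values such as #none are returned as written rather than resolved against xml:base.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Shortened copy of the Corfu example (two of the three clauses).
SAMPLE = """<rdf:RDF xmlns:prism="http://prismstandard.org/namespaces/basic/1.0/"
    xmlns:prl="http://prismstandard.org/namespaces/prl/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:rights xml:base="http://prismstandard.org/vocabularies/1.0/usage.xml">
      <rdf:Bag>
        <rdf:li rdf:parseType="Resource">
          <prl:usage rdf:resource="#none"/>
          <prl:industry rdf:resource="http://prismstandard.org/vocabs/SIC/0132"/>
        </rdf:li>
        <rdf:li rdf:parseType="Resource">
          <prl:geography rdf:resource="http://prismstandard.org/vocabs/ISO-3166/US"/>
          <prism:releaseTime>2001-01-01</prism:releaseTime>
        </rdf:li>
      </rdf:Bag>
    </dc:rights>
  </rdf:Description>
</rdf:RDF>"""


def local_name(tag):
    """Drop the '{namespace}' part that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def list_clauses(rdf_xml):
    """Return one dict per clause, mapping property local names to their values.

    Assumption of this sketch: a dc:rights element holds either an rdf:Bag of
    rdf:li clauses or the properties of a single clause directly.
    """
    root = ET.fromstring(rdf_xml)
    clauses = []
    for rights in root.iter(f"{{{DC}}}rights"):
        bag = rights.find(f"{{{RDF}}}Bag")
        items = bag.findall(f"{{{RDF}}}li") if bag is not None else [rights]
        for item in items:
            clause = {}
            for prop in item:
                value = prop.get(f"{{{RDF}}}resource")
                if value is None:
                    value = (prop.text or "").strip()
                clause[local_name(prop.tag)] = value
            if clause:  # skip dc:rights elements that only carry rdf:resource
                clauses.append(clause)
    return clauses


if __name__ == "__main__":
    for number, clause in enumerate(list_clauses(SAMPLE), start=1):
        print(number, clause)
```

Keying on local names keeps the sketch short but would merge properties that share a name across namespaces; a fuller reader would keep the qualified names.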
Elements by Functional Group
General Purpose Elements
These elements from the Dublin Core form the basis for PRISM’s descriptive metadata. Many descriptions will need only a few elements from this table.
Table 2: General Purpose Descriptive Elements
Provenance
These elements describe the supply chain for a resource to indicate what the source material for a resource was and through which organizations the resource has passed. PRISM uses the dc:source property to identify the original basis for the resource, the dc:publisher property to identify the primary provider of the information (such as a major wire service), and the prism:distributor property to identify other members of the distribution chain, if any.
Table 3: Elements for Provenance Information
Timestamps
There are several times that mark the major milestones in the
life of a news resource: The time the story is published, the time
it may be released (if not immediately), the time it is received by
a customer, and the time that the story expires (if any). Dates and
times should be represented using the W3C-defined profile of ISO
8601 [W3C-NOTE-datetime]. Table 4: Elements for Time and Date Information
Subject Description
These elements describe the subject matter of a resource.
Experience has shown that there are many different kinds of
subjects. People, places, things, events, … are all possible
subcategories of ‘subject’. Best practice is for
subject description elements to reference controlled vocabulary
terms such as the IPTC Subject Reference System. If that is not
possible, dc:subject can also contain a prose description of the
subject. Table 5: Elements for Describing the Subject of a Resource
Resource Relationships
Published content has a wide variety of relations to other content items. There are containment relations – such as an article containing a photo, story text and caption. There are version relations – such as a resource being a corrected version of another resource. There are alternative formats – such as a Word document also existing in HTML, XML and PDF. There are alternatives – such as an image that cannot be reused having alternatives that can. Many other types of relations exist. Many of the relations provided come from work undertaken by the Dublin Core Metadata Initiative and documented in the Relations Working Draft [DCMI-R].
Table 6: Elements to Convey Relations Between Resources
Rights and Permissions
The PRISM rights and permissions vocabulary is designed to facilitate reuse and clearance processes for parties with established business relationships by explicitly specifying the rights and/or restrictions connected with a resource. PRISM is NOT concerned with digital rights enforcement. PRISM does not specify policy or provide instructions to trusted viewers and repositories on how they should behave. PRISM also does not specify fee or payment details. Other efforts, such as XrML, are attempting to meet those needs, although there are no widely adopted solutions at this time. The design goals of rights and permissions are:
To be able to describe reuse rights in a precise and consistent manner.
To make simple cases, such as no rights or unrestricted use, simple to specify.
To provide the capability to indicate common types of uses or restrictions.
To allow for graceful evolution to future accepted standards for specifying rights.
It is important to note that rights and permissions metadata is usually intended for a particular receiver, unlike elements such as “title” which are expected to be almost invariant.
Table 7: Elements for Specifying Rights and Permissions Information
Note that in addition to the elements summarized in the table above, the PRISM Rights Language uses a small controlled vocabulary to provide well-known values for the prl:usage element. The values in it are:
Table 8: Predefined Usages
Controlled Vocabularies
Many elements in PRISM-approved or PRISM-extended namespaces take values that are intended to come from controlled vocabularies. Controlled vocabularies are lists of terms that are updated through a defined and managed procedure. More formally, the entries in a vocabulary are known as taxons, since there may be more than one term used for that entry in the vocabulary. For example, “Greece” in English and “Grèce” in French are two terms for the same taxon. The list of taxons may be a hierarchically structured subject classification system like the Dewey Decimal Classification, or it may be a simple list of names of companies, people, places, etc. The vocabulary may come from an external source, or be derived from internal sources such as a company's database systems. The PRISM specification provides a separate namespace of RDF Property Types for describing taxons in a controlled vocabulary. That namespace is the PRISM Controlled Vocabulary (PCV) namespace. Information about the taxon beyond that provided in the PCV namespace can be handled through the normal extension mechanism of new Property Types.
Table 9: Elements for Defining and Describing Controlled Vocabulary Entries
PRISM In-line Markup
Important information, such as dates and the names of people, places, and things, occurs in the text of an article. Some organizations prefer to mark that data in-line rather than create a large set of subject description elements. PRISM provides the following elements for inline markup. These can be mixed into DTDs that specify the allowed structure of the document.
Table 10: Elements for In-Line Markup of Named Entities
Note that some of these elements, pim:quote in particular, have several attributes that provide additional information.
Part II: Normative Specification
Framework
Requirement Wording Note
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC-2119]. The PRISM specification also uses the normative term, “STRONGLY ENCOURAGES,” which should be understood as a requirement equivalent to MUST in all but the most extraordinary circumstances. Capitalization is significant; lower-case uses of the key words are intended to be interpreted in their normal, informal, English language way.
Behavior of PRISM-compliant Software
The PRISM specification defines the format of XML content exchanged between systems. It constrains the behavior of those systems as little as possible. Discarding metadata is discouraged but not forbidden. A major cost occurs when metadata has to be recreated after it was discarded earlier in the production process. Therefore implementations MAY retain and retransmit any information that they do not know is actually wrong. Novel elements and attributes MAY be added to PRISM descriptions. PRISM-compliant software MUST be capable of detecting such novel elements and attributes. It MUST NOT throw an error when a novel element is encountered. The PRISM working group recommends, in keeping with the recommendation above, that implementations MAY retain the novel information and pass it along. Novel elements and attributes MUST NOT be added to PRISM namespaces and vocabularies or the Dublin Core namespace. One or more new XML namespaces MUST be defined for novel elements and attributes.
Identifying PRISM Content
The Internet Media Type (aka MIME type) [IETF-MIMETYPES] for PRISM descriptions is “application/prism+rdf+xml”. When PRISM descriptions are stored as XML files, the preferred filename extension is “.prism”. 
When neither of those two identification methods is appropriate, the content can be scanned for occurrences of the URI “http://prismstandard.org/namespaces/1.0/basic/” used as a namespace URI in an XML document. Such documents are considered to be PRISM content.
Namespace and Vocabulary Identifiers
Systems that implement this specification MUST recognize and
support at least the first four namespaces in the table below.
Systems offering inline markup MUST support the fifth. Systems
supporting the more expressive rights language MUST support the
sixth. Systems MAY use the namespace declarations below in order to
use familiar prefixes. Table 11: Namespaces Used In PRISM Descriptions
The PRISM specification also defines a number of controlled vocabularies. The base URIs for those vocabularies are: Table 12: Base URIs for PRISM Controlled Vocabularies
All PRISM-compliant systems MUST recognize the #notReusable entry in the PRISM Rights vocabulary and handle it appropriately. In addition to the PRISM-defined vocabularies, a number of other vocabularies and data formats are recommended by PRISM as current best practice. Those are:
Date-time
PRISM-compliant applications sending metadata to other systems are STRONGLY ENCOURAGED to use the W3C profile of ISO 8601 [W3C-DateTime] as the format of their date and time values. Implementers are advised, however, that this specification may be supplanted in the future by one which allows features such as ranges of times, or the use of the tz library’s method of specifying time zone offsets as strings composed of Continent/City. So implementations SHOULD be able to deal with other forms.
Locations
PRISM-compliant applications sending metadata to other systems are STRONGLY ENCOURAGED to use the codes from [ISO-3166] as the values for the <prism:location> and <prl:geography> elements. ISO has not yet defined a standard URI convention for those codes. In order to maximize interoperability, implementations MAY wish to use the following non-resolvable URLs:
http://prismstandard.org/vocabs/ISO-3166/XX
where XX is a 2-letter uppercase country code, and
http://prismstandard.org/vocabs/ISO-3166-2/XX-YYY
where XX is as above and YYY is a one- to three-character alphanumeric subregion code.
Industrial Sector
PRISM-compliant applications sending metadata to other systems MAY wish to use the industry sector codes from [NAICS] as the values for the <prism:industry> element and <pim:industry>’s href attribute.
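The date-time and location conventions above are easy to get subtly wrong when values are assembled by hand. The sketch below is one possible way to produce them in Python; the helper names (w3c_datetime, country_uri, subregion_uri) are hypothetical and not defined by PRISM. It assumes that the full date-plus-time form of the W3C profile is wanted and that country codes must already be uppercase.

```python
import re
from datetime import datetime, timezone

# Base of the non-resolvable vocabulary URLs suggested in the text above.
VOCAB_BASE = "http://prismstandard.org/vocabs"


def w3c_datetime(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mm:ssTZD.

    A naive datetime is rejected because the complete forms of the W3C
    profile carry a time zone designator.
    """
    if moment.tzinfo is None:
        raise ValueError("a time zone is needed to emit a time zone designator")
    return moment.isoformat(timespec="seconds")


def country_uri(country):
    """Build the ISO-3166 country URL for a 2-letter uppercase code."""
    if not re.fullmatch(r"[A-Z]{2}", country):
        raise ValueError(f"not a 2-letter uppercase country code: {country!r}")
    return f"{VOCAB_BASE}/ISO-3166/{country}"


def subregion_uri(country, subregion):
    """Build the ISO-3166-2 URL for a country code plus subregion code."""
    if not re.fullmatch(r"[A-Z]{2}", country):
        raise ValueError(f"not a 2-letter uppercase country code: {country!r}")
    if not re.fullmatch(r"[A-Za-z0-9]{1,3}", subregion):
        raise ValueError(f"not a 1-3 character alphanumeric code: {subregion!r}")
    return f"{VOCAB_BASE}/ISO-3166-2/{country}-{subregion}"


if __name__ == "__main__":
    print(w3c_datetime(datetime(2001, 4, 9, 14, 30, tzinfo=timezone.utc)))
    print(country_uri("GR"))
    print(subregion_uri("US", "PA"))
```

The helpers only check the shape of a code, not whether it is actually assigned in ISO 3166; a real system would validate against the published code lists.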
Identifiers
PRISM files use the rdf:about attribute on rdf:Description elements to specify the resource being described. The value of the rdf:about attribute MUST be a URI reference [RFC-2396]. The dc:identifier element MUST be used to contain any additional identifiers to be sent, or any identifiers that cannot be represented as a URI reference. For example, a resource can be identified by a URI and by an internal asset ID that an organization would use to access it in their database. PRISM-compliant applications are STRONGLY ENCOURAGED to maintain the unique identifier(s) provided for a resource. PRISM’s only policy on the assignment of identifiers is that the party assigning an identifier MUST NOT assign the same identifier to a different resource, using whatever definition of ‘different’ the assigning party deems appropriate. PRISM systems MUST regard two resources as being ‘the same’ if they have the same unique identifier. The party assigning the identifier is the sole arbiter of what they mean by ‘the same’. Note that this definition does not imply that two resources are different if their identifiers are different. Different identifiers MAY (and frequently will) be assigned to the same resource. PRISM does not require that all resources carry the same identifier through their entire lifecycle. However, if the publisher assigns a new identifier to non-reusable content obtained from an external party, the publisher SHOULD retain information on the origin and licensing of the resource so that someone later in its lifecycle can determine how to obtain the rights to reuse it.
Cardinality and Optionality
All PRISM descriptions MUST contain at least one identifier for the resource being described, expressed in the rdf:about attribute. Any number of additional identifiers MAY be expressed in dc:identifier elements. The identifier in the rdf:about attribute is the only mandatory field in a PRISM description. 
However, at least one other field MUST be specified in a description in order to have a meaningful model. All Dublin Core elements are optional, and may be repeated any number of times. Unless specifically noted otherwise, PRISM elements are also optional and may occur any number of times in a description.
Automatic Creation of Inverse Relations
PRISM includes elements for specifying relations between resources (e.g., Resource1 isVersionOf Resource2). Those relations have inverse relations that are also in the PRISM specification (e.g., Resource2 hasVersion Resource1). PRISM-compliant systems which receive one side of such a relation MAY infer the presence of the additional inverse relation. To be more specific, if the implementation tracks the origin of individual RDF statements and can segregate its database in order to undo the addition of such inferred inverses, it SHOULD infer the inverse and keep it segregated from the original input. If an implementation does not track individual statements and sources, it MAY infer the inverse relations but is cautioned about the possibility of data corruption.
PRISM Profile of the Resource Description Framework
The Resource Description Framework (RDF) has been standardized by the W3C to provide a general framework for metadata. As such, its capabilities exceed those required by PRISM. Therefore, this document specifies a ‘profile’ – a restricted subset – of RDF that all PRISM-compliant software MUST support. This profile excludes certain capabilities of RDF that are not needed in PRISM applications, thus simplifying the development of PRISM applications. Applications conforming to the PRISM specification MUST produce correct RDF documents that can be read by any RDF-compliant software. They MUST also produce documents that conform to the PRISM profile of RDF. PRISM-compliant software does not have to be capable of processing arbitrary RDF documents. 
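The inverse-relation inference described under Automatic Creation of Inverse Relations can be sketched as follows. This is an illustration only: statements are modelled as plain (subject, predicate, object) tuples, the function name infer_inverses is hypothetical, and the pairing table is an assumption read off the element names in the PRISM namespace (the text itself confirms only the isPartOf/hasPart and isVersionOf/hasVersion pairs). Inferred statements are returned in a separate list so that they stay segregated from the original input.

```python
# Assumed inverse pairs, derived from the PRISM element names.
PAIRS = [
    ("prism:hasPart", "prism:isPartOf"),
    ("prism:hasVersion", "prism:isVersionOf"),
    ("prism:hasFormat", "prism:isFormatOf"),
    ("prism:hasTranslation", "prism:isTranslationOf"),
    ("prism:hasCorrection", "prism:isCorrectionOf"),
    ("prism:hasAlternative", "prism:isAlternativeFor"),
    ("prism:isBasisFor", "prism:isBasedOn"),
    ("prism:references", "prism:isReferencedBy"),
    ("prism:requires", "prism:isRequiredBy"),
]

# Look-up table that works in both directions.
INVERSE = {}
for forward, backward in PAIRS:
    INVERSE[forward] = backward
    INVERSE[backward] = forward


def infer_inverses(statements):
    """Return inverse statements that are not already present in the input.

    The input list is never modified, so the inferred statements can be
    stored apart from the received ones and discarded later if needed.
    """
    known = set(statements)
    inferred = []
    for subject, predicate, obj in statements:
        inverse = INVERSE.get(predicate)
        if inverse is None:
            continue  # not a relation with a known inverse
        candidate = (obj, inverse, subject)
        if candidate not in known:
            known.add(candidate)
            inferred.append(candidate)
    return inferred


if __name__ == "__main__":
    received = [(
        "http://wanderlust.com/2000/08/Corfu.jpg",
        "prism:isPartOf",
        "http://wanderlust.com/2000/08/CorfuArticle.xml",
    )]
    for statement in infer_inverses(received):
        print(statement)
```

Returning a separate list, rather than appending to the input, is one simple way to keep the original statements distinguishable from inferred ones.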
Constraint 1: Top-level structure of Descriptions
The formal grammar for RDF [W3C-RDF] specifies:
[6.1] RDF ::= ['<rdf:RDF>'] obj* ['</rdf:RDF>']
For PRISM descriptions, the rdf:RDF wrapper element is required, and its child elements are restricted to being rdf:Description elements. The production that replaces productions 6.1 and 6.2 for PRISM systems is:
RDF ::= '<rdf:RDF' namespace_decls '>' description+ '</rdf:RDF>'
Constraint 2: rdf:aboutEachPrefix disallowed
PRISM descriptions MUST NOT use the rdf:aboutEachPrefix attribute. Production [6.8] of the RDF M&S specification thus becomes:
AboutEachAttr ::= ' aboutEach="' URI-reference '"'
Further Qualifications
No other overall restrictions in the allowed RDF syntax are specified in this section. However, implementers are advised to pay particular attention to the following points: Many elements, such as dc:subject, may take a string as a value, or may use a URI for identifying an element in a controlled vocabulary of subject description codes. The URI may be a simple reference, or may provide an inline description of the controlled vocabulary term. Implementations MUST be capable of handling all three of those cases reliably. Implementers must decide how their system will deal with unsupported descriptive elements. The PRISM specification does not preclude other descriptive elements, although their interoperation cannot be guaranteed. PRISM implementations MAY retain unknown descriptive elements and retransmit them. To aid automated processing of PRISM metadata, this specification defines a separate namespace for PRISM elements suitable for in-line markup. Thus, prism:organization is an RDF statement and pim:organization is used as in-line markup. The PRISM working group encourages implementers to keep the generated markup as simple as possible. As an example, if a work has multiple authors, RDF allows that situation to be encoded in two ways, which have slightly different meanings. The first way uses multiple dc:creator elements, each listing a separate author. 
The second way is to have a single dc:creator element, which then contains one of RDF’s collection constructs, such as rdf:Bag. That, in turn, would list the different authors. According to the RDF specification, the first is to be used when the authors acted as a collection of individuals in the creation of a work. The second is to be used when the authors acted as a committee. Experience has shown, however, that this distinction is too subtle for human catalogers to make reliably. The PRISM working group recommends using the first approach in most cases. Note that although a sequence of dc:creator elements in an RDF/XML file implicitly defines a sequence (in the XML world), RDF parsers have no obligation to preserve that ordering, unlike if an explicit rdf:Seq were given. PRISM implementers are advised that there are quality of implementation issues between different RDF processors. In general, implementers MAY prefer to build on top of an RDF parser that allows the original order of the statements to be reconstructed. That would allow the original order of the authors on a piece to be reconstructed, which might or might not convey additional meaning to the viewer of a styled version of the record. Similarly, XML software that can handle the almost-standardized xml:base attribute MAY be preferred.
Conventions for Property Values
To aid in the automatic processing of PRISM documents, PRISM utilizes some conventions in expressing values of RDF properties. The values are expressed in three ways. First, a resource or an entry in a controlled vocabulary MAY be referenced with the rdf:resource attribute. 
For example, a book can be identified by its ISBN number as follows:
<dc:identifier rdf:resource="urn:isbn:0-932592-00-7"/>
Second, human readable text MUST be represented as element content:
<dc:title>Juggling for the Complete Klutz</dc:title>
barring any circumstances where representing the text in element content would change the RDF as compared to representing it as an attribute value. That element content may contain XML markup, in which case the rdf:parseType attribute MUST be given and MUST have a value of 'Literal'. Third, controlled vocabulary entries may be specified in-line. For example:
<dc:subject>
  <pcv:Descriptor rdf:about="http://loc.gov/LC/QA-76">
    <pcv:vocabulary>Library of Congress Classification</pcv:vocabulary>
    <pcv:code>QA-76</pcv:code>
    <pcv:label>Mathematical software</pcv:label>
  </pcv:Descriptor>
</dc:subject>
XML DTDs cannot describe such a flexible content model, so no DTD is provided in this specification.
Convention 1: In-line controlled vocabulary term definitions preferred
PRISM descriptions make extensive use of values selected from controlled vocabularies. Conceptually, all that is needed is a reference to the vocabulary entry. But for practical considerations such as human readability, ease of use of full-text search tools, and performance, it is useful to be able to provide information about the controlled vocabulary entry, such as its human-readable label, directly in the description. The PRISM specification recommends that when this additional information is provided, it be provided in-line, instead of as an additional rdf:Description element. 
For example, a story whose subject is "Mining" as defined in the North American Industrial Classification System (NAICS), would have the following description:
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:pcv="http://prismstandard.org/namespaces/pcv/1.0/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="story.xml">
    <dc:subject>
      <pcv:Descriptor rdf:about="http://prismstandard.org/vocabs/NAICS/21">
        <pcv:vocab>North American Industrial Classification System</pcv:vocab>
        <pcv:code>21</pcv:code>
        <pcv:label>Mining</pcv:label>
      </pcv:Descriptor>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
as opposed to the form of the description below, where the controlled vocabulary term is described out-of-line instead of in-line.
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:pcv="http://prismstandard.org/namespaces/pcv/1.0/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="story.xml">
    <dc:subject rdf:resource="http://prismstandard.org/vocabs/NAICS/21"/>
  </rdf:Description>
  <pcv:Descriptor rdf:about="http://prismstandard.org/vocabs/NAICS/21">
    <pcv:vocab>North American Industrial Classification System</pcv:vocab>
    <pcv:code>21</pcv:code>
    <pcv:label>Mining</pcv:label>
  </pcv:Descriptor>
</rdf:RDF>
The two approaches are identical in terms of the RDF graph that is generated, but the former is believed easier to deal with using standard tools such as full-text indexing software or simple editing scripts. Note that we use the rdf:about attribute when providing the information on the controlled vocabulary term. This indicates that the real definition of the term is elsewhere, and we are merely providing some local descriptions of that term. 
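Since a property such as dc:subject may arrive as a plain string, as an rdf:resource reference, or as an in-line pcv:Descriptor, a receiving system needs to tell the three forms apart. The following minimal sketch shows one way to do that with Python's standard library. The function names (classify_value, subject_values) and the returned dictionary layout are hypothetical, and the sample document is made up for illustration using the namespaces from the examples above.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
DC = "{http://purl.org/dc/elements/1.1/}"
PCV = "{http://prismstandard.org/namespaces/pcv/1.0/}"

# Illustrative document showing one dc:subject in each of the three forms.
SAMPLE = """<rdf:RDF xmlns:pcv="http://prismstandard.org/namespaces/pcv/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="story.xml">
    <dc:subject>Mining</dc:subject>
    <dc:subject rdf:resource="http://prismstandard.org/vocabs/NAICS/21"/>
    <dc:subject>
      <pcv:Descriptor rdf:about="http://prismstandard.org/vocabs/NAICS/21">
        <pcv:code>21</pcv:code>
        <pcv:label>Mining</pcv:label>
      </pcv:Descriptor>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>"""


def classify_value(prop):
    """Describe a property element as a reference, in-line term, or literal."""
    reference = prop.get(RDF + "resource")
    if reference is not None:
        return {"form": "reference", "uri": reference, "label": None}
    descriptor = prop.find(PCV + "Descriptor")
    if descriptor is not None:
        return {
            "form": "inline",
            "uri": descriptor.get(RDF + "about"),
            "label": descriptor.findtext(PCV + "label"),
        }
    return {"form": "literal", "uri": None, "label": (prop.text or "").strip()}


def subject_values(rdf_xml):
    """Classify every dc:subject in the document, in document order."""
    root = ET.fromstring(rdf_xml)
    return [classify_value(subject) for subject in root.iter(DC + "subject")]


if __name__ == "__main__":
    for value in subject_values(SAMPLE):
        print(value)
```

The sketch handles only the in-line form; resolving an out-of-line pcv:Descriptor would need a second pass that looks up the referenced URI elsewhere in the document.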
Element Definitions
The PRISM specification recommends existing elements (in the case of the Dublin Core) or defines new elements to use for descriptive metadata. The detailed, normative definitions of those elements are provided in this section. All the element definitions appear in a uniform format. Each element definition begins with two fields – the Name and the Identifier of the element. The Name is a human-readable string that can be translated into different languages. Also, note that PRISM does NOT require that users be presented with the same labels. The Identifier is a protocol element. It is an XML element type and MUST be given as shown, modulo the normal allowance for variations in the namespace prefix used.
XML Entities Used In Definitions
Some of the content models used in this section use parameter
entity references. Those parameter
entities and their meaning are: Table 13: Entities Used as Abbreviations in Element Definitions
Dublin Core Namespace
The normative definitions of the Dublin Core elements can be found in [DCMI]. The following table adds comments to indicate the use of each Dublin Core element in a PRISM document. The use of some DC elements is encouraged, others are discouraged, and others are constrained. None of the Dublin Core elements is required to appear in a PRISM description, and all of them are repeatable any number of times.
dc:contributor
dc:coverage
dc:creator
dc:date
dc:description
dc:format
dc:identifier
dc:language
dc:publisher
dc:relation
dc:rights
dc:source
dc:subject
dc:title
dc:type
Basic PRISM Namespace
In addition to the Dublin Core elements, the PRISM specification defines additional namespaces. The ‘prism’ namespace contains elements suitable for a wide range of content publication, licensing, and reuse situations. Many of them are, in effect, extensions of the elements from the Dublin Core.
prism:category
prism:contentLength
prism:copyright
prism:creationTime
prism:distributor
prism:event
prism:expirationTime
prism:hasAlternative
prism:hasCorrection
prism:hasFormat
prism:hasPart
prism:hasTranslation
prism:hasVersion
prism:industry
prism:isAlternativeFor
prism:isBasedOn
prism:isBasisFor
prism:isCorrectionOf
prism:isFormatOf
prism:isPartOf
prism:isReferencedBy
prism:isTranslationOf
prism:isRequiredBy
prism:isVersionOf
prism:location
prism:modificationTime
prism:object
prism:organization
prism:person
prism:publicationTime
prism:receptionTime
prism:references
prism:releaseTime
prism:requires
prism:rightsAgent
PRISM Rights Language
The PRISM WG put only the most commonly needed rights elements into the PRISM namespace. For more involved treatment of rights and permissions in PRISM descriptions, elements from another namespace must be used. Because of the considerable activity around specifying rights and permissions, the PRISM working group could not recommend an existing standard to follow, as it was able to do with XML, RDF, and the Dublin Core. Therefore the working group has defined a small, simple, extensible language for expressing common rights and permissions. That language is known as the PRISM Rights Language (PRL). This section specifies that language. Note that implementations of PRISM MAY also implement PRL, but it is not mandatory. The PRISM Working Group expects PRL to be supplanted in time, once the activity around many different rights languages has settled down.
Processing Model
Collections of PRL statements are known as PRL expressions. The purpose of a PRL expression is to determine if a person or organization may or may not make use of a resource in a particular way. PRL expressions evaluate to a Boolean value that indicates if a particular use is allowed (if the expression evaluates to true) or not (if the expression evaluates to false). PRL evaluation is described in the RDF domain, not in the XML syntax domain. Note that PRL expressions do not describe the resource directly. They describe the real or virtual agreement under which the sender and receiver are operating.
PRL expressions consist of one or more clauses. A clause, in the RDF domain, is a resource that represents a real or virtual clause in the agreement between the sender and receiver. It is the RDF subject of statements that convey the intent of the clause. In PRISM descriptions, PRL expressions MUST appear only within the scope of a dc:rights element. The dc:rights statement contains the clause, or an rdf:Bag element if there are multiple clauses.
Each clause has a possibly empty set of usage statements and a possibly empty set of condition statements. If no usage is specified, the default usage is #use. (#use will be defined later in this section). If no conditions are specified, the default condition evaluates to ‘true’.
Conditions evaluate to Boolean true or false. Conditions are expressed in XML using elements from the PRL namespace, such as prl:geography and prl:industry. Two elements from the PRISM namespace, prism:releaseTime and prism:expirationTime, also express PRL conditions. To evaluate a condition, a comparison is made between the value(s) supplied in the XML element and the current state of the system or the intended use of content. The exact nature of the comparison depends on the condition being tested. True values mean that the condition applies. For example, the prism:releaseTime condition evaluates to ‘true’ if the current system date and time is greater than or equal to the date and time specified in that element’s content. The prl:industry condition evaluates to ‘true’ if the content is intended to be used in the specified industry. This specification does not define how the current state of the system and the intended use(s) of the content are made available for evaluating the conditions.
Usages do not evaluate to Booleans. Instead, they evaluate to a set of URI references (typically containing a single member). The URI references govern what the receiving system can do with the described resource. PRL defines only the four URI references shown in Section 6.1, Rights and Usage Vocabularies. Others can be defined, but this is expected to be an exceedingly rare form of extension.
To evaluate a clause, the logical AND of the conditions in the clause is computed. If that is false, the clause evaluates to the PRL usage #notApplicable. If the logical AND is true, the set of usages in the clause is evaluated and returned as the value of the clause.
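Clause evaluation as described above can be sketched in a few lines of Python. This is an illustrative sketch only: the specification defines evaluation in the RDF domain and does not prescribe a data structure, so representing a clause as a list of already-computed Boolean conditions plus a list of usage URI references is an assumption, and the function names are hypothetical.

```python
from datetime import datetime, timezone

NOT_APPLICABLE = "#notApplicable"
USE = "#use"


def release_time_condition(release_time, now):
    """prism:releaseTime applies once the current time reaches release_time."""
    return now >= release_time


def industry_condition(specified_industry, intended_industry):
    """prl:industry applies when the intended use is in the given industry."""
    return specified_industry == intended_industry


def evaluate_clause(conditions, usages):
    """Return the set of usage URI references a clause evaluates to.

    conditions: iterable of Booleans (assumed to be computed by the caller).
    usages: iterable of usage URI references given in the clause.
    """
    # all() of an empty sequence is True, matching the default condition.
    if not all(conditions):
        return {NOT_APPLICABLE}
    # An empty usage set falls back to the default usage, #use.
    return set(usages) or {USE}


# Example: content released on 2001-04-09, checked one day later.
released = datetime(2001, 4, 9, tzinfo=timezone.utc)
checked = datetime(2001, 4, 10, tzinfo=timezone.utc)
result = evaluate_clause([release_time_condition(released, checked)], [])
```

Here result is the default usage set, because the single condition holds and the clause names no usage of its own.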
To evaluate a PRL expression, all the clauses are evaluated and their results are merged according to the following rules, which MUST be applied in the following order:
1. U, the UNION of the sets of URI references, is computed. If multiple PRL expressions exist because the described resource had multiple dc:rights elements, those usages are also included in the computation of U.
2. If #none is a member of U, the expression evaluates to false.
3. Any special rules needed by extension elements are applied.
4. If #use is a member of U, the expression evaluates to true.
If the PRL expression evaluates to true, the resource may be used. If it evaluates to false, it may not be used. Typically, human intervention at runtime will be needed to convert other URI references, such as #permissionsUnknown, to a Boolean value. Note that because PRL defines both #none and #use, the NOT operator is not needed.
PRL can be extended by defining new conditions and usages in other namespaces. Conditions MUST be defined to return a Boolean where true means the condition applies to the current state of the system or intended use of the content. Also, the conditions MUST be side-effect-free. Usages MUST return a URI reference.
Another extension mechanism exists in PRL. The content model of the prl:usage element allows text content. When text content is given, implementations MUST convert it to a URI reference. This specification does not specify how that is to happen; however, a common means of doing so is expected to be showing the text to a user and asking them if the result should be #use or #none.
prl:geography
prl:industry
prl:usage
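The ordered merge rules of the Processing Model (union, then #none, then extension rules, then #use) can be illustrated with a short Python sketch. It is not a normative implementation: the function name, the representation of clause results as sets of URI references, and the shape of the extension_rules callbacks are all assumptions made for illustration. Returning None for the undetermined case reflects the statement that human intervention is typically needed for values such as #permissionsUnknown.

```python
def evaluate_expression(clause_results, extension_rules=()):
    """Merge clause results into True, False, or None (undetermined).

    clause_results: iterable of sets of usage URI references, one per clause,
        including clauses from every dc:rights element of the resource.
    extension_rules: hypothetical callbacks taking the merged set and
        returning True, False, or None when they have nothing to say.
    """
    # Rule 1: compute U, the union of all usage sets.
    u = set().union(*clause_results)
    # Rule 2: #none forbids use, regardless of anything else in U.
    if "#none" in u:
        return False
    # Rule 3: apply any special rules defined by extension elements.
    for rule in extension_rules:
        verdict = rule(u)
        if verdict is not None:
            return verdict
    # Rule 4: #use permits use.
    if "#use" in u:
        return True
    # Otherwise undetermined; a person would normally decide at runtime.
    return None
```

Because rule 2 is tested before rule 4, a single clause evaluating to #none overrides any number of clauses evaluating to #use.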
PRISM Inline Markup Namespace
Metadata is typically considered as out-of-line information. Fields such as Author, Title, and Subject are stereotypical examples of information that is descriptive of the whole of a resource and is frequently held separately from it. However, the publisher members of the PRISM working group consistently identified a need for inline markup of organizations, locations, product names, personal names, quotations, etc. Such inline metadata was needed for a number of applications. Therefore, the PRISM specification defines a namespace of XML elements and attributes for inline metadata.
Developers of XML specifications for the publishing industry can use the following DTD fragment to incorporate PRISM's in-line markup elements into their DTDs. The fragment assumes that the basic textual content markup is described in another parameter entity, %content.mix;.

    <!-- href attribute contains an authority file reference -->
    <!ENTITY % inlineAttrs " href CDATA #IMPLIED">
    <!ELEMENT pim:location (%content.mix; )>
    <!ELEMENT pim:objectTitle (%content.mix; )>
    <!ELEMENT pim:organization (%content.mix; )>
    <!ELEMENT pim:person (%content.mix; )>
    <!ELEMENT pim:quote (%content.mix; )>
    <!ATTLIST pim:person %inlineAttrs; >
    <!ATTLIST pim:location %inlineAttrs; >
    <!ATTLIST pim:objectTitle %inlineAttrs; >
    <!ATTLIST pim:organization %inlineAttrs; >
    <!ATTLIST pim:quote
        speakerRef CDATA #IMPLIED
        placeRef   CDATA #IMPLIED
        occasion   CDATA #IMPLIED
        date       CDATA #IMPLIED >

pim:location
pim:objectTitle
pim:organization
pim:person
pim:quote
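As an illustration of how an application might consume the inline markup elements listed above, the following Python sketch collects every pim element from a marked-up paragraph. It is a sketch under stated assumptions: the namespace URI bound to the pim prefix is a placeholder (this section does not give the real one), the sample paragraph and its href value are invented, and the function name inline_entities is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, assumed for illustration only.
PIM = "http://example.org/pim-placeholder/"

# Invented sample content using three of the inline elements.
SAMPLE = (
    '<p xmlns:pim="%s">'
    '<pim:person href="http://example.org/people/jdoe">Jane Doe</pim:person>'
    " of <pim:organization>Example Corp</pim:organization>"
    " spoke in <pim:location>Corfu</pim:location>."
    "</p>" % PIM
)


def inline_entities(xml_text):
    """Return (element name, text, href) for each pim element, in order."""
    root = ET.fromstring(xml_text)
    prefix = "{%s}" % PIM
    found = []
    for el in root.iter():
        if el.tag.startswith(prefix):
            kind = el.tag[len(prefix):]
            # itertext() gathers text even if the element has child markup.
            found.append((kind, "".join(el.itertext()), el.get("href")))
    return found


entities = inline_entities(SAMPLE)
```

The href attribute is optional in the DTD fragment, so the sketch returns None for elements that carry no authority file reference.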
PRISM Controlled Vocabulary Namespace
The PRISM Controlled Vocabulary provides a mechanism for describing and conveying all or a portion of a controlled vocabulary or authority file. This may be used to define entire new taxonomies, or it may be used to optimize the final speed of the system by caching useful information from externally-held vocabularies.
pcv:broaderTerm
pcv:code
pcv:definition
pcv:Descriptor
pcv:label
pcv:narrowerTerm
pcv:relatedTerm
pcv:synonym
pcv:vocabulary
Controlled Vocabularies
The specification to this point has focused on the elements and attributes that may be used in a PRISM metadata document. Elements, in effect, define the syntax of the document. To convey the meaning of a document, the values that a given element may take must also be defined. This section lists the controlled vocabularies that comprise the set of legal values for certain PRISM elements. Other elements use controlled vocabularies created and maintained by third parties (such as the ISO 3166 codes for country names). Still other elements will require some domain-specific controlled vocabulary (e.g., the North American Industry Classification System).
Media types, such as text/html or image/jpeg, provide enough information for software to render data. But activities like discovery and re-purposing demand more specific information about the role of a resource. The PRISM Specification defines two controlled vocabularies for specifying different aspects of the nature of a resource: the Resource Type and the Resource Category. It also defines a one-element vocabulary for very basic rights operations. PRL also defines a small controlled vocabulary of usages for content.
Rights and Usage Vocabularies
Table 14: Predefined Resource Usages in PRISM Rights Language
Table 15: Predefined Resource Usages in PRISM
Resource Type Vocabulary (presentation style)
The Resource Type defines the way that a resource presents information. The Resource Type captures different information than the format of a resource, as specified using MIME types. For example, a JPEG could be a photo, line drawing, or chart. The rendering software does not care, but potential users of the content do. The Resource Type is also not specific to its intellectual content (e.g. election results vs. death rates can both be rendered as JPEG charts, but not as photographs). The Resource Type values form a controlled vocabulary for the dc:type element. The URI for the PRISM resource type vocabulary is: http://prismstandard.org/vocabularies/1.0/resourcetype.xml
The PRISM resource type vocabulary is largely drawn from the print medium. Presentations that are idiomatic to film, audio, animation, and other media are only thinly represented. Organizations interested in describing items in such media may wish to consult the Art and Architecture Thesaurus [AAT].
Table 16: Controlled Vocabulary of Presentation Styles
Resource Category Vocabulary (intellectual genre)
The Resource Category describes the genre, or the stereotypical form of the intellectual content of the resource. Sample genres include obituaries, biographies, and movie reviews. The Resource Category values form a controlled vocabulary for the prism:category element, defined by the PRISM specification. The URI for the PRISM Resource Category vocabulary is: http://prismstandard.org/vocabularies/1.0/category.xml
Some genres, such as maps or indices, strongly associate the nature of the intellectual content and the style of presentation. Those are only listed in Table 16: Controlled Vocabulary of Presentation Styles.
Table 17: Categories (intellectual genre)
Appendix A: Bibliography
Part 1: Normative References
[AAT] Getty Art and Architecture Thesaurus. http://shiva.pub.getty.edu/aat_browser/
[DCMI] Dublin Core Metadata Element Set, Version 1.1: Reference Description. http://purl.org/dc/documents/rec-dces-19990702.htm
[DCMI-R] Relation Element Working Draft; Dublin Core Metadata Initiative; 1997-12-19.
[Dictionary.com] http://dictionary.com
[IETF-MIMETYPES] Internet Assigned Numbers Authority (IANA); Internet Media Types.
[IETF-XML-Media] M. Murata, S. St.Laurent, D. Kohn; XML Media Types; Jan. 2001.
[IPTC-NEWSML] International Press Telecommunications Council, NewsML Specification & Documents; http://www.iptc.org/site/NewsML/NewsMLSpec.htm
[IPTC-NITF] International Press Telecommunications Council, News Industry Text Format.
[ISO-639] ISO 639 - Codes for the representation of names of languages.
[ISO-3166] ISO 3166 - Codes for the representation of names of countries and their subdivisions.
[NAICS] North American Industry Classification System; 1997. http://www.census.gov/epcd/www/naics.html
[RFC-3066] H. Alvestrand; Tags for the Identification of Languages; January 2001. http://www.ietf.org/rfc/rfc3066.txt
[IETF-MediaTypes] N. Freed & N. Borenstein; Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types; November 1996. http://www.ietf.org/rfc/rfc2046.txt
[RFC-2119] S. Bradner; Key words for use in RFCs to Indicate Requirement Levels. http://www.ietf.org/rfc/rfc2119.txt
[RFC-2396] Uniform Resource Identifiers (URI): Generic Syntax, Internet RFC 2396. http://www.ietf.org/rfc/rfc2396.txt
[TGN] Getty Thesaurus of Geographic Names. http://shiva.pub.getty.edu/tgn_browser/
[W3C-DateTime] Misha Wolf, Charles Wicksteed; Date and Time Formats; W3C Note. http://www.w3.org/TR/NOTE-datetime-970915.html
[W3C-RDF] Ora Lassila, Ralph R Swick; Resource Description Framework (RDF) Model and Syntax Specification. http://www.w3.org/TR/REC-rdf-syntax
[W3C-XML] Tim Bray, Jean Paoli, C. M. Sperberg-McQueen (eds.); Extensible Markup Language (XML). http://www.w3.org/TR/REC-xml
[W3C-XML-BASE] Jonathan Marsh (ed.); XML Base. http://www.w3.org/TR/xmlbase/
[W3C-XML-NS] Tim Bray, Dave Hollander, Andrew Layman (eds.); Namespaces in XML. http://www.w3.org/TR/REC-xml-names
Part 2: Non-Normative References
[ICE] The Information and Content Exchange (ICE) Protocol.
[ISO-8601] ISO (International Organization for Standardization), ISO 8601:1988 (E) Data elements and interchange formats - Information interchange - Representation of dates and times, 1998. http://www.iso.ch/cate/d15903.html
[ISO-13250] ISO/IEC 13250 Topic Maps: Information Technology -- Document Description and Markup Languages.
[TZ-LIB] Time Zone Library; ftp://elsie.nci.nih.gov/pub/
[W3C-RDFS] Dan Brickley, R.V. Guha (eds.); Resource Description Framework (RDF) Schema Specification 1.0; W3C Candidate Recommendation, 27 March 2000. http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[W3C-SMIL] Synchronized Multimedia Integration Language (SMIL) 1.0 Specification. http://www.w3.org/TR/Rec-SMIL
[XrML] ContentGuard, Inc.; Extensible Rights Markup Language. http://www.xrml.org/
[XTM] XML Topic Maps (XTM) 1.0: TopicMaps.Org Specification; TopicMaps.Org XTM Authoring Group; 3 Mar 2001. http://www.topicmaps.org/xtm/1.0/
Note that all the identifiers in this extract from the exportable database are relative URIs. This implies that an xml:base attribute was set earlier in the file so that the URIs do not change depending on the systems containing the file.
This is a subclass in the RDF Schema [W3C-RDFS] sense of the term. This document does not cite the RDF Schema document in a normative way, since that document is not yet a full W3C Recommendation. However, once a full Recommendation is created, it is expected to define the subClass predicate, so we go ahead and use that term in this section. For more on the RDF Schema relations of various PRISM terms, see Error! Reference source not found.
Implementers and users are advised not to use the &copy; character entity to put copyright symbols ‘©’ into copyright statements. Many XML parsers do not have that character entity predefined. Implementations should use the numeric character reference "&#169;" instead.
For details on the evaluation of the PRL rights expressions, see section 15.4 PRISM Rights Language.
Sharp-eyed readers familiar with RDF may have noticed that the RDF subject of the releaseTime and expirationTime elements is not the Corfu photo, but an anonymous node. That is because those elements do not directly describe the photo. Instead, their interpretation is that the agreement governing the use of the photo imposes such a condition. This interpretation is also used in the geography, industrySector, and usage elements shown in the next example.
That restriction is established by the use of the #none value in the first <prl:usage> element. Note that the new XML Base mechanism was used to abbreviate the full URI of #none. Not all RDF parsers will support the new XML Base standard, so it is safer not to use it. However, it makes the URIs and examples shorter, so we use it to simplify the exposition.
Either the 1.0 version of the spec, or the subsequent cookbook, will contain a non-normative appendix with XSLT stylesheets for converting vocabularies using the PCV elements into ones that follow the XTM Topic Maps Specification. The current XTM spec does not comply with the 1.0 version of the RDF Syntax, but there is an obvious and simple mapping between the two syntaxes.
Note that URI references include the forms commonly known as “relative URLs”, which allow considerable syntactic freedom. Therefore, almost all identifiers can fulfill the requirement to be a URI reference. Resolving such identifiers, of course, may require special handling.
Dublin Core implementations based on relational databases typically find this condition to be surprising. Implementers are reminded that PRISM specifies a file format, and does not constrain what implementations do with that data.
Early drafts of this specification assumed that people would not have ready access to RDF-parsing software, and attempted to reduce the complexity of the syntax generated. Since this project was begun, a number of freeware and commercial RDF parsers have become available, so we no longer make simplifications for that purpose.
Actually, that practice is recommended, though not mandated.
Much of the resilience and extensibility of the Domain Name System (DNS) has been attributed to its simple rule that if intermediate systems don’t understand a record, they just pass it on through. That rule lets up-to-date endpoints communicate without having all intermediate points updated.
A validation tool based on XML Schemas has been developed. It will be available online from the prismstandard.org website.