[Mirrored from: http://www.loc.gov/marc/dcqualif.html]

Dublin Core Qualifiers/Substructure : a proposal

Rebecca Guenther

Last updated: 15 April 1997


The following proposes a core set of qualifiers for the Dublin Core element set. It attempts to find a middle ground between the "minimalists" (those wanting a minimum of DC qualifiers and substructure if any) and the "structuralists" (those wanting more precision in refinement of the elements for searching).

The document should be considered a "strawman" proposal to engender discussion about Dublin Core qualifiers/substructure. Participants at DC4 touched on some of these issues, but not in great detail. The concept of a registry was also discussed, which would need to be established and would maintain any qualifier list. In addition, the document provides a starting point in scoping the problem and could provide guidance for people who need to implement immediately. It could serve as a framework for those considering the implications of using Dublin Core style records in various syntaxes.

This document deals only with the qualifiers "scheme" and "type". A scheme qualifier is used to interpret the value in the content and is generally based on external standards. The type qualifier refines the definition of the data element itself.

Two principles were agreed upon at the DC4 meeting in Canberra relating to the type qualifier:

If a type qualifier does not meet these principles, then an extensibility mechanism may be used (i.e., indicate the extensible element set from which the qualifier came). In this case the metadata would not be regarded as falling within the Dublin Core standard.

Following is a list of qualifiers ("core qualifiers") which may provide an intermediate approach between the minimalist approach of using no or few qualifiers and the more complex approach in Dublin Core Qualifiers by Jon Knight and Martin Hamilton. The latter document has been heavily used, but the number of qualifiers is considerably reduced here. In many cases I have given rationales for why some were left off. Note that I have used one of the two recommended HTML syntaxes (the "dot" approach), but the types could be represented by the alternative syntax discussed at DC4.

Note that the Knight/Hamilton document defined a scheme=USMARC. I have left that off entirely, because it is inappropriate. USMARC is a record syntax, a communication mechanism, and is not a standard for content. One uses other standards for the content of MARC records. A mapping between the elements with and without qualifiers provides the field/subfield tag names, so they are not necessary as a scheme subelement.

Definitions are taken from the Weibel/Kunze/Lagoze RFC Dublin Core Metadata Element Set: Reference Description. The default for "scheme" is nothing (it is not controlled by any standard) and the default for "type" is indicated below. If the default is used, no qualifier is given.
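
A minimal sketch of how the "dot" approach and the default rule might fit together, assuming that a qualified name is written as prefix, element, and type joined by dots. The "DC" prefix, the class and method names, and the exact layout of the META tag are assumptions made for illustration; this proposal does not spell them out. The scheme is kept in the model only, because the text does not show how a scheme is written in HTML.

```python
from dataclasses import dataclass
from html import escape
from typing import Optional


@dataclass(frozen=True)
class QualifiedElement:
    """One Dublin Core value with optional type and scheme qualifiers."""

    element: str                  # e.g. "Title", "Publisher"
    value: str
    type_: Optional[str] = None   # None means the default type: no qualifier is given
    scheme: Optional[str] = None  # None means the value is not controlled by any standard

    def dotted_name(self, prefix: str = "DC") -> str:
        # The "DC" prefix is an assumption, not something the proposal states.
        parts = [prefix, self.element]
        if self.type_ is not None:
            parts.append(self.type_)
        return ".".join(parts)

    def to_meta(self) -> str:
        # Hypothetical HTML rendering; the scheme is deliberately not emitted.
        name = escape(self.dotted_name())
        content = escape(self.value)
        return f'<meta name="{name}" content="{content}">'


if __name__ == "__main__":
    print(QualifiedElement("Title", "Dublin Core Qualifiers").to_meta())
    print(QualifiedElement("Publisher", "pub@example.org", type_="email").to_meta())
```

With the default type the name is simply the element ("DC.Title"); a type such as email extends the name to "DC.Publisher.email", matching the Publisher.email form used in the list below.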

Proposed Dublin Core Qualifiers/Subelements

1. Title
The name given to the resource by the CREATOR or PUBLISHER.
SCHEME: not necessary
Questions: Is alternative title needed since all titles would be searched the same way? Is it necessary to give guidance as to what to consider "main title" or "alternative title"?

2. Author or Creator
The person(s) or organization(s) primarily responsible for the intellectual content of the resource. For example, authors in the case of written documents, artists, photographers, or illustrators in the case of visual resources.

Rationale: All qualifiers listed in the Knight/Hamilton document extend the element rather than refine it (e.g., postal, phone, fax, affiliation, etc.). Only email is included here because it has been used frequently in current metadata projects. Email may be considered an alternative way of specifying the author. Affiliation and postal address, however, are supplementary information, not types. If other qualifiers are needed, they should be used as local extensions.

3. Subject and Keywords
The topic of the resource, or keywords or phrases that describe the subject or content of the resource. The intent of the specification of this element is to promote the use of controlled vocabularies and keywords. This element might well include scheme-qualified classification data (for example, Library of Congress Classification Numbers or Dewey Decimal numbers) or scheme-qualified controlled vocabularies (such as Medical Subject Headings or Art and Architecture Thesaurus descriptors) as well.

The above are the most frequently used. The scheme could point to the controlled list maintained by the Library of Congress in the USMARC Code List for Relators, Sources, Description Conventions, which includes many others (Part III: Classification Sources and Part IV: Subject/Index Term Sources; these documents are currently under revision for a new edition).

TYPE: not necessary
Question: Is the LC Name Authority File needed as a scheme for names used as subjects?

4. Description
A textual description of the content of the resource, including abstracts in the case of document-like objects or content descriptions in the case of visual resources. Future metadata collections might well include computational content description (spectral analysis of a visual resource, for example) that may not be embeddable in current network systems. In such a case this field might contain a link to such a description rather than the description itself.

TYPE: not necessary (no need to distinguish "free text" from "abstract provided by author"?)

5. Publisher
The entity responsible for making the resource available in its present form, such as a publisher, a university department, or a corporate entity. The intent of specifying this field is to identify the entity that provides access to the resource.
SCHEME: not necessary
TYPE: Publisher.email
As with Author, the qualifiers listed in the Knight/Hamilton document extend rather than refine this element. Email has been included because of a demonstrated need. Use local qualifiers if others are needed.

6. Other Contributor
Person(s) or organization(s) in addition to those specified in the CREATOR element who have made significant intellectual contributions to the resource but whose contribution is secondary to the individuals or entities specified in the CREATOR element (for example, editors, transcribers, illustrators, and convenors).

Same rationale for not including additional qualifiers as for Author.
At DC4 we decided role was not needed. It has thus not been included here. Is there a need to specify the contributor's relationship to the work? (At the Library of Congress, we have found that names are all searched in the same index and there is no need to further specify role in the work. We ceased to indicate this in bibliographic records a long time ago.)

7. Date
The date the resource was made available in its present form. The recommended best practice is an 8-digit number in the form YYYYMMDD as defined by ANSI X3.30-1985. In this scheme, the date element for the day this is written would be 19961203, or December 3, 1996. Many other schemes are possible, but if used, they should be identified in an unambiguous manner.
NOTE that the DATE Subgroup of DC4 is working on changing this definition so that it is general enough to allow for qualifiers that refine rather than extend the definition. (Possible definition suggested: A point in time related to the document.)


I question whether the last two are necessary in the "core" list of qualifiers. Also, without a new definition it is hard to evaluate against the principle that it refines not extends. Note that this is preliminary pending the recommendations of the DATE working group.
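
The YYYYMMDD best practice in the definition above can be checked mechanically. The following is a sketch only, using Python's standard library; the function name parse_dc_date is hypothetical, and the strict 8-digit check is a reading of the "8-digit number" wording rather than something ANSI X3.30-1985 is quoted as requiring here.

```python
from datetime import date, datetime


def parse_dc_date(text: str) -> date:
    """Parse an 8-digit YYYYMMDD string into a calendar date.

    Raises ValueError if the string is not exactly 8 ASCII digits
    or does not name a real calendar day.
    """
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        raise ValueError(f"expected 8 digits in YYYYMMDD form, got {text!r}")
    # strptime rejects impossible dates such as February 30.
    return datetime.strptime(text, "%Y%m%d").date()


if __name__ == "__main__":
    # The example value given in the element definition.
    print(parse_dc_date("19961203").strftime("%B %d, %Y"))
```

The explicit length check is there because strptime alone tolerates some shorter inputs, which would defeat the purpose of a fixed-width form.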

8. Resource Type
The category of the resource, such as home page, novel, poem, working paper, technical report, essay, dictionary. It is expected that RESOURCE TYPE will be chosen from an enumerated list of types.
SCHEME: not necessary (but an enumerated list of types is)
TYPE: Type.Audience? (is this necessary?)
A controlled list of values is needed. Jon Knight's Dublin Core Standard Resource Types is a start, but in many cases categories are not mutually exclusive. In addition, in some cases it combines notions of genre, publishing patterns (preprint, unpublished), quality (refereed vs. non-refereed), and relations (InBook, InCollection).

9. Format
The data representation of the resource, such as text/html, ASCII, Postscript file, executable application, or JPEG image. The intent of specifying this element is to provide information necessary to allow people or machines to make decisions about the usability of the encoded data (what hardware and software might be required to display or execute it, for example). As with RESOURCE TYPE, FORMAT will be assigned from enumerated lists such as registered Internet Media Types (MIME types). In principle, formats can include physical media such as books, serials, or other non-electronic media.

TYPE: Not necessary

10. Resource Identifier
String or number used to uniquely identify the resource. Examples for networked resources include URLs and URNs (when implemented). Other globally unique identifiers, such as International Standard Book Numbers (ISBN) or other formal names, would also be candidates for this element.

TYPE: Not necessary
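
To illustrate how a scheme helps interpret an identifier value, here is a rough sketch that guesses a scheme from the shape of the string. The labels "URL", "URN", and "ISBN", the function names, and the short set of URL protocols are assumptions for illustration only and are not a list taken from this proposal. The ISBN test uses the standard 10-digit check: each digit is weighted 10 down to 1 and the weighted sum must be divisible by 11, with a final X standing for 10.

```python
from typing import Optional
from urllib.parse import urlparse


def isbn10_is_valid(text: str) -> bool:
    """Return True if text is a 10-character ISBN with a correct check digit."""
    chars = text.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, ch in enumerate(chars):
        if ch == "X" and position == 9:
            digit = 10  # X is allowed only as the check digit
        elif ch.isascii() and ch.isdigit():
            digit = int(ch)
        else:
            return False
        total += (10 - position) * digit
    return total % 11 == 0


def guess_identifier_scheme(value: str) -> Optional[str]:
    """Guess a scheme label for an identifier, or None if nothing fits."""
    stripped = value.strip()
    if stripped.lower().startswith("urn:"):
        return "URN"
    # Hypothetical, deliberately short list of URL protocols.
    if urlparse(stripped).scheme in {"http", "ftp", "gopher"}:
        return "URL"
    if isbn10_is_valid(stripped):
        return "ISBN"
    return None


if __name__ == "__main__":
    for sample in ("http://www.loc.gov/marc/dcqualif.html", "0-306-40615-2"):
        print(sample, "->", guess_identifier_scheme(sample))
```

Guessing is only a fallback; a record that states its scheme explicitly never needs it, which is the argument for a scheme qualifier on this element.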

11. Source
The work, either print or electronic, from which this resource is derived, if applicable. For example, an html encoding of a Shakespearean sonnet might identify the paper version of the sonnet from which the electronic version was transcribed.



12. Language
Language(s) of the intellectual content of the resource. Where practical, the content of this field should coincide with the NISO Z39.53 three-character codes for written languages.

(USMARC is not necessary, since it remains in sync with Z39.53)
TYPE: Not necessary
Note that the current ISO 639 (to become ISO 639-1 when the ISO 639-2 3-character code is approved) covers only about 140 languages as compared with the draft standard ISO 639-2, which covers about 400 languages. (For instance, ISO 639-1 does not distinguish ancient languages from modern languages.) Many agencies will need the larger, more extensive list that ISO 639-2 provides. Information about the balloting of ISO 639-2 will be available shortly. The new standard is closer to Z39.53 than to ISO 639-1 (and Z39.53 will be revised to be in sync with ISO 639-2 once approved).
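
As a sketch of what coinciding with three-character codes could mean in practice, the following normalizes a Language value to a three-letter form. The function name and the four-entry lookup table are illustrative assumptions; the table is a tiny sample and is in no way a substitute for the published code lists. The check is on shape only: it does not confirm that a three-letter string is actually an assigned code.

```python
import re

# Illustrative sample only: a few two-letter codes and the three-letter
# codes commonly used for the same languages in library practice.
TWO_TO_THREE = {"en": "eng", "fr": "fre", "de": "ger", "es": "spa"}


def normalize_language(code: str) -> str:
    """Return a lowercase three-letter language code.

    Accepts a value that is already three letters, or a two-letter code
    present in the sample table. Raises ValueError otherwise.
    """
    candidate = code.strip().lower()
    if re.fullmatch(r"[a-z]{3}", candidate):
        return candidate  # shape check only, not a lookup in a standard
    if candidate in TWO_TO_THREE:
        return TWO_TO_THREE[candidate]
    raise ValueError(f"cannot map {code!r} to a three-letter language code")


if __name__ == "__main__":
    print(normalize_language("EN"), normalize_language("fre"))
```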

13. Relation
Relationship to other resources. The intent of specifying this element is to provide a means to express relationships among resources that have formal relationships to others, but exist as discrete resources themselves. For example, images in a document, chapters in a book, or items in a collection. A formal specification of RELATION is currently under development. Users and developers should understand that use of this element should be currently considered experimental.
NOTE that a subgroup was formed at DC4 to deal with Relation.

Rationale: Many others are mentioned in the Knight/Hamilton document. The rationale for not including them is as follows:
IsDerivedFrom: same as Source element
HasBibliographicInfoIn: using this element is not the mechanism for pointing to bibliographic information; more likely is the Warwick Framework (metadata packages in a container)
IsRevisionHistoryFor: extends, doesn't refine element
IsCriticalReviewFor: can use Abstract for this
IsOverviewOf: can use Abstract for this
IsContentRatingFor: use Warwick Framework/metadata packages
IsTermsAndConditionsFor: use Warwick Framework/metadata packages
IsDataFor: ?? can't see the usefulness of this
[Using the KIS principle: Keep It Simple]

14. Coverage
The spatial locations and temporal durations characteristic of the resource. Formal specification of COVERAGE is currently under development. Users and developers should understand that use of this element should be currently considered experimental.
Note that this is preliminary pending the recommendations of the Coverage working group.
SCHEME: To be determined by Coverage Working Group
TYPE: Highly recommend that a type always be specified (but if not, read as free text)

More specific types to be determined by Coverage Working Group.

15. Rights Management
The content of this element is intended to be a link (a URL or other suitable URI as appropriate) to a copyright notice, a rights-management statement, or perhaps a server that would provide such information in a dynamic way. The intent of specifying this field is to allow providers a means to associate terms and conditions or copyright statements with a resource or collection of resources. No assumptions should be made by users if such a field is empty or not present.

TYPE: Not necessary