DDI Tag Library
This Tag Library describes the five main sections of the
Document Type Definition (DTD) for social science data
documentation developed by the Data Documentation Initiative (DDI)
Committee. These documents present English language
descriptions of XML (eXtensible Markup Language) DTD
elements and attributes and instructions
for their use as of Version 1 (Final) by Jerome McDonough, UC-Berkeley Library.
The following are the highest level components of any
document that will be marked up in compliance with this
DTD.
A graphical representation
of the document hierarchy is also available.
- Document Description
Items describing the marked-up document itself as well as its
source documents (citation, title, etc.)
Element -- optional, not repeatable.
- Study Description
Items describing the overall data collection (title, citation,
methodology, study scope, data access, etc.)
Element -- required, repeatable.
- Data Files Description
Items relating to the format, size, and structure
of the data files
Element -- optional, repeatable.
- Variables Description
Items relating to variables in the data collection
Element -- optional, repeatable.
- Other Study-Related Materials
Other study-related material not included in the
other sections (bibliography, separate questionnaire file, etc.)
Element -- optional, repeatable.
Document Description
(Codebook Header)
Document
|---DOCUMENT DESCRIPTION
|---Study Description
|---Data Files Description
|---Variables Description
|---Other Study-Related Materials
Role of the Document Description
The Document Description consists of bibliographic information
describing the DDI-compliant document itself as a whole. This Document
Description can be considered the wrapper or header whose elements
uniquely describe the full contents of the compliant DDI file. Since
the Document Description section is used to identify the DDI-compliant
file within an electronic resource discovery environment, this section
should be as complete as possible. The author in the Document
Description should be the individual(s) or organization(s) directly
responsible for the intellectual content of the DDI version, as
distinct from the person(s) or organization(s) responsible for the
intellectual content of the earlier paper or electronic edition from
which the DDI edition may have been derived. The producer in the
Document Description should be the agency or person that prepared the
marked-up document. Note that the Document
Description section contains a Documentation Source subsection (1.4)
consisting of information about the source of the DDI-compliant file--
that is, the hardcopy or electronic codebook that served as the source
for the marked-up codebook. These sections allow the creator of the
DDI file to produce version, responsibility, and other descriptions
relating to both the creation of that DDI file as a separate and
reformatted version of source materials (either
print or electronic) and the original source materials themselves.
To comply with the Dublin Core, it is recommended that the
following elements in the Document Description be used when the
appropriate information is available:
DUBLIN CORE    DDI
-----------    --------------------------------------------------
Title          1.1.1.1 titl (Title of Marked-up Document)
Creator        1.1.2.1 AuthEnty (Authoring Entity)
Publisher      1.1.3.1 producer (Producer)
               [NOTE: The Dublin Core specifies that the
               publisher should be "the entity responsible for
               making the resource available *in its present
               form*" (emphasis added). For a DDI codebook, the
               publisher should be the entity responsible for
               making the *electronic* DDI version available.]
Contributor    1.1.2.2 othId (Other Ident. & Acknowl.)
Date           1.1.3.3 prodDate (Date of Production)
               [NOTE: The DC Date element should refer to the
               date the electronic resource (e.g., the DDI
               version of the codebook) was created, not any
               preceding paper version.]
Identifier     Suggested DC Identifier: URL for DDI Codebook,
               if applicable. Alternatively, use the IDNo
               element (1.1.1.5) within the Document Description
               citation element.
Relation       Partially maps to 1.4 docSrc (Documentation
               Source). No mapping currently exists for the
               relation type component.
Rights         1.1.3.2 copyright (Copyright)
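The mapping above can be illustrated with a single Document Description that fills in each Dublin Core-related element. This is a sketch only: the element names, attributes, and nesting follow the entries in this Tag Library, the values are recombined from examples elsewhere in the library, and the ID number is a placeholder. It does not describe an actual codebook, and it has not been validated against the DTD.

```xml
<!-- Illustrative sketch; values are placeholders or borrowed from other
     examples in this Tag Library. -->
<docDscr>
  <citation>
    <titlStmt>
      <!-- Dublin Core Title -->
      <titl>Census of Population, 1950 [United States]: Public Use Microdata Sample</titl>
      <!-- Dublin Core Identifier (placeholder number) -->
      <IDNo agency='ICPSR'>0000</IDNo>
    </titlStmt>
    <rspStmt>
      <!-- Dublin Core Creator -->
      <AuthEnty>United States Department of Commerce. Bureau of the Census</AuthEnty>
      <!-- Dublin Core Contributor -->
      <othId role='editor' affiliation='INRA'><p>Jane Smith</p></othId>
    </rspStmt>
    <prodStmt>
      <!-- Dublin Core Publisher: the entity making the electronic version available -->
      <producer abbr='ICPSR'>Inter-university Consortium for Political and Social Research</producer>
      <!-- Dublin Core Rights -->
      <copyright>Copyright(c) ICPSR, 2000</copyright>
      <!-- Dublin Core Date: creation of the electronic version -->
      <prodDate date='1999-01-25'>January 25, 1999</prodDate>
    </prodStmt>
  </citation>
  <!-- Partial Dublin Core Relation -->
  <docSrc>
    <titlStmt>
      <titl>Census of Population, 1950 [United States]: Public Use Microdata Sample</titl>
    </titlStmt>
  </docSrc>
</docDscr>
```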
Document Description
- <docDscr> 1.0
- Description: This section contains information about both the
document being created (the marked-up
document) and the source document (the electronic or print
codebook that serves as the source of information),
if one exists. In addition, it provides information on how to use
the document contents and on the status of the
document itself. Although this element is optional, it is
strongly recommended that all marked-up documents
contain at minimum the following nested set of elements:
<docDscr> 1.0, <citation> 1.1, <titlStmt> 1.1.1,
and <titl> 1.1.1.1 (required).
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements: Citation -- Marked-up Document,
Guide to Documentation,
Documentation Status,
Documentation Source,
Notes (Document Description)
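The recommended minimum nesting described above would look roughly like the following. The title is taken from the examples given for <titl> in this Tag Library; everything else is just the four nested elements with no optional attributes.

```xml
<!-- Minimal recommended Document Description: docDscr > citation > titlStmt > titl -->
<docDscr>
  <citation>
    <titlStmt>
      <titl>Domestic Violence Experience in Omaha, Nebraska, 1986-1987</titl>
    </titlStmt>
  </citation>
</docDscr>
```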
- Citation -- Marked-up Document
- <citation> 1.1 (Generic element A.6)
- Description: Citation for the marked-up
document. This element encodes the bibliographic information
describing the marked-up codebook, including title information,
statement of responsibility, production and distribution information,
series and version information, text of a preferred bibliographic
citation, and notes (if any).
A MARCURI attribute is provided to link to the MARC
record for this citation.
Remarks: Note that it is the elements within this
citation element that are the primary source for most generic search
engines through their relationship to the Dublin Core tags.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, MARCURI
- Contains Elements:
Title Statement -- Marked-up Document,
Responsibility Statement -- Marked-up Document,
Production Statement -- Marked-up Document,
Distributor Statement -- Marked-up Document,
Series Statement -- Marked-up Document,
Version Statement -- Marked-up Document,
Bibliographic Citation -- Marked-up Document,
Holdings Information -- Marked-up Document,
Notes (Citation) -- Marked-up Document
- Title Statement -- Marked-up Document
- <titlStmt> 1.1.1 (Generic element A.6.1)
- Description: Title statement for the
marked-up document.
- Required
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Title -- Marked-up Document,
Subtitle -- Marked-up Document,
Alternative Title -- Marked-up Document,
Parallel Title -- Marked-up Document,
ID Number -- Marked-up Document
- Title -- Marked-up Document
- <titl> 1.1.1.1 (Generic element A.6.1.1)
- Description: Contains the full authoritative title of the
marked-up codebook. The marked-up codebook title will in most cases be
identical to the title for the data collection (2.1.1). A full title
should indicate the geographic scope of the data collection as well as
the time period covered. Equivalent to Dublin Core Title.
- Examples:
<titl>Domestic Violence Experience in Omaha, Nebraska,
1986-1987</titl>
<titl>Census of Population, 1950 [United States]: Public Use
Microdata Sample</titl>
<titl>Monitoring the Future: A Continuing Study of American
Youth, 1995</titl>
- Required
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Subtitle -- Marked-up Document
- <subTitl> 1.1.1.2 (Generic element A.6.1.2)
- Description: A subtitle is a
secondary title used to amplify or state certain limitations of the
main title. It may repeat information already in the main title.
- Examples:
<titl>Monitoring the Future: A Continuing Study of American
Youth, 1995</titl>
<subTitl>A Continuing Study of American Youth, 1995</subTitl>
<titl>Census of Population, 1950 [United States]: Public Use
Microdata Sample</titl>
<subTitl>Public Use Microdata Sample</subTitl>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Alternative Title -- Marked-up Document
- <altTitl> 1.1.1.3 (Generic element A.6.1.3)
- Description: The alternative title may be the title by which a data collection is commonly referred to or it may be an abbreviation for the title.
- Examples:
<titl>Census of Population, 1950 [United States]: Public Use
Microdata Sample</titl>
<altTitl>PUMS</altTitl>
<titl>Equality of Educational Opportunity (Coleman) Study
(EEOS), 1966</titl>
<altTitl>The Coleman Study</altTitl>
<altTitl>EEOS</altTitl>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Parallel Title -- Marked-up Document
- <parTitl> 1.1.1.4 (Generic element A.6.1.4)
- Description: Title translated into another language.
- Example:
<titl>Politbarometer West [Germany], Partial
Accumulation, 1977-1995</titl>
<parTitl>Politbarometer, 1977-1995: Partielle Kumulation</parTitl>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- ID Number -- Marked-up Document
- <IDNo> 1.1.1.5 (Generic element A.6.1.5)
- Description: Unique string or number
(producer's or archive's number) for the marked-up
document. An "agency" attribute is supplied. Equivalent to Dublin Core Identifier.
- Examples:
<IDNo agency='ICPSR'>6678</IDNo>
<IDNo agency='ZA'>2010</IDNo>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, agency
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Responsibility Statement -- Marked-up Document
- <rspStmt> 1.1.2 (Generic element A.6.2)
- Description: Responsibility for the
creation of the marked-up codebook.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Authoring Entity / Primary Investigator -- Marked-up Document,
Other Identifications / Acknowledgments -- Marked-up Document
- Authoring Entity / Primary Investigator -- Marked-up Document
- <AuthEnty> 1.1.2.1 (Generic element A.6.2.1)
- Description: The person, corporate
body, or agency responsible for the marked-up document's substantive
and intellectual content. Usually the same as the authoring entity
responsible for the data collection (2.1.2.1). Repeat the element for each
author, and use the affiliation attribute if available. Invert first and
last name and use commas. Equivalent to Dublin Core Creator.
Remarks: The author in the Document
Description should be the individual(s) or organization(s) directly
responsible for the intellectual content of the DDI version, as
distinct from the person(s) or organization(s) responsible for the
intellectual content of the earlier paper or electronic edition from
which the DDI edition may have been derived. The producer (1.1.3.1) in the
Document Description should be the agency or person that prepared the
marked-up document.
- Examples:
<AuthEnty>United States Department of Commerce. Bureau of the Census</AuthEnty>
<AuthEnty affiliation='European Commission'>Rabier, Jacques-Rene</AuthEnty>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, affiliation
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Other Identifications / Acknowledgments -- Marked-up Document
- <othId> 1.1.2.2 (Generic element A.6.2.2)
- Description: Statements of
responsibility not recorded in the title and statement of
responsibility areas. Indicate here the persons or bodies connected
with the work, or significant persons or bodies connected with
previous editions and not already named in the description. For
example, the name of the person who edited the marked-up documentation might
be cited here, using the role and affiliation attributes.
Remarks: The paragraph tag <p> must be used in this element.
- Example:
<othId role='editor' affiliation='INRA'><p>Jane Smith</p></othId>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, type, role, affiliation
- Contains: <p>, othId
- Production Statement -- Marked-up Document
- <prodStmt> 1.1.3 (Generic element A.6.3)
- Description: Production statement for the marked-up document.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Producer -- Marked-up Document,
Copyright -- Marked-up Document,
Date of Production -- Marked-up Document,
Place of Production -- Marked-up Document,
Software Used in Production -- Marked-up Document,
Funding Agency -- Marked-up Document,
Grant Number -- Marked-up Document
- Producer -- Marked-up Document
- <producer> 1.1.3.1 (Generic element A.6.3.1)
- Description: The producer of the marked-up
document is the person or organization with the financial or
administrative responsibility for the physical processes whereby the
marked-up document was brought into existence. Use the role attribute
to distinguish different stages of involvement in the production
process, such as original producer. Equivalent to Dublin Core
Publisher.
- Example:
<producer abbr='ICPSR' affiliation='Institute for Social Research'>Inter-university Consortium for Political and Social Research</producer>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr, affiliation, role
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Copyright -- Marked-up Document
- <copyright> 1.1.3.2 (Generic element A.6.3.2)
- Description: Copyright statement for
the marked-up document. Equivalent to Dublin Core Rights.
- Example:
<copyright>Copyright(c) ICPSR, 2000</copyright>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Date of Production -- Marked-up Document
- <prodDate> 1.1.3.3 (Generic element A.6.3.3)
- Description: Date the marked-up
document was produced (not distributed or archived).
The ISO
standard for dates (YYYY-MM-DD) is recommended for use with the date
attribute. Equivalent to Dublin Core Date.
- Example:
<prodDate date='1999-01-25'>January 25, 1999</prodDate>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, date
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Place of Production -- Marked-up Document
- <prodPlac> 1.1.3.4 (Generic element A.6.3.4)
- Description: Address of the archive
or agency that produced the marked-up document.
- Example:
<prodPlac>Ann Arbor, MI: Inter-university Consortium for Political and Social Research</prodPlac>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Software Used in Production -- Marked-up Document
- <software> 1.1.3.5 (Generic element A.6.3.5)
- Description: Software used to produce
the marked-up document. A "version" attribute permits specification of the software
version number. The "date" attribute is provided to enable specification
of the date (if any) for the software release. The ISO
standard for dates (YYYY-MM-DD) is recommended for use with the date
attribute.
- Examples:
<software version='1.0'>MRDC Codebook Authoring Tool</software>
<software version='8.0'>Arbortext Adept Editor</software>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, version, date
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Funding Agency -- Marked-up Document
- <fundAg> 1.1.3.6 (Generic element A.6.3.6)
- Description: The source(s) of funds
for production of the marked-up document. If different funding
agencies sponsored different stages of the production process, use the
role attribute to distinguish them.
- Examples:
<fundAg abbr='NSF' role="infrastructure">National Science Foundation</fundAg>
<fundAg abbr='SUN' role="equipment">Sun Microsystems</fundAg>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr, role
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Grant Number -- Marked-up Document
- <grantNo> 1.1.3.7 (Generic element A.6.3.7)
- Description: The grant/contract
number of the project that sponsored the markup effort. If more
than one, indicate the appropriate agency using the "agency"
attribute. If different funding
agencies sponsored different stages of the production process, use the
role attribute to distinguish the grant numbers.
- Example:
<grantNo agency='Bureau of Justice Statistics'>J-LEAA-018-77</grantNo>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, agency, role
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Distributor Statement -- Marked-up Document
- <distStmt> 1.1.4 (Generic element A.6.4)
- Description: Distribution statement for the marked-up document.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Distributor -- Marked-up Document,
Contact Person -- Marked-up Document,
Depositor -- Marked-up Document,
Date of Deposit -- Marked-up Document,
Date of Distribution -- Marked-up Document
- Distributor -- Marked-up Document
- <distrbtr> 1.1.4.1 (Generic element A.6.4.1)
- Description: The organization
designated by the author or producer to generate copies of particular
marked-up documentation including any necessary editions or
revisions. Names and addresses
may be specified and other archives may be co-distributors. A URI
attribute is included to provide a URN or URL to the ordering service
or download facility on a website.
- Example:
<distrbtr abbr='ICPSR' affiliation='Institute for Social Research' URI='http://www.icpsr.umich.edu'>Ann Arbor, MI: Inter-university Consortium for Political and Social Research</distrbtr>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr, affiliation, URI
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Contact Person -- Marked-up Document
- <contact> 1.1.4.2 (Generic element A.6.4.2)
- Description: Names and addresses of
individuals responsible for the marked-up document. Individuals listed
as contact persons will be used as resource persons regarding problems
or questions raised by the user community. The URI attribute should be
used to indicate a URN or URL for the homepage of the contact
individual. The email attribute is used to indicate an email address
for the contact individual.
- Example:
<contact affiliation='University of Wisconsin' email='jsmith@...'>Jane Smith</contact>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, affiliation, URI, email
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Depositor -- Marked-up Document
- <depositr> 1.1.4.3 (Generic element A.6.4.3)
- Description: The name of the person
(or institution) who provided this marked-up documentation to the
archive storing it.
- Example:
<depositr abbr='BJS' affiliation='U.S. Department of Justice'>Bureau of Justice Statistics</depositr>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr, affiliation
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Date of Deposit -- Marked-up Document
- <depDate> 1.1.4.4 (Generic element A.6.4.4)
- Description: The date that the
marked-up document was deposited with the archive that originally
received it. The ISO
standard for dates (YYYY-MM-DD) is recommended for use with the date
attribute.
- Example:
<depDate date='1999-01-25'>January 25, 1999</depDate>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, date
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Date of Distribution -- Marked-up Document
- <distDate> 1.1.4.5 (Generic element A.6.4.5)
- Description: Date that the marked-up
document was made available for distribution/presentation. The ISO
standard for dates (YYYY-MM-DD) is recommended for use with the date
attribute.
- Example:
<distDate date='1999-01-25'>January 25, 1999</distDate>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, date
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Series Statement -- Marked-up Document
- <serStmt> 1.1.5 (Generic element A.6.5)
- Description: Series statement for the
marked-up document. The URI attribute is provided to point to a central
Internet repository of series information.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, URI
- Contains Elements:
Series Name -- Marked-up Document,
Series Information -- Marked-up Document
- Series Name -- Marked-up Document
- <serName> 1.1.5.1 (Generic element A.6.5.1)
- Description: The name of the series
to which the marked-up document belongs. This will probably be the same as
the Series Name for the study or data collection (2.1.5.1).
- Example:
<serName abbr='CPS'>Current Population Survey Series</serName>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Series Information -- Marked-up Document
- <serInfo> 1.1.5.2 (Generic element A.6.5.2)
- Description: Contains a history of
the series and a summary of those features that apply to the series as
a whole. This will
probably be the same as the Series Information for the study or data
collection (2.1.5.2).
- Example:
<serInfo>The Current Population Survey (CPS)
is a household sample survey conducted monthly by the Census Bureau to
provide estimates of employment, unemployment, and other characteristics
of the general labor force, estimates of the population as a whole,
and estimates of various subgroups in the population. The entire
non-institutionalized population of the United States is sampled to
obtain the respondents for this survey series.</serInfo>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Version Statement -- Marked-up Document
- <verStmt> 1.1.6 (Generic element A.6.6)
- Description: Version statement for the marked-up document.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Version -- Marked-up Document,
Version Responsibility Statement -- Marked-up Document,
Notes (Version) -- Marked-up Document
- Version -- Marked-up Document
- <version> 1.1.6.1 (Generic element A.6.6.1)
- Description: Also known as release or
edition. If there have been substantive changes in the marked-up
document since its creation, this statement should be used. The ISO
standard for dates (YYYY-MM-DD) is recommended for use with the date
attribute.
Remarks: ICPSR distinguishes among the terms "release," "version," and
"edition" in the following ways:
- ICPSR Edition: Used only
for intensively processed collections, for which ICPSR has produced a
unique edition of the data. This usually involves checking for
undocumented codes and consistency checks. Signals that additional
intellectual effort has gone into producing the collection.
- ICPSR Version: Used to indicate that
ICPSR has revised the format of a collection or added components
to it, in most cases without
changing any data values. A study is considered an "ICPSR version"
if one or more of these steps has been performed:
(1) Converting software-specific system files or export/transport
files to raw data;
(2) Generating SAS and/or SPSS data definition statements;
(3) Reformatting files, e.g., removing blanks to use space more
efficiently;
(4) Scanning hardcopy documentation; or
(5) Reformatting machine-readable documentation, e.g., converting
text created in a word-processing package to ASCII text.
- Release: Used for data collections that are
being disseminated exactly as they came from the data depositor
(except for the addition of an ICPSR cover and ICPSR front matter).
- Example:
<version type='edition' date='1999-01-25'>Second ICPSR Edition</version>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type (release, version, edition), date
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Version Responsibility Statement -- Marked-up Document
- <verResp> 1.1.6.2 (Generic element A.6.6.2)
- Description: Used to indicate the
organization or person responsible for the version of the marked-up
document.
- Example:
<verResp>Zentralarchiv fuer Empirische Sozialforschung</verResp>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, affiliation
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Notes (Version) -- Marked-up Document
- <notes> 1.1.6.3 (Generic element A.4)
- Description: Used to indicate
additional information regarding the version or the version
responsibility statement for the marked-up document, in particular to indicate what makes a new
version different from its predecessor. "Notes" sections appear in
several places in the DTD. The attributes for notes permit a
controlled vocabulary to be developed (type and subject), the level of
the DTD to which the note refers to be identified (study, file,
variable, etc.), and the author of the note to be indicated
(resp).
- Example:
<notes resp='Jane Smith'>Additional information on derived variables
has been added to this marked-up version of the documentation.</notes>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other
element(s) within the codebook, reference to a table.
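The three elements of the Version Statement can be combined as sketched below. The version and notes values come from the examples above; the responsible organization shown in <verResp> is an assumption chosen only to be consistent with an "ICPSR Edition".

```xml
<!-- Illustrative sketch of a complete version statement for the marked-up document -->
<verStmt>
  <version type='edition' date='1999-01-25'>Second ICPSR Edition</version>
  <!-- Assumed value, chosen to match the ICPSR edition named above -->
  <verResp>Inter-university Consortium for Political and Social Research</verResp>
  <notes resp='Jane Smith'>Additional information on derived variables
  has been added to this marked-up version of the documentation.</notes>
</verStmt>
```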
- Bibliographic Citation -- Marked-up Document
- <biblCit> 1.1.7 (Generic element A.6.7)
- Description: Complete bibliographic
reference containing all of the standard elements of a citation that
can be used to cite the marked-up document. The "format" attribute is
provided to enable specification of the particular citation style
used, e.g. APA, MLA, Chicago, etc.
- Example:
<biblCit format='MRDF'>Rabier, Jacques-Rene, and Ronald
Inglehart. EURO-BAROMETER 11: YEAR OF THE CHILD IN EUROPE, APRIL 1979
[Codebook file]. Conducted by Institut Francais D'Opinion Publique
(IFOP), Paris, et al. ICPSR ed. Ann Arbor, MI: Inter-university
Consortium for Political and Social Research [producer and
distributor], 1981.</biblCit>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, format
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Holdings Information -- Marked-up Document
- <holdings> 1.1.8 (Generic element A.6.8)
- Description: Information concerning
either the physical or electronic holdings of the cited work. Attributes
include: location--The physical location where a copy is held;
callno--The call number for a work at the location specified; and
URI--A URN or URL for accessing the electronic copy of the cited
work.
- Example:
<holdings location='ICPSR DDI Repository' callno='inap.'
URI='http://www.icpsr.umich.edu/DDIrepository/'>
Marked-up Codebook for Current Population Survey, 1999: Annual Demographic
File</holdings>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, location, callno, URI
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Notes (Citation) -- Marked-up Document
- <notes> 1.1.9 (Generic element A.4)
- Description: Used to indicate
additional information regarding the citation for the marked-up
document. "Notes" sections appear
in several places in the DTD. The attributes for notes permit a
controlled vocabulary to be developed (type and subject), the level of
the DTD to which the note refers to be identified (study, file,
variable, etc.), and the author of the note to be indicated
(resp).
- Example:
<notes resp='Jane Smith'>This citation was
prepared by the archive based on information received from the markup
authors.</notes>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other
element(s) within the codebook, reference to a table.
- Guide to the Documentation -- Marked-up Document
- <guide> 1.2
- Description: List of terms and definitions used in the document. Provided to assist users in using the document correctly. For further examples, see the Codebook Information section of any of the printed, bound codebooks distributed by ICPSR.
- Example:
<guide>Metro Area OR Twin Cities =
Minneapolis/St. Paul MSA; Greater MN = All Minnesota Counties not
included in the Minneapolis/St. Paul MSA; The Range = Upper Northeast
quadrant of Minnesota traditionally associated with iron ore and
taconite mining.</guide>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Documentation Status -- Marked-up Document
- <docStatus> 1.3
- Description: Use this field to
indicate if the document is being presented/distributed before it has
been finalized. Some data producers and social science data archives
employ data processing strategies that provide for release of data and
documentation at various stages of processing.
- Example:
<docStatus>This marked-up document includes a provisional data
dictionary and brief citation only for the purpose of providing basic
access to the data file. A complete codebook will be published at a
later date.</docStatus>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Documentation Source
- <docSrc> 1.4 (Generic element A.6)
- Description: Citation for the source
document. This element encodes the bibliographic information
describing the source codebook, including title information, statement
of responsibility, production and distribution information, series and
version information, text of a preferred bibliographic citation, and
notes (if any). Information for this section should be taken directly
from the source document whenever possible. If additional information
is obtained and entered in the elements within this section, the
source of this information should be noted in the source attribute of
the particular element tag.
A MARCURI attribute is provided to link to the MARC
record for this citation.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, MARCURI
- Contains Elements:
Title Statement -- Source Document,
Responsibility Statement -- Source Document,
Production Statement -- Source Document,
Distributor Statement -- Source Document,
Series Statement -- Source Document,
Version Statement -- Source Document,
Bibliographic Citation -- Source Document,
Holdings Information -- Source Document,
Notes (Citation) -- Source Document
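As a sketch of the guidance above on recording where added information came from, the fragment below marks one element whose content was not taken from the source document itself. The attribute value 'archive' is an assumption about what the DTD accepts for the source attribute, and the ID number is a placeholder; check the DTD before relying on either.

```xml
<!-- Illustrative sketch of a Documentation Source citation -->
<docSrc>
  <titlStmt>
    <!-- Taken directly from the source codebook -->
    <titl>Census of Population, 1950 [United States]: Public Use Microdata Sample</titl>
    <!-- Supplied later rather than found in the source document;
         'archive' is an assumed attribute value, 0000 is a placeholder -->
    <IDNo agency='ICPSR' source='archive'>0000</IDNo>
  </titlStmt>
</docSrc>
```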
- Title Statement -- Source Document
- <titlStmt> 1.4.1 (Generic element A.6.1)
- Description: Title statement for the
source document.
- Required
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Title -- Source Document,
Subtitle -- Source Document,
Alternative Title -- Source Document,
Parallel Title -- Source Document,
ID Number -- Source Document
- Title -- Source Document
- <titl> 1.4.1.1 (Generic element A.6.1.1)
- Description: Contains the full authoritative title of the
source document. The source document title will in many cases be
identical to the title for the marked-up document. If the source
document contains no title, the title provided in this element should
indicate the geographic scope of the data collection as well as the
time period covered.
- Examples:
<titl>Domestic Violence Experience in Omaha, Nebraska, 1986-1987</titl>
<titl>Census of Population, 1950 [United States]: Public Use
Microdata Sample</titl>
<titl>Monitoring the Future: A Continuing Study of American Youth, 1995</titl>
- Required
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Subtitle -- Source Document
- <subTitl> 1.4.1.2 (Generic element A.6.1.2)
- Description: A subtitle is a secondary title used to amplify or state certain limitations of the main title. It may repeat information already in the main title.
- Examples:
<titl>Monitoring the Future: A Continuing Study of American Youth, 1995</titl>
<subTitl>A Continuing Study of American Youth, 1995</subTitl>
<titl>Census of Population, 1950 [United States]: Public Use
Microdata Sample</titl>
<subTitl>Public Use Microdata Sample</subTitl>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Alternative Title -- Source Document
- <altTitl> 1.4.1.3 (Generic element A.6.1.3)
- Description: The alternative title
may be the title by which a data collection is commonly referred to or
it may be an abbreviation for the title.
- Examples:
<titl>Census of Population, 1950 [United States]: Public Use
Microdata Sample</titl>
<altTitl>PUMS</altTitl>
<titl>Equality of Educational Opportunity (Coleman) Study
(EEOS), 1996</titl>
<altTitl>The Coleman Study</altTitl>
<altTitl>EEOS</altTitl>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Parallel Title -- Source Document
- <parTitl> 1.4.1.4 (Generic element A.6.1.4)
- Description: Title translated into another language.
- Example:
<titl>Politbarometer West [Germany], Partial
Accumulation, 1977-1995</titl>
<parTitl>Politbarometer, 1977-1995: Partielle Kumulation</parTitl>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- ID Number -- Source Document
- <IDNo> 1.4.1.5 (Generic element A.6.1.5)
- Description: Unique string or number
(producer's or archive's number) for the source document. An "agency"
attribute is supplied.
- Examples:
<IDNo agency='ICPSR'>6678</IDNo>
<IDNo agency='ZA'>2010</IDNo>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, agency
- Contains: #PCDATA, Link to other
element(s) within the codebook.
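The title elements above can be combined into a single title statement. The following is a minimal sketch assembled from the examples in this section. It assumes that the child elements appear in the order given in the "Contains Elements" list, and the ID number is copied from the examples for illustration only; it is not meant to be the actual study number for this title.

```xml
<titlStmt>
  <!-- titl is required and may appear only once -->
  <titl>Census of Population, 1950 [United States]: Public Use Microdata Sample</titl>
  <!-- subTitl, altTitl, parTitl, and IDNo are optional and repeatable -->
  <subTitl>Public Use Microdata Sample</subTitl>
  <altTitl>PUMS</altTitl>
  <!-- agency names the body that assigned the number (illustrative pairing) -->
  <IDNo agency='ICPSR'>6678</IDNo>
</titlStmt>
```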
- Responsibility Statement -- Source Document
- <rspStmt> 1.4.2 (Generic element A.6.2)
- Description: Responsibility for the creation of the source document.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Authoring Entity / Primary Investigator -- Source Document,
Other Identifications / Acknowledgments -- Source Document
- Authoring Entity / Primary Investigator -- Source Document
- <AuthEnty> 1.4.2.1 (Generic element A.6.2.1)
- Description: The person, corporate
body, or agency responsible for the source document's substantive and
intellectual content. Usually the same as the authoring entity
responsible for the data collection (2.1.2.1). Repeat the element for each
author, and use the affiliation attribute if available. Invert first and
last name and use commas.
Remarks: The author in this element
should be the individual(s) or organization(s) directly
responsible for the intellectual content of the source document, as
distinct from the person(s) or organization(s) responsible for the
intellectual content of the marked-up document.
- Examples:
<AuthEnty>United States Department of Commerce. Bureau of the Census</AuthEnty>
<AuthEnty affiliation='European Commission'>Rabier, Jacques-Rene</AuthEnty>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, affiliation
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Other Identifications / Acknowledgments -- Source Document
- <othId> 1.4.2.2 (Generic element A.6.2.2)
- Description: Statements of
responsibility not recorded in the title and statement of
responsibility areas. Indicate here the persons or bodies connected
with the work, or significant persons or bodies connected with
previous editions and not already named in the description. For
example, the name of the person who edited the source document might
be cited here, using the role and affiliation attributes.
Remarks: The paragraph tag <p> must be used in this element.
- Example:
<othId role='editor' affiliation='INRA'><p>Jane Smith</p></othId>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, type, role, affiliation
- Contains: <p>, othId
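As a sketch of how the two responsibility elements nest, the fragment below places the examples from this section inside one responsibility statement. The element order is assumed to follow the "Contains Elements" list.

```xml
<rspStmt>
  <!-- One AuthEnty per author; personal names are inverted -->
  <AuthEnty affiliation='European Commission'>Rabier, Jacques-Rene</AuthEnty>
  <!-- othId text is wrapped in a paragraph tag, as the Remarks require -->
  <othId role='editor' affiliation='INRA'><p>Jane Smith</p></othId>
</rspStmt>
```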
- Production Statement -- Source Document
- <prodStmt> 1.4.3 (Generic element A.6.3)
- Description: Production statement for
the source document.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Producer -- Source Document,
Copyright -- Source Document,
Date of Production -- Source Document,
Place of Production -- Source Document,
Software Used in Production -- Source Document,
Funding Agency -- Source Document,
Grant Number -- Source Document
- Producer -- Source Document
- <producer> 1.4.3.1 (Generic element A.6.3.1)
- Description: The producer of the
source document is the person or organization with the financial or
administrative responsibility for the physical processes whereby the
source document was brought into existence. Use the role attribute to
distinguish different stages of involvement in the production process,
such as original producer.
- Example:
<producer abbr='MNPoll' affiliation='Minneapolis Star Tribune Newspaper' role='original producer'>Star Tribune Minnesota Poll</producer>
<producer abbr='MRDC' affiliation='University of Minnesota' role='final production'>Machine Readable Data Center</producer>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr, affiliation, role
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Copyright -- Source Document
- <copyright> 1.4.3.2 (Generic element A.6.3.2)
- Description: Copyright statement for the source document.
- Example:
<copyright>Copyright(c) ICPSR, 2000</copyright>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Date of Production -- Source Document
- <prodDate> 1.4.3.3 (Generic element A.6.3.3)
- Description: Date the source document
was produced (not distributed or archived). The ISO standard for dates
(YYYY-MM-DD) is recommended for use with the date attribute.
- Example:
<prodDate date='1999-01-25'>January 25, 1999</prodDate>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, date
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Place of Production -- Source Document
- <prodPlac> 1.4.3.4 (Generic element A.6.3.4)
- Description: Address of the archive
or agency that produced the source document.
- Example:
<prodPlac>Ann Arbor, MI: Inter-university Consortium for Political and Social Research</prodPlac>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Software Used in Production -- Source Document
- <software> 1.4.3.5 (Generic element A.6.3.5)
- Description: Identifies the software
used in creating or storing the source document. A "version" attribute
permits specification of the software version number. The "date"
attribute is provided to enable specification of the date (if any) for
the software release. The ISO standard for dates (YYYY-MM-DD) is
recommended for use with the date attribute.
- Example:
<software version='4.0'>PageMaker</software>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, version, date
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Funding Agency -- Source Document
- <fundAg> 1.4.3.6 (Generic element A.6.3.6)
- Description: The source(s) of funds
for production of the source document. If different funding agencies
sponsored different stages of the production process, use the role
attribute to distinguish them.
- Example:
<fundAg abbr='NSF'>National Science Foundation</fundAg>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr, role
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Grant Number -- Source Document
- <grantNo> 1.4.3.7 (Generic element A.6.3.7)
- Description: The grant/contract
number of the project that sponsored the documentation effort. If more
than one, indicate the appropriate agency using the "agency"
attribute. If different funding agencies
sponsored different stages of the production process, use the role
attribute to distinguish the grant numbers.
- Example:
<grantNo agency='Bureau of Justice Statistics'>J-LEAA-018-77</grantNo>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, agency, role
- Contains: #PCDATA, Link to other
element(s) within the codebook.
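The production elements above might be assembled as follows. This is a sketch built from the examples in this section, assuming the child order of the "Contains Elements" list; optional elements that do not apply are simply left out.

```xml
<prodStmt>
  <!-- role separates the stages of production -->
  <producer abbr='MNPoll' affiliation='Minneapolis Star Tribune Newspaper' role='original producer'>Star Tribune Minnesota Poll</producer>
  <producer abbr='MRDC' affiliation='University of Minnesota' role='final production'>Machine Readable Data Center</producer>
  <!-- The date attribute holds the ISO form; the element text holds the display form -->
  <prodDate date='1999-01-25'>January 25, 1999</prodDate>
  <software version='4.0'>PageMaker</software>
</prodStmt>
```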
- Distributor Statement -- Source Document
- <distStmt> 1.4.4 (Generic element A.6.4)
- Description: Distribution statement for the source document.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Distributor -- Source Document,
Contact Person -- Source Document,
Depositor -- Source Document,
Date of Deposit -- Source Document,
Date of Distribution -- Source Document
- Distributor -- Source Document
- <distrbtr> 1.4.4.1 (Generic element A.6.4.1)
- Description: The organization
designated by the author or producer to generate copies of a
particular source document including any necessary editions or
revisions. Names and addresses may
be specified, and other archives may be co-distributors. A URI
attribute is included to provide a URN or URL to the ordering service
or download facility on a website.
- Example:
<distrbtr abbr='ICPSR' affiliation='Institute for
Social Research' URI='http://www.icpsr.umich.edu'>Ann Arbor, MI: Inter-university Consortium for
Political and Social Research</distrbtr>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr, affiliation, URI
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Contact Person -- Source Document
- <contact> 1.4.4.2 (Generic element A.6.4.2)
- Description: Names and addresses of
individuals responsible for the source document. These may be the
principal investigators (PIs). Individuals listed as contact persons serve as resource
persons for problems or questions raised by the user
community. The URI attribute should be used to indicate a URN or URL
for the homepage of the contact individual. The email attribute is
used to indicate an email address for the contact individual.
- Example:
<contact affiliation='University of Wisconsin' email='jsmith@uwisc.edu'>Jane Smith</contact>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, affiliation, URI, email
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Depositor -- Source Document
- <depositr> 1.4.4.3 (Generic element A.6.4.3)
- Description: The name of the person (or institution) who provided this source document to the archive storing it.
- Example:
<depositr abbr='BJS' affiliation='U.S. Department of Justice'>Bureau of Justice Statistics</depositr>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr, affiliation
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Date of Deposit -- Source Document
- <depDate> 1.4.4.4 (Generic element A.6.4.4)
- Description: The date that the source
document was deposited with the archive that originally received
it. The ISO standard for dates (YYYY-MM-DD) is recommended for use
with the date attribute.
- Example:
<depDate date='1999-01-25'>January 25, 1999</depDate>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, date
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Date of Distribution -- Source Document
- <distDate> 1.4.4.5 (Generic element A.6.4.5)
- Description: The date
that the source document was released for distribution. The ISO
standard for dates (YYYY-MM-DD) is recommended for use
with the date attribute.
- Example:
<distDate date='1999-01-25'>January 25, 1999</distDate>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, date
- Contains: #PCDATA, Link to other
element(s) within the codebook.
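A complete distributor statement could look like the sketch below, which reuses the examples from this section and assumes the child order of the "Contains Elements" list. The distribution date is a hypothetical value chosen to fall after the deposit date.

```xml
<distStmt>
  <distrbtr abbr='ICPSR' affiliation='Institute for Social Research' URI='http://www.icpsr.umich.edu'>Ann Arbor, MI: Inter-university Consortium for Political and Social Research</distrbtr>
  <contact affiliation='University of Wisconsin' email='jsmith@uwisc.edu'>Jane Smith</contact>
  <depositr abbr='BJS' affiliation='U.S. Department of Justice'>Bureau of Justice Statistics</depositr>
  <depDate date='1999-01-25'>January 25, 1999</depDate>
  <!-- distDate is not repeatable; this date is hypothetical -->
  <distDate date='1999-03-01'>March 1, 1999</distDate>
</distStmt>
```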
- Series Statement -- Source Document
- <serStmt> 1.4.5 (Generic element A.6.5)
- Description: Series statement for the
source document. The URI attribute is provided to point to a central
Internet repository of series information.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, URI
- Contains Elements:
Series Name -- Source Document,
Series Information -- Source Document
- Series Name -- Source Document
- <serName> 1.4.5.1 (Generic element A.6.5.1)
- Description: The name of the data
series to which the source document belongs. This will probably be the same as
the Series Name for the study or data collection (2.1.5.1).
- Example:
<serName abbr='CPS'>Current Population Survey Series</serName>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Series Information -- Source Document
- <serInfo> 1.4.5.2 (Generic element A.6.5.2)
- Description: Contains a history of
the data series and a summary of those features that apply to the
series as a whole. This will
probably be the same as the Series Information for the study or data
collection (2.1.5.2).
- Example:
<serInfo>The Current Population Survey (CPS)
is a household sample survey conducted monthly by the Census Bureau to
provide estimates of employment, unemployment, and other characteristics
of the general labor force, estimates of the population as a whole,
and estimates of various subgroups in the population. The entire
non-institutionalized population of the United States is sampled to
obtain the respondents for this survey series.</serInfo>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other
element(s) within the codebook.
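The series elements and the URI attribute might be combined as in the following sketch. The URI value is a hypothetical placeholder, not a real series repository, and the series information is shortened from the example above.

```xml
<!-- The URI is a placeholder for a central repository of series information -->
<serStmt URI='http://www.example.org/series/cps'>
  <serName abbr='CPS'>Current Population Survey Series</serName>
  <serInfo>The Current Population Survey (CPS) is a household sample survey conducted monthly by the Census Bureau.</serInfo>
</serStmt>
```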
- Version Statement -- Source Document
- <verStmt> 1.4.6 (Generic element A.6.6)
- Description: Version statement for
the source document.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Version -- Source Document,
Version Responsibility Statement -- Source Document,
Notes (Version) -- Source Document
- Version -- Source Document
- <version> 1.4.6.1 (Generic element A.6.6.1)
- Description: Also known as release or
edition. If there have been substantive changes in the source document
since its creation, this statement should be used. The ISO standard
for dates (YYYY-MM-DD) is recommended for use with the date
attribute.
Remarks: ICPSR distinguishes among the terms "release," "version," and
"edition" in the following ways:
- ICPSR Edition: Used only
for intensively processed collections, for which ICPSR has produced a
unique edition of the data. This usually involves checking for
undocumented codes and performing consistency checks. It signals that additional
intellectual effort has gone into producing the collection.
- ICPSR Version: Used to indicate that
ICPSR has revised the format of a collection or added components
to it, in most cases without
changing any data values. A study is considered an "ICPSR version"
if one or more of these steps has been performed:
(1) Converting software-specific system files or export/transport
files to raw data;
(2) Generating SAS and/or SPSS data definition statements;
(3) Reformatting files, e.g., removing blanks to use space more
efficiently;
(4) Scanning hardcopy documentation; or
(5) Reformatting machine-readable documentation, e.g., converting
text created in a word-processing package to ASCII text.
- Release: Used for data collections that are
being disseminated exactly as they came from the data depositor
(except for the addition of an ICPSR cover and ICPSR front matter).
- Example:
<version type='edition' date='1999-01-25'>Second ICPSR Edition</version>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type (release, version, edition), date
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Version Responsibility Statement -- Source Document
- <verResp> 1.4.6.2 (Generic element A.6.6.2)
- Description: Used to indicate the
organization or person responsible for the version of the source
document.
- Example:
<verResp>Zentralarchiv fuer Empirische Sozialforschung</verResp>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, affiliation
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Notes (Version) -- Source Document
- <notes> 1.4.6.3 (Generic element A.4)
- Description: Used to indicate additional information regarding the version or the version responsibility statement, in particular to indicate what makes a new version different from its predecessor. "Notes" sections appear in several places in the DTD. The attributes for notes permit a controlled vocabulary to be developed (type and subject), the level of the DTD to which the note refers to be identified (study, file, variable, etc.), and the author of the note to be indicated (resp).
- Example:
<notes resp='Jane Smith'>The source codebook was produced from
original hardcopy materials using
Optical Character Recognition (OCR).</notes>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other
element(s) within the codebook, reference to a table.
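The three version elements can be sketched together as follows. The fragment reuses the version and notes examples above; the responsible organization shown is an assumption made so that it matches the "ICPSR Edition" in the version element.

```xml
<verStmt>
  <!-- type takes one of: release, version, edition -->
  <version type='edition' date='1999-01-25'>Second ICPSR Edition</version>
  <!-- Assumed for illustration: the organization responsible for this edition -->
  <verResp>Inter-university Consortium for Political and Social Research</verResp>
  <notes resp='Jane Smith'>The source codebook was produced from original hardcopy materials using Optical Character Recognition (OCR).</notes>
</verStmt>
```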
- Bibliographic Citation -- Source Document
- <biblCit> 1.4.7 (Generic element A.6.7)
- Description: Complete bibliographic reference containing all of the standard elements of a citation that can be used to cite the source document. The "format" attribute is provided to enable specification of the particular citation style used, e.g. APA, MLA, Chicago, etc.
- Example:
<biblCit format='MRDF'>Rabier, Jacques-Rene, and Ronald
Inglehart. EURO-BAROMETER 11: YEAR OF THE CHILD IN EUROPE, APRIL 1979
[Computer file]. Conducted by Institut Francais D'Opinion Publique
(IFOP), Paris, et al. ICPSR ed. Ann Arbor, MI: Inter-university
Consortium for Political and Social Research [producer and
distributor], 1981. </biblCit>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, format
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Holdings Information -- Source Document
- <holdings> 1.4.8 (Generic element A.6.8)
- Description: Information concerning
either the physical or electronic holdings of the cited work. Attributes
include: location--The physical location where a copy is held;
callno--The call number for a work at the location specified; and
URI--A URN or URL for accessing the electronic copy of the cited
work.
- Example:
<holdings location='University of Michigan Graduate Library' callno='inap.'
URI='http://www.umich.edu/library/'>
Codebook for Current Population Survey, 1999: Annual Demographic File
</holdings>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, format, location, callno, URI
- Contains: #PCDATA, Link to other
element(s) within the codebook.
- Notes -- Source Document
- <notes> 1.4.9 (Generic element A.4)
- Description: Used to indicate
additional information about the source document. "Notes"
sections appear in several places in the DTD. The attributes for notes
permit a controlled vocabulary to be developed (type and subject), the
level of the DTD to which the note refers to be identified (study,
file, variable, etc.), and the author of the note to be indicated
(resp).
- Example:
<notes resp='Jane Smith'>A machine-readable version of the source
codebook was supplied by the Zentralarchiv.</notes>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other
element(s) within the codebook, reference to a table.
- Notes -- Document Description
- <notes> 1.5 (Generic element A.4)
- Description: Used to indicate
additional information about the document description as a
whole. "Notes" sections appear in several places in the DTD. The
attributes for notes permit a controlled vocabulary to be developed
(type and subject), the level of the DTD to which the note refers to
be identified (study, file, variable, etc.), and the author of the
note to be indicated (resp).
- Example:
<notes>This Document Description, or header information, can be used
within an electronic resource discovery environment.</notes>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other
element(s) within the codebook, reference to a table.
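The examples above use only the resp attribute. A note that also carries the type, subject, and level attributes might look like the sketch below. The type and subject values are hypothetical, since no controlled vocabulary is defined in this section, and "study" is taken from the levels named in the description.

```xml
<!-- type and subject values are hypothetical, not part of a defined vocabulary -->
<notes type='provenance' subject='source codebook' level='study' resp='Jane Smith'>A machine-readable version of the source codebook was supplied by the Zentralarchiv.</notes>
```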
Study Description
Document
|---Document Description
|---STUDY DESCRIPTION
|---Data Files Description
|---Variables Description
|---Other Study-Related Materials
Role of the Study Description
The Study Description consists of information about the data
collection, study, or compilation that the DDI-compliant
documentation file describes. This section includes information about
how the study should be cited, who collected or compiled the data, who
distributes the data, keywords about the content of the data, a summary
(abstract) of the content of the data, data collection methods and
processing, etc. Note that some content of the Study Description's
Citation -- e.g., Responsibility Statement -- may be identical to
that of the Documentation Citation. This is usually the case when
the producer of a data collection also produced the print or
electronic codebook for that data collection.
Study Description
- The access attribute is used to link to the Access Conditions element
describing access and terms of use for the entire dataset.
- Required
- Repeatable
- Attributes: ID, xml:lang, source,
access
- Contains Elements:
- Citation (of Study)
- Required
- Repeatable
- Attributes: ID, xml:lang, source
- Study Scope
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Methodology and Processing (Study Level)
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Data Access
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Other Study Description Materials
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
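This section names the five components of the Study Description but does not give their tags. The skeleton below is a sketch that assumes the tag names stdyDscr, stdyInfo, method, dataAccs, and othrStdyMat; these do not appear in the text above and should be checked against the DTD itself. Only citation, titlStmt, and titl are taken from this document.

```xml
<!-- Tag names other than citation, titlStmt, and titl are assumptions -->
<stdyDscr>
  <!-- Citation (of Study): required -->
  <citation>
    <titlStmt>
      <titl>Domestic Violence Experience in Omaha, Nebraska, 1986-1987</titl>
    </titlStmt>
  </citation>
  <!-- Study Scope: optional -->
  <stdyInfo></stdyInfo>
  <!-- Methodology and Processing (Study Level): optional -->
  <method></method>
  <!-- Data Access: optional -->
  <dataAccs></dataAccs>
  <!-- Other Study Description Materials: optional -->
  <othrStdyMat></othrStdyMat>
</stdyDscr>
```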
Citation
Citation's Place within the Study Description
Document
|
|---Document Description
|---Study Description
| |---CITATION
| |---Study Scope
| |---Methodology
| |---Data Access
| |---Other Study Description Materials
|
|---Data Files Description
|---Variables Description
|---Other Study-Related Materials
- <citation> 2.1 (Generic element A.6)
- Description: Citation for the data collection
described by the marked-up documentation. This element encodes the
bibliographic information describing the data collection, including title
information, statement of responsibility, production and distribution
information, series and version information, text of a preferred
bibliographic citation, and notes (if any).
A MARCURI attribute is provided to
link to the MARC record for this citation.
- Required
- Repeatable
- Attributes: ID, xml:lang, source, MARCURI
- Contains Elements:
Title Statement -- Data Collection,
Responsibility Statement -- Data Collection,
Production Statement -- Data Collection,
Distributor Statement -- Data Collection,
Series Statement -- Data Collection,
Version Statement -- Data Collection,
Bibliographic Citation -- Data Collection,
Holdings Information -- Data Collection,
Notes (Citation) -- Data Collection
- Title Statement -- Data Collection
- <titlStmt> 2.1.1 (Generic element A.6.1)
- Description: Title statement for the
data collection.
- Required
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Title -- Data Collection,
Subtitle -- Data Collection,
Alternative Title -- Data Collection,
Parallel Title -- Data Collection,
ID Number -- Data Collection
- Title -- Data Collection
- <titl> 2.1.1.1 (Generic element A.6.1.1)
- Description: Contains the full authoritative title of the data
collection. The data collection title will in most cases be identical
to the title for the marked-up document (1.1.1.1) and the source document
(1.4.1.1). A full title should
indicate the geographic scope of the data collection as well as the
time period covered.
- Examples:
<titl>Domestic Violence Experience in Omaha, Nebraska, 1986-1987</titl>
<titl>Census of Population, 1950 [United States]: Public Use
Microdata Sample</titl>
<titl>Monitoring the Future: A Continuing Study of American Youth, 1995</titl>
- Required
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Subtitle -- Data Collection
- <subTitl> 2.1.1.2 (Generic element A.6.1.2)
- Description: A subtitle is a secondary title used to amplify or state certain limitations of the main title. It may repeat information already in the main title.
- Examples:
<titl>Monitoring the Future: A Continuing Study of American Youth, 1995</titl>
<subTitl>A Continuing Study of American Youth, 1995</subTitl>
<titl>Census of Population, 1950 [United States]: Public Use
Microdata Sample</titl>
<subTitl>Public Use Microdata Sample</subTitl>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Alternative Title -- Data Collection
- <altTitl> 2.1.1.3 (Generic element A.6.1.3)
- Description: The alternative title may be the title by which a data collection is commonly referred to or it may be an abbreviation for the title.
- Examples:
<titl>Census of Population, 1950 [United States]: Public Use
Microdata Sample</titl>
<altTitl>PUMS</altTitl>
<titl>Equality of Educational Opportunity (Coleman) Study
(EEOS), 1996</titl>
<altTitl>The Coleman Study</altTitl>
<altTitl>EEOS</altTitl>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Parallel Title -- Data Collection
- <parTitl> 2.1.1.4 (Generic element A.6.1.4)
- Description: The title translated into another language.
- Example:
<titl>Politbarometer West [Germany], Partial
Accumulation, 1977-1995</titl>
<parTitl>Politbarometer, 1977-1995: Partielle Kumulation</parTitl>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- ID Number -- Data Collection
- <IDNo> 2.1.1.5 (Generic element A.6.1.5)
- Description: Unique string or number
(producer's or archive's number) for the data collection. An "agency"
attribute is supplied.
- Examples:
<IDNo agency='ICPSR'>6678</IDNo>
<IDNo agency='ZA'>2010</IDNo>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, agency
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Responsibility Statement -- Data Collection
- <rspStmt> 2.1.2 (Generic element A.6.2)
- Description: Responsibility for the data collection.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Authoring Entity / Primary Investigator -- Data Collection,
Other Identifications / Acknowledgments -- Data Collection
- Authoring Entity / Primary Investigator -- Data Collection
- <AuthEnty> 2.1.2.1 (Generic element A.6.2.1)
- Description: The person, corporate
body, or agency responsible for the data collection's substantive and
intellectual content. Repeat the element for each
author, and use the affiliation attribute if available. Invert first and
last name and use commas.
Remarks: The author in this element
should be the individual(s) or organization(s) directly
responsible for the intellectual content of the data collection, as
distinct from the person(s) or organization(s) responsible for the
intellectual content of the marked-up document.
- Examples:
<AuthEnty>United States Department of Commerce. Bureau of the Census</AuthEnty>
<AuthEnty affiliation='European Commission'>Rabier, Jacques-Rene</AuthEnty>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, affiliation
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Other Identifications / Acknowledgments -- Data Collection
- <othId> 2.1.2.2 (Generic element A.6.2.2)
- Description: Statements of
responsibility not recorded in the title and statement of
responsibility areas. Indicate here the persons or bodies connected
with the work, or significant persons or bodies connected with
previous editions and not already named in the description. For
example, the name of the person who cleaned the data collection might be cited
here, using the role and affiliation attributes.
- Example:
<othId role='processor' affiliation='INRA'><p>Jane Smith</p></othId>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, type, role, affiliation
- Contains: <p>, othId
- Production Statement -- Data Collection
- <prodStmt> 2.1.3 (Generic element A.6.3)
- Description: Production statement for the data collection.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Producer -- Data Collection,
Copyright -- Data Collection,
Date of Production -- Data Collection,
Place of Production -- Data Collection,
Software Used in Production -- Data Collection,
Funding Agency -- Data Collection,
Grant Number -- Data Collection
- Producer -- Data Collection
- <producer> 2.1.3.1 (Generic element A.6.3.1)
- Description: The producer of the data
collection is the person or organization with the financial or
administrative responsibility for the physical processes whereby the
data collection was brought into existence. Use the role attribute to
distinguish different stages of involvement in the production process,
such as original producer.
- Example:
<producer abbr='ICPSR' affiliation='Institute for Social Research'>Inter-university Consortium for Political and Social Research</producer>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr, affiliation, role
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Copyright -- Data Collection
- <copyright> 2.1.3.2 (Generic element A.6.3.2)
- Description: Copyright statement for the data collection.
- Example:
<copyright>Copyright(c) ICPSR, 2000</copyright>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Date of Production -- Data Collection
- <prodDate> 2.1.3.3 (Generic element A.6.3.3)
- Description: Date the data collection
was produced (not distributed or archived). The ISO
standard for dates (YYYY-MM-DD) is recommended for use with the date
attribute.
- Example:
<prodDate date='1998-07-21'>July 21, 1998</prodDate>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, date
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Place of Production -- Data Collection
- <prodPlac> 2.1.3.4 (Generic element A.6.3.4)
- Description: Address of the archive or agency that produced the data collection.
- Example:
<prodPlac>Ann Arbor, MI: Inter-university Consortium for Political and Social Research</prodPlac>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Software Used in Production -- Data Collection
- <software> 2.1.3.5 (Generic element A.6.3.5)
- Description: Identifies the software
used in creating or storing the data collection. A "version" attribute
permits specification of the software version number. The "date"
attribute is provided to enable specification of the date (if any) for
the software release. The ISO standard for dates (YYYY-MM-DD) is
recommended for use with the date attribute.
- Example:
<software version='6.12'>SAS</software>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, version, date
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Funding Agency -- Data Collection
- <fundAg> 2.1.3.6 (Generic element A.6.3.6)
- Description: The source(s) of funds
for production of the data collection. If different funding agencies
sponsored different stages of the production process, use the role
attribute to distinguish them.
- Example:
<fundAg abbr='NSF'>National Science Foundation</fundAg>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr, role
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Grant Number -- Data Collection
- <grantNo> 2.1.3.7 (Generic element A.6.3.7)
- Description: The grant/contract
number of the project that sponsored the data collection effort. If
more than one, indicate the appropriate agency using the "agency"
attribute. If different funding agencies
sponsored different stages of the production process, use the role
attribute to distinguish the grant numbers.
- Example:
<grantNo agency='Bureau of Justice Statistics'>J-LEAA-018-77</grantNo>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, agency, role
- Contains: #PCDATA, Link to other element(s) within the codebook.
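The role attribute on the funding agency and grant number elements can be sketched as follows. The fragment combines unrelated examples from this section purely for illustration, and both role values are hypothetical labels for two stages of production.

```xml
<prodStmt>
  <producer abbr='ICPSR' affiliation='Institute for Social Research'>Inter-university Consortium for Political and Social Research</producer>
  <!-- role values are hypothetical labels for two funding stages -->
  <fundAg abbr='NSF' role='data collection'>National Science Foundation</fundAg>
  <fundAg abbr='BJS' role='data processing'>Bureau of Justice Statistics</fundAg>
  <!-- agency and role tie the grant number to the matching funding agency -->
  <grantNo agency='Bureau of Justice Statistics' role='data processing'>J-LEAA-018-77</grantNo>
</prodStmt>
```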
- Distributor Statement -- Data Collection
- <distStmt> 2.1.4 (Generic element A.6.4)
- Description: Distribution statement
for the data collection.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Distributor -- Data Collection,
Contact Person -- Data Collection,
Depositor -- Data Collection,
Date of Deposit -- Data Collection,
Date of Distribution -- Data Collection
- Distributor -- Data Collection
- <distrbtr> 2.1.4.1 (Generic element A.6.4.1)
- Description: The organization
designated by the author or producer to generate copies of a
particular data collection including any necessary editions or
revisions. Names and addresses may
be specified, and other archives may be co-distributors. A URI
attribute is included to provide a URN or URL to the ordering service
or download facility on a website.
- Example:
<distrbtr abbr='ICPSR' affiliation='Institute for Social
Research' URI='http://www.icpsr.umich.edu'>Ann Arbor, MI: Inter-university Consortium for Political
and Social Research</distrbtr>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr, affiliation, URI
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Contact Person -- Data Collection
- <contact> 2.1.4.2 (Generic element A.6.4.2)
- Description: Names and addresses of individuals responsible for the data collection. May be PIs. Individuals listed as contact persons will be used as resource persons regarding problems or questions raised by the user community. The URI attribute should be used to indicate a URN or URL for the homepage of the contact individual. The email attribute is used to indicate an email address for the contact individual.
- Example:
<contact affiliation='University of Wisconsin' email='jsmith@...'>Jane Smith</contact>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, affiliation, URI, email
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Depositor -- Data Collection
- <depositr> 2.1.4.3 (Generic element A.6.4.3)
- Description: The name of the person (or institution) who provided this data collection to the archive storing it.
- Example:
<depositr abbr='BJS' affiliation='U.S. Department of Justice'>Bureau of Justice Statistics</depositr>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr, affiliation
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Date of Deposit -- Data Collection
- <depDate> 2.1.4.4 (Generic element A.6.4.4)
- Description: The date that the data
collection was deposited with the archive that originally received
it. The ISO
standard for dates (YYYY-MM-DD) is recommended for use with the date
attribute.
- Example:
<depDate date='1999-01-25'>January 25, 1999</depDate>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, date
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Date of Distribution -- Data Collection
- <distDate> 2.1.4.5 (Generic element A.6.4.5)
- Description: The date that the data
collection was released for distribution. The ISO
standard for dates (YYYY-MM-DD) is recommended for use with the date
attribute.
- Example:
<distDate date='1999-01-25'>January 25, 1999</distDate>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, date
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Series Statement -- Data Collection
- <serStmt> 2.1.5 (Generic element A.6.5)
- Description: Series statement for the
data collection. The URI attribute is provided to point to a central
Internet repository of series information.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, URI
- Contains Elements:
Series Name -- Data Collection,
Series Information -- Data Collection
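A minimal sketch of a complete series statement, assembled from the two child-element examples in this section (the series information text is shortened). The URI attribute on <serStmt> is omitted here because this document does not name a repository address for the series.

```xml
<serStmt>
  <serName abbr='CPS'>Current Population Survey Series</serName>
  <serInfo>The Current Population Survey (CPS) is a household sample survey
  conducted monthly by the Census Bureau to provide estimates of employment,
  unemployment, and other characteristics of the general labor force.</serInfo>
</serStmt>
```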
- Series Name -- Data Collection
- <serName> 2.1.5.1 (Generic element A.6.5.1)
- Description: The name of the data series to which the collection belongs.
- Example:
<serName abbr='CPS'>Current Population Survey Series</serName>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Series Information -- Data Collection
- <serInfo> 2.1.5.2 (Generic element A.6.5.2)
- Description: Contains a history of
the data series and a summary of those features that apply to the
data series as a whole.
- Example:
<serInfo>The Current Population Survey (CPS) is
a household sample survey conducted monthly by the Census Bureau to
provide estimates of employment, unemployment, and other characteristics
of the general labor force, estimates of the population as a whole,
and estimates of various subgroups in the population. The entire
non-institutionalized population of the United States is sampled to
obtain the respondents for this survey series.</serInfo>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Version Statement -- Data Collection
- <verStmt> 2.1.6 (Generic element A.6.6)
- Description: Version statement for
the data collection.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Version -- Data Collection,
Version Responsibility Statement -- Data Collection,
Notes (Version) -- Data Collection
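The following sketch shows one way the three child elements might be combined in a version statement. The <version> and <notes> values come from the element examples in this section; the <verResp> value is an illustrative placeholder chosen to match the ICPSR edition, not a value from the original <verResp> example.

```xml
<verStmt>
  <version type='edition' date='1999-01-25'>Second ICPSR Edition</version>
  <!-- Illustrative value: the organization assumed responsible for this edition -->
  <verResp>Inter-university Consortium for Political and Social Research</verResp>
  <notes resp='Jane Smith'>Data for 1998 have been added to this version of the data collection.</notes>
</verStmt>
```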
- Version -- Data Collection
- <version> 2.1.6.1 (Generic element A.6.6.1)
- Description: Also known as release or
edition. If there have been substantive changes in the data collection
since its creation, this statement should be used. The ISO
standard for dates (YYYY-MM-DD) is recommended for use with the date
attribute.
Remarks: ICPSR distinguishes among the terms "release," "version," and
"edition" in the following ways:
- ICPSR Edition: Used only
for intensively processed collections, for which ICPSR has produced a
unique edition of the data. This usually involves checking for
undocumented codes and performing consistency checks. The term signals that additional
intellectual effort has gone into producing the collection.
- ICPSR Version: Used to indicate that
ICPSR has revised the format of a collection or added components
to it, in most cases without
changing any data values. A study is considered an "ICPSR version"
if one or more of these steps has been performed:
(1) Converting software-specific system files or export/transport
files to raw data;
(2) Generating SAS and/or SPSS data definition statements;
(3) Reformatting files, e.g., removing blanks to use space more
efficiently;
(4) Scanning hardcopy documentation; or
(5) Reformatting machine-readable documentation, e.g., converting
text created in a word-processing package to ASCII text.
- Release: Used for data collections that are
being disseminated exactly as they came from the data depositor
(except for the addition of an ICPSR cover and ICPSR front matter).
- Example:
<version type='edition' date='1999-01-25'>Second ICPSR Edition</version>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type (release, version, edition), date
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Version Responsibility Statement -- Data Collection
- <verResp> 2.1.6.2 (Generic element A.6.6.2)
- Description: Used to indicate the
organization or person responsible for the version of the data
collection.
- Example:
<verResp>Zentralarchiv fuer Empirische Sozialforschung</verResp>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, affiliation
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Notes (Version) -- Data Collection
- <notes> 2.1.6.3 (Generic element A.6.6.3)
- Description: Used to indicate
additional information regarding the version or the version
responsibility statement for the data collection, in particular to indicate what makes a new
version different from its predecessor. "Notes" sections appear in
several places in the DTD. The attributes for notes permit a
controlled vocabulary to be developed (type and subject), the level of
the DTD to which the note refers to be identified (study, file,
variable, etc.), and the author of the note to be indicated
(resp).
- Example:
<notes resp='Jane Smith'>Data for 1998 have been added to this version of the data collection.</notes>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other
element(s) within the codebook, reference to a table.
- Bibliographic Citation -- Data Collection
- <biblCit format='MRDF'> 2.1.7 (Generic element A.6.7)
- Description: Complete bibliographic reference containing all of the standard elements of a citation that can be used to cite the data collection. The "format" attribute is provided to enable specification of the particular citation style used, e.g. APA, MLA, Chicago, etc.
- Example:
<biblCit>Rabier, Jacques-Rene, and Ronald
Inglehart. EURO-BAROMETER 11: YEAR OF THE CHILD IN EUROPE, APRIL 1979
[Computer file]. Conducted by Institut Francais D'Opinion Publique
(IFOP), Paris, et al. ICPSR ed. Ann Arbor, MI: Inter-university
Consortium for Political and Social Research [producer and
distributor], 1981. </biblCit>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, format
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Holdings Information -- Data Collection
- <holdings> 2.1.8 (Generic element A.6.8)
- Description: Information concerning
either the physical or electronic holdings of the cited work. Attributes
include: location--The physical location where a copy is held;
callno--The call number for a work at the location specified; and
URI--A URN or URL for accessing the electronic copy of the cited
work.
- Example:
<holdings location='University of Michigan Graduate Library' callno='inap.'
URI='http://www.umich.edu/library/'>
Data File for Current Population Survey, 1999: Annual Demographic
File</holdings>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, location, callno, URI
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Notes (Citation) -- Data Collection
- <notes> 2.1.9 (Generic element A.4)
- Description: Used to indicate
additional information regarding the citation for the data collection.
"Notes" sections appear in
several places in the DTD. The attributes for notes permit a
controlled vocabulary to be developed (type and subject), the level of
the DTD to which the note refers to be identified (study, file,
variable, etc.), and the author of the note to be indicated
(resp).
- Example:
<notes resp='Jane Smith'>This citation was sent to ICPSR by the
agency depositing the data.</notes>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other
element(s) within the codebook, reference to a table.
Study Scope
Study Scope's Place within the Document Structure
Document
|
|---Document Description
|---Study Description
| |---Citation
| |---STUDY SCOPE
| |---Methodology And Processing (Study Level)
| |---Data Access
| |---Other Study Description Materials (Encoder-defined)
|
|---Data Files Description
|---Variables Description
|---Other Study-Related Materials
To comply with the Dublin Core, it is recommended that the following
elements in the Study Scope section be used when the appropriate
information is available:
DUBLIN CORE    DDI
-----------    ---------------------------------------
Subject        2.2.1.1 keyword (Keywords)
               2.2.1.2 topcClas (Topic Classification)
Description    2.2.2 abstract (Abstract)
Coverage       2.2.3.1 timePrd (Time Period Covered)
               2.2.3.2 collDate (Date of Collection)
               2.2.3.3 nation (Country)
               2.2.3.4 geogCover (Geographic Coverage)
- Study Scope
- <stdyInfo> 2.2
- Description: This section contains information about the data collection's
scope across several dimensions, including substantive content, geography,
and time.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: Subject Information,
Abstract,
Summary Data Description,
Notes
- Subject Information
- <subject> 2.2.1
- Description: Subject information describing the data collection's
intellectual content.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains:
Keyword,
Topic Classification
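As a brief sketch, a <subject> block combining both kinds of child element might look like the following. The values are taken from the Keyword and Topic Classification examples in this section; placing keywords before topic classifications follows the listing above and is an assumption about the DTD content model.

```xml
<subject>
  <keyword>quality of life</keyword>
  <keyword>family</keyword>
  <topcClas vocab='LOC Subject Headings'>Elections -- California</topcClas>
</subject>
```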
- Keyword
- <keyword> 2.2.1.1
- Description: Words or phrases that
describe salient aspects of a data collection's content. Can be used for
building keyword indexes and for classification and retrieval purposes. A
controlled vocabulary can be employed. Maps to Dublin Core
Subject. The vocab attribute is provided for specification of the
controlled vocabulary in use, e.g., LCSH, MeSH, etc. The vocabURI attribute
specifies the location for the full controlled vocabulary.
- Examples:
<keyword>quality of life</keyword>
<keyword>family</keyword>
<keyword>career goals</keyword>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, vocab, vocabURI
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Topic Classification
- <topcClas> 2.2.1.2
- Description: The classification field indicates
the broad substantive topic(s) that the data cover. Library of
Congress subject terms may be used here. The vocab attribute is
provided for specification of the controlled vocabulary in use, e.g.,
LCSH, MeSH, etc. The vocabURI attribute specifies the location for the
full controlled vocabulary. Maps to Dublin Core Subject.
- Examples:
<topcClas vocab='ICPSR Subject Headings'>Mass Political Behavior and Attitudes</topcClas>
<topcClas vocab='ICPSR Subject Headings'>Social Indicators</topcClas>
<topcClas vocab='LOC Subject Headings'>Public opinion -- California -- Statistics</topcClas>
<topcClas vocab='LOC Subject Headings'>Elections -- California</topcClas>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, vocab, vocabURI
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Abstract
- <abstract> 2.2.2
- Description: An unformatted summary
describing the purpose, nature, and scope of the
data collection, special characteristics of its contents, major
subject areas covered, and what questions the PIs attempted to answer
when they conducted the study. A listing of major variables in the
study is important here. In cases where a codebook contains more than one
abstract (for example, one might be supplied by the data producer and another
prepared by the data archive where the data are deposited), the source and
date attributes may be used to distinguish the abstract versions.
Maps to Dublin Core Description. Inclusion of this element is recommended.
Date attribute should follow ISO convention of YYYY-MM-DD.
- Example:
<abstract date = '1999-01-28' source='ICPSR'> Data on labor force activity for the week
prior to the survey are supplied in this collection. Information is
available on the employment status, occupation, and industry of
persons 15 years old and over. Demographic variables such as age, sex,
race, marital status, veteran status, household relationship,
educational background, and Hispanic origin are included. In addition
to providing these core data, the May survey also contains a
supplement on work schedules for all applicable persons aged 15 years
and older who were employed at the time of the survey. This supplement
focuses on shift work, flexible hours, and work at home for both main
and second jobs.</abstract>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, date
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Summary Data Description
- <sumDscr> 2.2.3
- Description: Information about a study's chronological and
geographic coverage and unit of analysis.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains:
Time Period Covered,
Date of Collection,
Country,
Geographic Coverage,
Geographic Unit,
Unit of Analysis,
Universe,
Kind of Data
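The sketch below shows how the child elements listed above might be combined in one <sumDscr>. Most values are reused from the individual element examples in this section, so together they do not describe one real study; the <nation> value is a hypothetical placeholder chosen to agree with the geographic coverage, and the element order follows the listing above as an assumption about the DTD content model.

```xml
<sumDscr>
  <!-- A start/end pair marks a span; a lone date would use event='single' -->
  <timePrd event='start' date='1998-05-01'>May 1, 1998</timePrd>
  <timePrd event='end' date='1998-05-31'>May 31, 1998</timePrd>
  <collDate event='single' date='1998-11-10'>10 November 1998</collDate>
  <!-- Hypothetical value, not from the source examples -->
  <nation abbr='U.S.'>United States</nation>
  <geogCover>State of California</geogCover>
  <geogUnit>state</geogUnit>
  <anlyUnit>individuals</anlyUnit>
  <universe level='study' clusion='I'>The resident
  population of the United States.</universe>
  <universe level='study' clusion='E'>Persons living in
  institutions and military barracks.</universe>
  <dataKind>survey data</dataKind>
</sumDscr>
```

Keeping the ISO-format date in the attribute and the human-readable form in the element content lets software sort on the former while readers see the latter.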
- Time Period Covered
- <timePrd> 2.2.3.1
- Description: The time period to which the data
refer. This item reflects the time period covered by the data, not the
dates of coding or making documents machine-readable or the dates the
data were collected. Also known as span. Use the event attribute to specify
"start", "end", or "single" for each date entered. The ISO
standard for dates (YYYY-MM-DD) is recommended for use with the date
attribute. Maps to Dublin Core Coverage. Inclusion of this element is recommended.
- Examples:
<timePrd event='start' date='1998-05-01'>May 1, 1998</timePrd>
<timePrd event='end' date='1998-05-31'>May 31, 1998</timePrd>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, event, date
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Date of Collection
- <collDate> 2.2.3.2
- Description: Contains the date(s) when the data were collected. Use the event
attribute to specify "start", "end", or
"single" for each date entered to distinguish between, for example,
the first day of collection (start), only day of collection (single),
and last day of collection (end). The ISO standard for dates
(YYYY-MM-DD) is recommended for use with the date attribute.
Maps to Dublin Core Coverage. Inclusion of this element in the codebook is recommended.
- Example:
<collDate event='single' date='1998-11-10'>10 November 1998</collDate>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, event, date
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Country
- <nation> 2.2.3.3
- Description: Indicates the country or countries
covered in the file. Attribute "abbr" may be used to match the
attributes given to agencies, etc. and to provide an equivalent to the TEI
placePart entity, which adds "type" and "full" attributes.
Maps to Dublin Core Coverage. Inclusion of this element is recommended.
- Example:
<nation abbr='U.K.'>United Kingdom</nation>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Geographic Coverage
- <geogCover> 2.2.3.4
- Description: Information on the geographic
coverage of the data. Include the total geographic scope of
the data, and any additional levels of geographic coding provided in
the variables. Maps to Dublin Core Coverage. Inclusion of this element is recommended.
- Example:
<geogCover>State of California</geogCover>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA.
- Geographic Unit
- <geogUnit> 2.2.3.5
- Description: Lowest level of geographic aggregation covered by the data.
- Example:
<geogUnit>state</geogUnit>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Unit of Analysis
- <anlyUnit> 2.2.3.6
- Description: Basic unit of analysis or observation
that the file describes: individuals, families/households, groups,
institutions/organizations, administrative units, etc. The "unit" attribute
is included to permit the development of a controlled vocabulary for this
element.
- Example:
<anlyUnit>individuals</anlyUnit>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, unit
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Universe
- <universe> 2.2.3.7
- Description: A description of the population
covered by the data in the file; the group of persons or other
elements that are the object of the study and to which the study
results refer. Age, nationality, and residence commonly help to
delineate a given universe, but any of a number of factors may be
involved, such as age limits, sex, marital status, race, ethnic group,
nationality, income, veteran status, criminal convictions, etc. The
universe may consist of elements other than persons, such as housing
units, court cases, deaths, countries, etc. In general, it should be
possible to tell from the description of the universe whether a given
individual or element (hypothetical or real) is a member of the
population under study. Also known as universe of interest, population
of interest, and target population. A "level" attribute is included to
permit coding of the level to which universe applies, i.e., the study
level, the file level (if different from study), or the variable level.
The "clusion" attribute provides for specification of groups included (I) in
or excluded (E) from the universe.
- Example:
For a universe that excludes persons living in institutions or military
barracks:
<universe level='study' clusion='I'>The resident
population of the United States.</universe>
<universe level='study' clusion='E'>Persons living in
institutions and military barracks.</universe>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, level, clusion
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Kind of Data
- <dataKind> 2.2.3.8
- Description: The type of data included in the file: survey data, census/enumeration
data, aggregate data, clinical data, event/transaction data, program
source code, machine-readable text, administrative records data,
experimental data, psychological test, textual data, coded textual,
coded documents, time budget diaries, observation data/ratings,
process-produced data, etc.
- Example:
<dataKind>survey data</dataKind>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Notes
- <notes> 2.2.4 (Generic element A.4)
- Description: Used to indicate additional information
regarding the scope of a data collection. "Notes" sections appear in several places in
the DTD. The attributes for notes permit a controlled vocabulary to be
developed (type and subject), the level of the DTD to which the note
refers to be identified (study, file, variable, etc.), and the author
of the note to be indicated (resp).
- Example:
<notes>Data on employment and income refer to the
preceding year, although demographic data refer to the time of the
survey.</notes>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other
element(s) within the codebook, reference to a table.
Study Level Methodology and Processing
Document
|
|---Document Description
|---Study Description
| |
| |---Citation
| |---Study Scope
| |---METHODOLOGY AND PROCESSING
| |---Data Access
| |---Other Study Description Materials
|
|---Data Files Description
|---Variables Description
|---Other Study-Related Materials
Methodology and Processing
- <method> 2.3
- Description: This section describes the methodology and processing
involved in a data collection.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: Data Collection Methodology,
Notes,
Data Appraisal,
Study Status
- Data Collection Methodology
- <dataColl> 2.3.1
- Description: Information about the methodology employed in a
data collection.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: Time Method,
Data Collector,
Frequency,
Sampling Procedure,
Major Deviations from Sample Design,
Mode of Data Collection,
Type of Research Instrument,
Sources Statement,
Characteristics of the Data Collection Situation,
Actions to Minimize Losses,
Control Operations,
Weighting,
Cleaning Operations
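A minimal sketch of a <dataColl> block using a subset of the child elements listed above. The values are drawn from the individual element examples in this section and are combined for illustration only; the element order follows the listing above, which is an assumption about the DTD content model.

```xml
<dataColl>
  <timeMeth>time-series</timeMeth>
  <dataCollector abbr='SRC' affiliation='University of Michigan'>Survey Research
  Center</dataCollector>
  <frequenc>monthly</frequenc>
  <sampProc>Stratified random sample</sampProc>
  <collMode>telephone interviews</collMode>
  <resInstru>structured</resInstru>
  <cleanOps>Checks for undocumented codes were performed, and data were
  subsequently revised in consultation with the principal investigator.</cleanOps>
</dataColl>
```

All child elements are optional, so a block may carry only those for which information is available.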
- Time Method
- <timeMeth> 2.3.1.1
- Description: The time method or time dimension of
the data collection. The "method" attribute is included to permit the
development of a controlled vocabulary for this element.
- Examples:
<timeMeth>panel survey</timeMeth>
<timeMeth>cross-section</timeMeth>
<timeMeth>trend study</timeMeth>
<timeMeth>time-series</timeMeth>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, method
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Data Collector
- <dataCollector> 2.3.1.2
- Description: The entity (individual, agency, or
institution) responsible for administering the questionnaire or
interview or compiling the data. This refers to the entity collecting the data,
not to the entity producing the documentation.
- Example:
<dataCollector abbr='SRC' affiliation='University of Michigan'>Survey Research
Center</dataCollector>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, abbr, affiliation
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Frequency of Data Collection
- <frequenc> 2.3.1.3
- Description: If the data collected include more
than one point in time, indicate the frequency with which the data
were collected. The "freq" attribute is included to permit the
development of a controlled vocabulary for this element.
- Examples:
<frequenc>monthly</frequenc>
<frequenc>quarterly</frequenc>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, freq
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Sampling Procedure
- <sampProc> 2.3.1.4
- Description: The type of sample and sample design
used to select the survey respondents to represent the population.
May include reference to the target
sample size and the sampling fraction.
- Examples:
<sampProc>National multistage area probability sample</sampProc>
<sampProc>Simple random sample</sampProc>
<sampProc>Stratified random sample</sampProc>
<sampProc>Quota sample</sampProc>
<sampProc>The 8,450 women interviewed for the NSFG, Cycle IV, were drawn from
households in which someone had been
interviewed for the National Health Interview Survey (NHIS),
between October 1985 and March 1987.</sampProc>
<sampProc>Samples sufficient to produce approximately 2,000 families with
completed interviews were drawn in each state.
Families containing one or more Medicaid or uninsured persons were
oversampled.</sampProc>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Major Deviations from the Sample Design
- <deviat> 2.3.1.5
- Description: Show correspondence as well as
discrepancies between the sampled units (obtained) and
available statistics for the population (age, sex-ratio, marital
status, etc.) as a whole.
- Example:
<deviat>The suitability of Ohio as a research site reflected
its similarity to the United States as a whole. The evidence extended by
Tuchfarber (1988) shows that Ohio is representative of the United States in
several ways: percent urban and rural, percent of the population
that is African-American, median age, per capita income, percent living
below the poverty level, and unemployment rate. Although results
generated from an Ohio sample are not empirically generalizable to the
United States, they may be suggestive of what might be expected
nationally.</deviat>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Mode of Data Collection
- <collMode> 2.3.1.6
- Description: The method used to collect the data;
instrumentation characteristics.
- Examples:
<collMode>telephone interviews</collMode>
<collMode>face-to-face interviews</collMode>
<collMode>mail questionnaires</collMode>
<collMode>computer-aided telephone interviews (CATI)</collMode>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Type of Research Instrument
- <resInstru> 2.3.1.7
- Description: The type of data collection instrument
used. "Structured" indicates an
instrument in which all respondents are asked the same
questions/tests, possibly with precoded answers. If a small
portion of such a questionnaire includes open-ended questions,
provide appropriate comments.
"Semi-structured" indicates that the research instrument contains
mainly open-ended questions. "Unstructured" indicates that in-depth
interviews were conducted. The "type" attribute is included to permit the
development of a controlled vocabulary for this element.
- Example:
<resInstru>structured</resInstru>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, type
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Sources Statement
- <sources> 2.3.1.8
- Description: Describes the sources used for the
data collection.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA or Data Sources,
Origins of Sources,
Characteristics of Sources Noted,
Documentation/Access to Sources
- Data Sources
- <dataSrc> 2.3.1.8.1
- Description: Used to list
the book(s), article(s), serial(s), and/or machine-readable data
file(s)--if any--that served as the source(s) of the data collection.
- Examples:
<dataSrc>"Voting Scores." CONGRESSIONAL QUARTERLY ALMANAC 33 (1977),
487-498.</dataSrc>
<dataSrc>United States Internal Revenue Service Quarterly Payroll File</dataSrc>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Origins of Sources
- <srcOrig> 2.3.1.8.2
- Description: For historical materials, information
about the origin(s) of the sources and the rules followed in establishing
the sources should be specified. May not be relevant to survey data.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Characteristics of Sources Noted
- <srcChar> 2.3.1.8.3
- Description: Assessment of
characteristics and quality of source material. May not be relevant to
survey data.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Documentation/Access to Sources
- <srcDocu> 2.3.1.8.4
- Description: Level of documentation of
the original sources. May not be relevant to survey data.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Sources
- <sources> 2.3.1.8.5
- No element or attribute declaration here,
as this element is simply a recursive declaration within sources 2.3.1.8.
- Characteristics of the Data Collection Situation
- <collSitu> 2.3.1.9
- Description: Used to describe noteworthy aspects of
the data collection situation. Include information on factors such as
cooperativeness of respondents, duration of interviews, number of
call-backs, etc.
- Example:
<collSitu>There were 1,194 respondents who answered questions in
face-to-face interviews lasting approximately 75 minutes each.</collSitu>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Actions to Minimize Losses
- <actMin> 2.3.1.10
- Description: Summary of actions taken to minimize
data loss. Include information on actions such as
follow-up visits, supervisory checks, historical matching, estimation,
etc.
- Example:
<actMin>To minimize the number of unresolved cases and reduce the potential
nonresponse bias, four follow-up contacts were made with agencies that had not responded by
various stages of the data collection process.</actMin>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Control Operations
- <ConOps> 2.3.1.11
- Description: Methods to facilitate data control
performed by the primary investigator or by the data archive. Specify any special programs
used for such operations. The "agency" attribute may be used to refer to the agency that
performed the control operation.
- Example:
<ConOps source='ICPSR'>Ten percent of data entry forms
were reentered to check for accuracy.</ConOps>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, agency
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Weighting
- <weight> 2.3.1.12
- Description: The use of sampling procedures may
make it necessary to apply weights to produce accurate statistical
results. Describe here the criteria for using weights in analysis of
a collection. If a weighting formula or coefficient was developed,
provide this formula, define its elements, and indicate how the
formula is applied to data.
- Example:
<weight>The 1996 NES dataset includes two final person-level
analysis weights which incorporate sampling, nonresponse, and
post-stratification factors. One weight (variable #4) is for
longitudinal micro-level analysis using the 1996 NES Panel. The other
weight (variable #3) is for analysis of the 1996 NES combined sample
(Panel component cases plus Cross-section supplement cases). In
addition, a Time Series Weight (variable #5) which corrects for Panel
attrition was constructed. This weight should be used in analyses
which compare the 1996 NES to earlier unweighted National Election
Study data collections.</weight>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Cleaning Operations
- <cleanOps> 2.3.1.13
- Description: Methods used to "clean" the data collection,
e.g., consistency checking, wildcode checking, etc.
- Example:
<cleanOps>Checks for undocumented codes were performed, and data were
subsequently revised in consultation with the principal investigator.</cleanOps>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Notes
- <notes> 2.3.2 (Generic Element A.4)
- Description: Used to indicate additional information about
the methodology and processing involved in a collection. Include error notes
here. "Notes" sections appear in several places in
the DTD. The attributes for notes permit a controlled vocabulary to be
developed (type and subject), the level of the DTD to which the note
refers to be identified (study, file, variable, etc.), and the author
of the note to be indicated (resp).
- Example:
<notes>Undocumented codes were found in this data collection. Missing data are
represented by blanks.</notes>
<notes>For this collection, which focuses on employment, unemployment, and gender
equality, data from EUROBAROMETER 44.3: HEALTH CARE ISSUES AND PUBLIC
SECURITY, FEBRUARY-APRIL 1996 (ICPSR 6752) were merged with an oversample.</notes>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other element(s) within
the codebook, reference to a table.
- Data Appraisal Information
- <anlyInfo> 2.3.3
- Description: Information on data appraisal.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains:
Response Rates,
Estimates of Sampling Error,
Other Forms of Data Appraisal
- Response Rates
- <respRate> 2.3.3.1
- Description: The percentage of sample members who
provided information.
- Examples:
<respRate>For 1993, the estimated inclusion
rate for TEDS-eligible providers was 91 percent, with the inclusion
rate for all treatment providers estimated at 76 percent (including
privately and publicly funded providers).</respRate>
<respRate>The overall response rate was 82%, although retail firms with an annual sales volume of more than $5,000,000
were somewhat less likely to respond.</respRate>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Estimates of Sampling Error
- <EstSmpErr> 2.3.3.2
- Description: Measure of how precisely one can
estimate a population value from a given sample.
- Example:
<EstSmpErr> To assist NES
analysts, the PC SUDAAN program was used to compute sampling errors
for a wide-ranging example set of proportions estimated from the 1996
NES Pre-election Survey dataset. For each estimate, sampling errors
were computed for the total sample and for twenty demographic and
political affiliation subclasses of the 1996 NES Pre-election Survey
sample. The results of these sampling error computations were then
summarized and translated into the general usage sampling error table
provided in Table 11. The mean value of deft, the square root of the
design effect, was found to be 1.346. The design effect was primarily
due to weighting effects (Kish, 1965) and did not vary significantly
by subclass size. Therefore the generalized variance table is
produced by multiplying the simple random sampling standard error for
each proportion and sample size by the average deft for the set of
sampling error computations.</EstSmpErr>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Other Forms of Data Appraisal
- <dataAppr> 2.3.3.3
- Description: Other issues pertaining to data
appraisal. Describe here issues such as response variance, nonresponse
rate and testing for bias, interviewer and response bias, confidence
levels, question bias, etc.
- Example:
<dataAppr>These
data files were obtained from the United States House of
Representatives, who received them from the Census Bureau accompanied
by the following caveats: ''The numbers contained herein are not
official 1990 decennial Census counts. The numbers represent estimates
of the population based on a statistical adjustment method applied to
the official 1990 Census figures using a sample survey intended to
measure overcount or undercount in the Census results. On July 15,
1991, the Secretary of Commerce decided not to adjust the official
1990 decennial Census counts (see 56 Fed. Reg. 33582, July 22,
1991). In reaching his decision, the Secretary determined that there
was not sufficient evidence that the adjustment method accurately
distributed the population across and within states. The numbers
contained in these tapes, which had to be produced prior to the
Secretary's decision, are now known to be biased. Moreover, the tapes
do not satisfy standards for the publication of Federal statistics, as
established in Statistical Policy Directive No. 2, 1978, Office of
Federal Statistical Policy and Standards. Accordingly, the Department
of Commerce deems that these numbers cannot be used for any purpose
that legally requires use of data from the decennial Census and
assumes no responsibility for the accuracy of the data for any purpose
whatsoever. The Department will provide no assistance in
interpretation or use of these numbers.''</dataAppr>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Class or Status of the Study
- <stdyClas> 2.3.4
- Description: Generally used to give the data archive's class or study status number, which indicates the processing status of the study.
May also be used as a text field to describe processing status.
- Examples:
<stdyClas>ICPSR Class II</stdyClas>
<stdyClas>DDA Class C</stdyClas>
<stdyClas>Available from the DDA. Being processed. </stdyClas>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type
- Contains: #PCDATA, Link to other element(s) within the codebook.
Data Access
Data Access's Place within the Document Structure
Document
|
|---Document Description
|---Study Description
| |
| |---Citation
| |---Study Scope
| |---Methodology and Processing
| |---DATA ACCESS
| |---Other Study Description Materials
|
|---Data Files Description
|---Variable Description
|---Other Study-Related Materials
- <dataAccs> 2.4
- This section describes access conditions and terms of use
for the data collection. Where access conditions differ across
individual files or variables, multiple access conditions can
be specified. The access conditions that apply
to a study, file, variable group, or variable are indicated
by an IDREF attribute called "access" on the study (2.0), file (3.0), variable group
(4.1), or variable (4.2) element.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Data Collection Availability,
Data Use Statement,
Notes
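The linkage described above, in which a file's "access" attribute points back to the ID of a Data Access section, can be sketched in Python as follows. This is a minimal illustration, not part of the DTD: the root element name "codeBook", the ID values, and the notes text are assumptions made for the example, and the attribute value is treated as a whitespace-separated list of IDs.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the root name "codeBook", the ID values, and the
# notes text are illustrative assumptions, not taken from a real codebook.
SAMPLE = """
<codeBook>
  <stdyDscr>
    <dataAccs ID='acc_public'><notes>Public-use files.</notes></dataAccs>
    <dataAccs ID='acc_restricted'><notes>Restricted-use files.</notes></dataAccs>
  </stdyDscr>
  <fileDscr ID='F1' access='acc_public'/>
  <fileDscr ID='F2' access='acc_restricted'/>
</codeBook>
"""


def access_for_files(xml_text):
    """Map each fileDscr ID to the dataAccs IDs named by its "access" attribute."""
    root = ET.fromstring(xml_text)
    known = {el.get("ID") for el in root.iter("dataAccs")}
    result = {}
    for f in root.iter("fileDscr"):
        # Assumption: several IDs may be listed, separated by whitespace.
        refs = f.get("access", "").split()
        # Keep only references that resolve to a dataAccs element.
        result[f.get("ID")] = [r for r in refs if r in known]
    return result


print(access_for_files(SAMPLE))
```

A reference that does not match any dataAccs ID is dropped here; a stricter tool might report it as an error instead.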
- Data Collection Availability
- <setAvail> 2.4.1
- Information on availability and storage of the
collection. The "media" attribute may be used in combination with any
of the subelements. See Location of Data Collection below.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, media
- Contains Elements:
Location of Data Collection,
Original Archive Where Collection Stored,
Availability Status,
Extent of Collection,
Completeness of Collection Stored,
Number of Files,
Notes
- Location of Data Collection
- <accsPlac> 2.4.1.1
- Location where the data collection is
currently stored. Use the URI attribute to provide a URN or URL for
the storage site or the actual address from which the data may be
downloaded.
Examples:
<setAvail media='CDROM'>
<accsPlac URI='http://www.icpsr.umich.edu'>Inter-university
Consortium for Political and Social Research</accsPlac>
</setAvail>
<setAvail media='online'>
<accsPlac URI='http://www.ssd.gu.se/'>Swedish Social Science Data Service
</accsPlac>
</setAvail>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, URI
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Original Archive Where
Collection Stored
- <origArch> 2.4.1.2
- Archive from which the data collection was
obtained; the originating archive.
Example:
<origArch>Zentralarchiv fuer empirische Sozialforschung</origArch>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Availability Status
- <avlStatus> 2.4.1.3
- Statement of collection availability.
An archive may need to indicate that a collection is unavailable because it
is embargoed for a period of time, because it has been superseded, because a
new edition is imminent, etc. It is anticipated that a controlled vocabulary
will be developed for this element.
Example:
<avlStatus>This
collection is superseded by CENSUS OF POPULATION, 1880 [UNITED STATES]: PUBLIC USE SAMPLE
(ICPSR 6460).</avlStatus>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Extent of Collection
- <collSize> 2.4.1.4
- Summarizes the number of physical
files that exist in a collection, recording the number of files that
contain data and noting whether the collection contains
machine-readable documentation and/or other supplementary files and
information such as data dictionaries, data definition statements, or
data collection instruments.
Example:
<collSize>1 data file +
machine-readable documentation (PDF) + SAS data definition
statements</collSize>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Completeness of Collection Stored
- <complete> 2.4.1.5
- This item indicates the relationship
of the data collected to the amount of data coded and stored in the
data collection. Information as to why certain items of collected
information were not included in the data file stored by the archive
should be provided.
Example:
<complete>Because of embargo provisions, data values for some
variables have been masked. Users should consult the data definition
statements to see which variables are under embargo. A new version of
the collection will be released by ICPSR after embargoes are
lifted.</complete>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Number of Files
- <fileQnty> 2.4.1.6
- Total number of physical files associated with a
collection.
Example:
<fileQnty> 5 files</fileQnty>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Notes
- <notes> 2.4.1.7 (Generic element A.4)
- Indicate additional information regarding
data availability. "Notes" sections appear in several places in
the DTD. The attributes for notes permit a controlled vocabulary to be
developed (type and subject), the level of the DTD to which the note
refers to be identified (study, file, variable, etc.), and the author
of the note to be indicated (resp).
Example:
<notes> Data from the Bureau of Labor Statistics used in
the analyses for the final report are not provided as part of this
collection.</notes>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other
element(s) within the codebook, reference to a table.
- Data Use Statement
- <useStmt> 2.4.2
- Information on terms of use for the data collection.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Confidentiality Declaration,
Special Permissions,
Restrictions,
Access Authority,
Citation Requirement,
Deposit Requirement,
Access Conditions,
Disclaimer
- Confidentiality Declaration
- <confDec> 2.4.2.1
- This element is used to determine if signing of a
confidentiality declaration is needed to access a resource.
The "required" attribute is used to aid machine processing of
this element, and the default specification is "yes".
The "formNo" attribute indicates the number or ID of
the form that the user must fill out. The "URI" attribute may
be used to provide a URN or URL for online access to a
confidentiality declaration form.
Examples:
<confDec formNo='1'>To download this dataset,
the user must sign a declaration of
confidentiality.</confDec>
<confDec URI='http://www.icpsr.umich.edu/HMCA/CTSform/contents.html'>
To obtain this dataset,
the user must complete a Restricted Data Use Agreement.</confDec>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, required, formNo, URI
- Contains: #PCDATA, Link to other element(s) within the codebook.
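As a rough sketch of the machine processing mentioned above, the following Python reads the "required", "formNo", and "URI" attributes from the first example. The function name is hypothetical. Because Python's ElementTree does not apply attribute defaults declared in a DTD, the sketch supplies the documented default of "yes" itself.

```python
import xml.etree.ElementTree as ET

# First confDec example from this entry, reproduced as-is.
EXAMPLE = """<confDec formNo='1'>To download this dataset,
the user must sign a declaration of
confidentiality.</confDec>"""


def conf_dec_info(xml_text):
    """Return the confDec attributes, falling back to required='yes'."""
    el = ET.fromstring(xml_text)
    return {
        # Documented default when the attribute is absent.
        "required": el.get("required", "yes"),
        "formNo": el.get("formNo"),
        "URI": el.get("URI"),
        # Collapse line breaks in the element text to single spaces.
        "text": " ".join(el.text.split()) if el.text else "",
    }


info = conf_dec_info(EXAMPLE)
print(info)
```

The same approach would apply to the Special Permissions element, which carries the same three attributes.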
- Special Permissions
- <specPerm> 2.4.2.2
- This element is used to determine if any special
permissions are required to access a resource.
The "required" attribute is used to aid machine processing of
this element, and the default specification is "yes".
The "formNo" attribute indicates the number or ID of
the form that the user must fill out. The "URI" attribute may
be used to provide a URN or URL for online access to a
special permissions form.
Example:
<specPerm formNo='4'>The user must
apply for special permission to use this dataset
locally and must complete a confidentiality form.</specPerm>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, required, formNo, URI
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Restrictions
- <restrctn> 2.4.2.3
- Any restrictions on access to or use of the
collection such as privacy certification or distribution restrictions should
be indicated here. These can
be restrictions applied by the author, producer, or disseminator of the
data collection. If the data are restricted to only a certain class of user, specify
which type.
Examples:
<restrctn> In preparing the data file(s) for this collection, the National
Center for Health Statistics (NCHS) has removed direct identifiers and
characteristics that might lead to identification of data subjects. As
an additional precaution NCHS requires, under Section 308(d) of the
Public Health Service Act (42 U.S.C. 242m), that data collected by
NCHS not be used for any purpose other than statistical analysis and
reporting. NCHS further requires that analysts not use the data to
learn the identity of any persons or establishments and that the
director of NCHS be notified if any identities are inadvertently
discovered. ICPSR member institutions and other users ordering data
from ICPSR are expected to adhere to these restrictions.</restrctn>
<restrctn>
ICPSR obtained these data from the World Bank under the terms of a
contract which states that the data are for the sole use of ICPSR and
may not be sold or provided to third parties outside of ICPSR
membership. Individuals at institutions that are not members of the
ICPSR may obtain these data directly from the World Bank.</restrctn>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Access Authority
- <contact> 2.4.2.4 (Generic element A.6.4.2)
- Contact person or organization (with
full address and telephone number, if available) that controls access
to a collection, if different from the data distributor. The "URI" attribute
should be used to indicate a URN or URL for the homepage of the contact
individual. Similarly, the "email" attribute is used to indicate an email
address for the contact individual.
Example:
<contact affil='University of Copenhagen' URI='http://www.etc.'
email='smith@etc.'>The data are
available from the principal investigators, Dr. Smith and Dr. Jones,
at the Sociological Institute, Linnesgade
22, 4. DK-1361 Copenhagen K.</contact>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, affil, URI, email
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Citation Requirement
- <citReq> 2.4.2.5
- Text of requirement that a data
collection should be cited properly in
articles or other publications that are based on analysis of the data.
Example:
<citReq>Publications based on ICPSR data collections should
acknowledge those sources by means of bibliographic
citations. To ensure that such source attributions are
captured for social science bibliographic utilities,
citations must appear in footnotes or in the reference
section of publications.</citReq>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Deposit Requirement
- <deposReq> 2.4.2.6
- Information regarding user responsibility
for informing archives of their use of data through providing citations to
the published work or providing copies of the manuscripts.
Example:
<deposReq>
To provide funding agencies with essential information about use of
archival resources and to facilitate the exchange of information about
ICPSR participants' research activities, users of ICPSR data are
requested to send to ICPSR bibliographic citations for, or copies of,
each completed
manuscript or thesis abstract. Please indicate in a cover letter which
data were used.</deposReq>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Access Conditions
- <conditions> 2.4.2.7
- Indicate any additional information
that will assist the user in understanding the access conditions of
the data collection.
Example:
<conditions>The data are available without
restriction. Potential users of these datasets
are advised, however, to contact the original principal investigator
Dr. J. Smith (Institute for Social Research, The University of
Michigan, Box 1248, Ann Arbor, MI 48106), about their intended uses of
the data. Dr. Smith would also appreciate
receiving copies of reports based on the datasets.</conditions>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Disclaimer
- <disclaimer> 2.4.2.8
- Information regarding responsibility for
uses of the data collection.
Example:
<disclaimer>The original collector
of the data, ICPSR, and the relevant funding agency bear no responsibility
for uses of this collection or for interpretations or inferences based
upon such uses.</disclaimer>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Notes
- <notes> 2.4.3 (Generic element A.4)
- Indicate within this item any
additional information about access and data use. "Notes" sections
appear in several places in the DTD. The attributes for notes permit a
controlled vocabulary to be developed (type and subject), the level of
the DTD to which the note refers to be identified (study, file,
variable, etc.), and the author of the note to be indicated
(resp).
Example:
<notes>Users should note that this is a beta version of the data. The
investigators therefore request that users who encounter any problems
with the dataset contact them at the above address.</notes>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, type, subject,
level, resp
- Contains: #PCDATA, Link to other
element(s) within the codebook, reference to a table.
Other Study Description Materials
Other Study Description Materials' Place within the Document Structure
Document
|
|---Document Description
|---Study Description
| |
| |---Citation
| |---Study Scope
| |---Methodology and Processing
| |---Data Access
| |---OTHER STUDY DESCRIPTION MATERIALS
|
|---Data Files Description
|---Variable Description
|---Other Study-Related Materials
- This section describes other materials related to the
study description that primarily describe the content and
use of the study, such as appendices, sampling information, weighting
details, methodological and technical details, publications based upon
the study content, related studies or collections of studies, etc.
- This section may point to other materials related to the
description of the study through use of the generic citation element (A.6),
which is available for each element in this section.
- Note that Section 5.0, Other Study-Related Materials, should be
used for materials used in the production of the study or useful in
the analysis of the study. The materials in Section 5.0 may be
entered as PCDATA (ASCII text) directly into the document (through use of
the txt element). That
section may also serve as a "container" for other machine-readable
materials by providing a brief description of the study-related
materials accompanied by the "type" and "level" attributes further defining
the materials. Other Study-Related Materials in Section 5.0 may include:
questionnaires, coding notes, SPSS/SAS/STATA setups (and others), user manuals,
continuity guides, sample computer software programs, glossaries of
terms, interviewer/project instructions, maps, database schema, data
dictionaries, show cards, coding information, interview schedules,
missing values information, frequency files, variable maps, etc.
- <othrStdyMat> 2.5
- Description: Other materials relating to the study description.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: Related Material,
Related Study,
Related Publication,
Other References Note
- Related Material
- <relMat> 2.5.1
- Description: Describes materials related to the study description, such as appendices, additional information
on sampling found in other documents, etc. Can take the form of bibliographic citations. This element
can contain either PCDATA or a citation or both, and there can be multiple occurrences of both the
citation and PCDATA within a single element. May consist of a single URI or a series of URIs
comprising a series of citations/references to external
materials which can be objects as a whole (journal articles) or parts of
objects (chapters or appendices in articles or documents).
- Examples:
<relMat> Full details on the research design and procedures, sampling
methodology, content areas, and questionnaire design, as well as
percentage distributions by respondent's sex, race, region, college
plans, and drug use, appear in the annual ISR volumes MONITORING THE
FUTURE: QUESTIONNAIRE RESPONSES FROM THE NATION'S HIGH SCHOOL
SENIORS.</relMat>
<relMat>Current Population Survey, March 1999: Technical Documentation
includes an abstract, pertinent information about the file, a glossary, code
lists, and a data dictionary. One copy accompanies each file order. When ordered
separately, it is available from Marketing Services Office, Customer Service
Center, Bureau of the Census, Washington, D.C. 20233. </relMat>
<relMat>A more precise explanation regarding the CPS sample design is
provided in Technical Paper 40, The Current Population Survey: Design and
Methodology. Chapter 5 of this paper provides documentation on the weighting
procedures for the CPS both with and without supplement questions.</relMat>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA or Citation structure (See
2.1 of Study Description Section), Link to other element(s) within the
codebook.
- Related Study
- <relStdy> 2.5.2
- Description: Information on the relationship of
the current data collection to others (e.g., predecessors, successors,
other waves or rounds) or to other editions of the same file. This
would include the names of additional data collections generated from
the same data collection vehicle plus other collections directed at
the same general topic. Can take the form of bibliographic citations.
- Example:
<relStdy>ICPSR distributes a companion study to this collection titled FEMALE
LABOR FORCE PARTICIPATION AND MARITAL INSTABILITY, 1980: [UNITED
STATES] (ICPSR 9199).</relStdy>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA or Citation structure
(See 2.1 of Study Description Section), Link to other element(s) within the codebook.
- Related Publication
- <relPubl> 2.5.3
- Description: Bibliographic and access information
about articles and reports based on the data in this collection. Can take the form of bibliographic citations.
- Examples:
<relPubl>Economic Behavior Program Staff. SURVEYS OF CONSUMER FINANCES. Annual
volumes 1960 through 1970. Ann Arbor, MI: Institute for Social
Research.</relPubl>
<relPubl>Data from the March Current Population Survey are published most
frequently in the Current Population Reports P-20 and P-60 series. These
reports are available from the Superintendent of Documents, U.S. Government
Printing Office, Washington, DC 20402. They also are available on the INTERNET
at http://www.census.gov. Forthcoming reports will be cited in Census and
You, the Monthly Product Announcement (MPA), and the Bureau of the Census
Catalog and Guide. </relPubl>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA or Citation structure (See 2.1 of Study
Description Section), Link to other element(s) within the codebook.
- Other References Note
- <othRefs> 2.5.4
- Description: Indicate here other pertinent references. Can take the form of bibliographic citations.
- Example:
<othRefs>Part II of the documentation, the Field Representative's Manual, is
provided in hardcopy form only.</othRefs>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA or Citation structure (See 2.1 of Study
Description Section), Link to other element(s) within the codebook.
Data Files Description
Document
|
|---Document Description
|---Study Description
|---DATA FILES DESCRIPTION
|---Variable Description
|---Other Study-Related Materials
The File Description consists of information about the particular data
file(s) containing numeric and/or numeric + textual information that
the DDI-compliant file describes. This section consists of items
describing the characteristics and contents of file(s) that comprise
the study as described in the Study Description. There may be
multiple file descriptions if there are multiple files in the
collection.
- <fileDscr> 3.0
- Description: This section can be repeated for collections
with multiple files.
- The "URI" attribute may be a URN or a URL that can be used to retrieve
the file.
- The "sdatrefs" are summary data description references that
record the ID values of all elements within the summary data
description section of the Study Description
that might apply to the file. These elements
include: time period covered, date of collection, nation or
country, geographic coverage, geographic unit, unit of
analysis, universe, and kind of data.
- The "methrefs" are methodology and processing references
that record the ID values of all elements within the study
methodology and processing section of the Study Description
that might apply to the
file. These elements include information on data collection
and data appraisal (e.g., sampling, sources, weighting, data
cleaning, response rates, and sampling error estimates).
- The "pubrefs" attribute provides a link to publication/citation references
and records the ID values of all citation elements within Section 2.5
or Section 5.0 that pertain to this file.
- The "access" attribute records the ID values of all elements in Section 2.4
of the document that describe access conditions for this file.
- Remarks: When a codebook documents two different physical
instantiations of a data file, e.g., logical record length (or OSIRIS)
and card-image version, the Data File Description (3.0) should be
repeated to describe the two separate files. An ID should be assigned
to each file so that in the Variable section (4.0) the location of
each variable on the two files can be distinguished using the unique
file IDs.
- Examples:
<fileDscr ID='card'
URI='www.icpsr.umich.edu/cgi-bin/archive.prl?path=ICPSR&amp;num=7728'/>
<fileDscr ID='lrecl'
URI='www.icpsr.umich.edu/cgi-bin/archive.prl?path=ICPSR&amp;num=7728'/>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source,
URI, sdatrefs, methrefs, pubrefs, access
- Contains Elements:
File Description,
Notes
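The use of unique file IDs described in the Remarks can be illustrated with a short Python sketch that indexes each Data Files Description by its ID. The wrapper element "codeBook" and the function name are assumptions added for the example. Note that an ampersand inside an attribute value must be written as &amp; for an XML parser to accept it, so the URIs are escaped accordingly here.

```python
import xml.etree.ElementTree as ET

# The wrapper element "codeBook" is an assumption added so that the two
# fileDscr elements share a single root. Ampersands are escaped as &amp;.
SAMPLE = """<codeBook>
<fileDscr ID='card'
 URI='www.icpsr.umich.edu/cgi-bin/archive.prl?path=ICPSR&amp;num=7728'/>
<fileDscr ID='lrecl'
 URI='www.icpsr.umich.edu/cgi-bin/archive.prl?path=ICPSR&amp;num=7728'/>
</codeBook>"""


def files_by_id(xml_text):
    """Index every fileDscr element by its ID attribute."""
    root = ET.fromstring(xml_text)
    return {f.get("ID"): f.get("URI") for f in root.iter("fileDscr")}


files = files_by_id(SAMPLE)
for file_id, uri in files.items():
    print(file_id, uri)
```

With such an index, a variable-level location that names a file ID can be matched to the physical file it refers to.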
- File Description
- <fileTxt> 3.1
- Description: Information about the data file.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
File Name,
Contents of File,
File Structure,
File Dimensions,
Type of File,
Data Format,
Place of File Production,
Extent of Processing Checks,
Processing Status,
Missing Data,
Software Used to Produce the File,
Version Statement
- File Name
- <fileName> 3.1.1
- Description: Contains a short title that will be used to
distinguish a particular file/part from other files/parts in the data
collection.
Example:
<fileName ID='File1'>Second-Generation Children Data </fileName>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within
the codebook.
- File Contents
- <fileCont> 3.1.2
- Description: Abstract or description of the file. A summary
describing the purpose, nature, and scope of the data file, special
characteristics of its contents, major subject areas covered, and what
questions the PIs attempted to answer when they created the file. A
listing of major variables in the file is important here. In the
case of multi-file collections, this uniquely describes the contents
of each file.
Example:
<fileCont>Part 1 contains both edited and
constructed variables describing demographic and family relationships,
income, disability, employment, health insurance status, and
utilization data for all of 1987. </fileCont>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within
the codebook.
- File Structure
- <fileStrc> 3.1.3
- Description: Type of file structure. Use the "type" attribute
to indicate hierarchical, rectangular, or relational (the default is
rectangular).
- Remarks: If the file is rectangular, skip to File Dimensions (3.1.4).
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type
- Contains Elements: Record Group, Notes
- Record or Record Group
- <recGrp> 3.1.3.1
- Description: Used to describe record groupings if the file is
hierarchical or relational. The attribute "recGrp" allows a record
group to indicate subsidiary record groups that nest underneath; this
allows for the encoding of a hierarchical structure of record groups.
The attribute "rectype" indicates the type
of record, e.g., "'A' records" or "Household records." "Keyvar" is an IDREF
that provides the link to other record types. In a hierarchical study
consisting of individual and household records, the "keyvar" on the
person record will indicate the household to which it belongs. The
"recidvar" is the unique ID of the record group itself.
Example:
<fileStrc type='hierarchical'>
<recGrp rectype='A'>CPS Person-Level Records</recGrp>
</fileStrc>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, recGrp, rectype,
keyvar, recidvar
- Contains Elements:
Record Label,
Record Dimensions
- Record Label
- <labl> 3.1.3.1.1 (Generic element A.2)
- Description: A more descriptive specification of record group.
A "level" attribute is included to
permit coding of the level to which the label applies, i.e., the study
level, the file level (if different from study), the record level,
the variable group level, or the variable level. A "vendor" attribute is
provided to allow for specification of different labels for use with
different vendors' software.
Example:
<fileStrc type='hierarchical'><recGrp rectype='A' keyvar='H-SEQ'
recidvar='PRECORD'>
<labl>Person (A) Record</labl></recGrp></fileStrc>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, level, vendor
- Contains: #PCDATA, Link to other element(s) within the codebook.
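To show how the record group attributes might be read by software, the following Python sketch parses the example above and collects the "rectype", "keyvar", and "recidvar" attributes together with the record label. The function name is hypothetical, and the fallback to "rectangular" mirrors the default stated for the "type" attribute of File Structure.

```python
import xml.etree.ElementTree as ET

# Record Label example from this entry, reproduced as-is.
EXAMPLE = """<fileStrc type='hierarchical'><recGrp rectype='A' keyvar='H-SEQ'
recidvar='PRECORD'>
<labl>Person (A) Record</labl></recGrp></fileStrc>"""


def record_groups(xml_text):
    """Return the file structure type and a summary of each record group."""
    strc = ET.fromstring(xml_text)
    groups = []
    for grp in strc.iter("recGrp"):
        groups.append({
            "rectype": grp.get("rectype"),
            "keyvar": grp.get("keyvar"),
            "recidvar": grp.get("recidvar"),
            # labl is optional, so fall back to an empty string.
            "label": (grp.findtext("labl") or "").strip(),
        })
    # "rectangular" is the documented default when "type" is absent.
    return strc.get("type", "rectangular"), groups


structure, groups = record_groups(EXAMPLE)
print(structure, groups)
```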
- Record Dimensions
- <recDimnsn> 3.1.3.1.2
- Description: Information about the physical characteristics
of the record. The "level" attribute on this element should be set to
"record."
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, level
- Contains Elements:
Variable Quantity (of Record),
Record Quantity (of Record),
Logical Record Length (of Record)
- Variable Quantity (of Record)
- <varQnty> 3.1.3.1.2.1
- Description: Number of variables on the record.
Example:
<recGrp><recDimnsn level='record'><varQnty>27</varQnty>
</recDimnsn></recGrp>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Record Quantity (of Record)
- <caseQnty> 3.1.3.1.2.2
- Description: Number of records of this type.
Example:
<recGrp><recDimnsn><caseQnty>1011</caseQnty>
</recDimnsn></recGrp>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Logical Record Length (of Record)
- <logRecL> 3.1.3.1.2.3
- Description: Logical record length of record, i.e., number of
characters of data in the record.
Example:
<recGrp><recDimnsn><logRecL>27</logRecL>
</recDimnsn></recGrp>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Notes
- <notes> 3.1.3.2 (Generic element A.4)
- Description: Indicate any additional
information regarding this record type. "Notes" sections appear in
several places in the DTD. The attributes for notes permit a
controlled vocabulary to be developed (type and subject), the level of
the DTD to which the note refers to be identified (study, file,
variable, etc.), and the author of the note to be indicated
(resp).
Example:
<notes>The number of arrest records for an individual is
dependent on the number of arrests an offender had.</notes>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other element(s) within
the codebook, reference to a table.
- File Dimensions
- <dimensns> 3.1.4
- Description: Dimensions of the overall file.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Overall Case Count,
Overall Variable Count,
Logical Record Length,
Records Per Case,
Total Number of Records
- Overall Case Count
- <caseQnty> 3.1.4.1
- Description: Number of cases or observations
in the entire file.
Remarks: To be used for rectangular files only.
Example:
<dimensns><caseQnty>205</caseQnty></dimensns>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Overall Variable Count
- <varQnty> 3.1.4.2
- Description: Number of variables in
the entire file.
Remarks: To be used for rectangular files only.
Example:
<dimensns><varQnty>88</varQnty></dimensns>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Logical Record Length
- <logRecL> 3.1.4.3
- Description: Logical record length of the file,
i.e., number of
characters.
Remarks:
To be used for rectangular files or if all records in a hierarchical file
are the same length.
Example:
<dimensns><logRecL>125</logRecL></dimensns>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Records per Case
- <recPrCas> 3.1.4.4
- Description: Records per case in the file.
Remarks: To be used
for card-image data or other files in which there are multiple records
per case.
Example:
<dimensns><recPrCas>5</recPrCas></dimensns>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Total Number of Records
- <recNumTot> 3.1.4.5
- Description: Overall record count in the
file.
Remarks: To be used in instances such as files with multiple cards/decks or
records per case.
Example:
<dimensns><recNumTot>2400</recNumTot></dimensns>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
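As a rough illustration of how the File Dimensions elements relate, the following Python sketch reads the counts and checks whether the total number of records equals cases multiplied by records per case. That relationship is an assumption that holds only when every case has the same number of records; the sample values and the helper names `read_dimensions` and `records_consistent` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical values chosen so that 480 cases x 5 records per case = 2400 records.
DIMENSNS = (
    "<dimensns><caseQnty>480</caseQnty><recPrCas>5</recPrCas>"
    "<recNumTot>2400</recNumTot></dimensns>"
)

def read_dimensions(xml_text):
    """Map each child element name to its integer content (assumes plain numeric text)."""
    root = ET.fromstring(xml_text)
    return {child.tag: int(child.text) for child in root}

def records_consistent(dims):
    """Return True/False if the three counts are present, otherwise None."""
    needed = ("caseQnty", "recPrCas", "recNumTot")
    if not all(key in dims for key in needed):
        return None
    return dims["caseQnty"] * dims["recPrCas"] == dims["recNumTot"]

dims = read_dimensions(DIMENSNS)
consistent = records_consistent(dims)
```

A check like this is only a convenience for the mark-up author; the DTD itself does not enforce any arithmetic relationship among these elements.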
- Type of File
- <fileType> 3.1.5
- Description: Types of data files include raw data
(ASCII, EBCDIC, etc.) and software-dependent files such as SAS datasets,
SPSS export files, etc. If the data are of mixed types (e.g., ASCII and
packed decimal), state that here.
The "charset" attribute allows one to specify the character set used in the
file, e.g., US-ASCII, EBCDIC, UNICODE UTF-8, etc.
Remarks: Note that the element Variable Format
(4.2.23) permits specification of the data format at the variable level.
Example:
<fileType charset='us-ascii'>ASCII data file</fileType>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, charset
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Data Format
- <format> 3.1.6
- Description: Physical format of the data file: Logical
record length format, card-image format (i.e., data with multiple records
per case), delimited format, free format, etc.
Example:
<format>comma-delimited</format>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Place of File Production
- <filePlac> 3.1.7
- Description: Indicate whether file was produced at an
archive or produced elsewhere.
Example:
<filePlac>Washington, DC: United States Department of Commerce, Bureau of the
Census</filePlac>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Extent of Processing Checks
- <dataChck> 3.1.8
- Description: Indicate here at the file level the types
of checks and operations performed on the data file. A controlled vocabulary
may be developed for this element in the future. The following examples
are based on ICPSR's Extent of Processing scheme:
Examples:
<dataChck>The archive produced a codebook for this collection.</dataChck>
<dataChck>Consistency checks were performed by Data Producer/Principal
Investigator.</dataChck>
<dataChck>Consistency checks performed by the archive.</dataChck>
<dataChck>The archive generated SAS and/or SPSS data definition
statements for this collection.</dataChck>
<dataChck>Frequencies were provided by Data Producer/Principal Investigator.</dataChck>
<dataChck>Frequencies provided by the archive.</dataChck>
<dataChck>Missing data codes were standardized by Data
Producer/Principal Investigator.</dataChck>
<dataChck>Missing data codes were standardized by the archive.</dataChck>
<dataChck>The archive performed recodes and/or calculated derived variables.
</dataChck>
<dataChck>Data were reformatted by the archive.</dataChck>
<dataChck>Checks for undocumented codes were performed by
Data Producer/Principal Investigator.</dataChck>
<dataChck>Checks for undocumented codes were performed by the archive.</dataChck>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Processing Status
- <ProcStat> 3.1.9
- Description: Processing status of the file.
Some data producers and social science data
archives employ data processing strategies that provide for release of data
and documentation at various stages of processing.
Examples:
<ProcStat>Available from the DDA. Being processed.</ProcStat>
<ProcStat>The principal investigator notes that the data in Public Use Tape 5 are released prior to
final cleaning and editing, in order to provide prompt access to the NMES data by the research and policy
community.</ProcStat>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Missing Data
- <dataMsng> 3.1.10
- Description: This element can be used to give general
information about missing data, e.g., that missing data have been
standardized across the collection, missing data are present because of
merging, etc.
Examples: <dataMsng>Missing data are represented by
blanks.</dataMsng>
<dataMsng>The codes "-1" and "-2"
are used to represent missing data.</dataMsng>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Software Used to Produce the File
- <software> 3.1.11 (Generic element A.6.3.5)
- Description: Software that created the file. A
"version" attribute permits specification of the software version
number. The "date" attribute is provided to enable specification of
the date (if any) of the software release. The ISO
standard for dates (YYYY-MM-DD) is recommended for use with the date
attribute.
Example:
<software version='6.12'>The SAS transport file
was generated by the SAS CPORT procedure.</software>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, date, version
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Version (of File) Statement
- <verStmt> 3.1.12 (Generic element A.6.6)
- Description: Version statement for the data file, if one of a
multi-file collection.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
Version,
Version Responsibility Statement,
Notes
- Version
- <version> 3.1.12.1 (Generic element A.6.6.1)
- Description: Also known as release or edition.
If there have been substantive changes in the file since its creation,
this statement should be used. The ISO standard for dates
(YYYY-MM-DD) is recommended for use with the date attribute.
Example:
<version type='edition' date='1999-02-05'>First ICPSR Edition</version>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, date
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Version Responsibility Statement
- <verResp> 3.1.12.2 (Generic element A.6.6.2)
- Description: Used to indicate the
organization or person responsible for the version of the file.
Example:
<verResp>Inter-university Consortium for Political and Social
Research</verResp>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, affiliation
- Contains: #PCDATA, Link to other element(s) within the codebook. (Or alternatively, all of the elements available under the general responsibility statement above.)
- Notes
- <notes> 3.1.12.3 (Generic element A.4)
- Description: Used to indicate additional information
regarding the version or version responsibility statement, in particular to
indicate what makes a new version different from its predecessor.
"Notes" sections appear in several places in
the DTD. The attributes for notes permit a controlled vocabulary to be
developed (type and subject), the level of the DTD to which the note
refers to be identified (study, file, variable, etc.), and the author
of the note to be indicated (resp).
Example:
<notes>Data for all previously-embargoed variables are now available in
this version of the file.</notes>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other
element(s) within the codebook, reference to a table.
- Notes
- <notes> 3.2 (Generic element A.4)
- Description: Additional information about the data file not
covered in other elements. "Notes" sections appear in several places in
the DTD. The attributes for notes permit a controlled vocabulary to be
developed (type and subject), the level of the DTD to which the note
refers to be identified (study, file, variable, etc.), and the author
of the note to be indicated (resp).
Example:
<notes>There is a restricted version of this file containing confidential information,
access to which is controlled by the principal investigator.</notes>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other element(s) within
the codebook, reference to a table.
Variables Description
Document
|
|---Document Description
|---Study Description
|---Data Files Description
|---VARIABLES DESCRIPTION
|---Other Study-Related Materials
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements:
- Variable Group
- Variable
- Variable Group
- <varGrp> 4.1
- Description: A group of variables that may share a common subject,
arise from the interpretation of a single question, or are linked by some
other factor.
- The "type" attribute refers to the general type of
grouping of the variables, e.g., subject, multiple response.
- The "var" reference is used to indicate all the
constituent variable IDs in the group.
- The "varGrp" reference is used to
indicate all the subsidiary variable groups which nest underneath the
current varGrp. This allows for encoding of a hierarchical structure of
variable groups.
- The "name" is the unique ID for the group.
- The "sdatrefs" are summary data description references that
record the ID values of all elements within the summary data description
section of the Study Description that might apply to the group. These elements
include: time period covered, date of collection, nation or country,
geographic coverage, geographic unit, unit of analysis, universe, and kind
of data.
- The "methrefs" are methodology and processing references
which record the ID values of all elements within the study methodology and
processing section of the Study Description which might apply to the
group. These elements include information on data collection and data
appraisal (e.g., sampling, sources, weighting, data cleaning, response rates,
and sampling error estimates).
- The "pubrefs" attribute provides a link to
publication/citation references and records the ID values of all citation
elements within Section 2.5 or Section 5.0 that pertain to this variable group.
- "Access" records the ID values of all elements in Section 2.4
of the document that describe access conditions for this variable group.
- Remarks: Variable groups are created this way
in order to permit variables to belong to multiple groups, including
multiple subject groups such as a group of variables on sex and
income, or to a subject and a multiple response group, without causing
overlapping groups. Variables that are linked by use of the same
question need not be identified by a Variable Group element because
they are linked by a common unique question identifier in the Variable
element. Note that as a result of the strict sequencing required by XML,
all Variable Groups must be marked up before the Variable element is opened.
That is, the mark-up author cannot mark up a Variable Group, then mark up
its constituent variables, then mark up another Variable Group.
Specific variable groups, included within the 'type' attribute, are:
- Section: Questions which derive from the same section of the questionnaire,
e.g., all variables located in Section C.
- Multiple response: Questions where the respondent has the opportunity
to select more than one answer from a variety of choices, e.g., what
newspapers have you read in the past month (with the respondent able to
select up to five choices).
- Grid: Sub-questions of an introductory or main question but which do
not constitute a multiple response group, e.g., I am going to read you some
events in the news lately and you tell me for each one whether you are very
interested in the event, fairly interested in the event, or not interested
in the event.
- Display: Questions which appear on the same interview screen (CAI)
together or are presented to the interviewer or respondent as a group.
- Repetition: The same variable (or group of variables) that is
repeated for different groups of respondents or for the same respondent
at a different time.
- Subject: Questions which address a common topic or subject, e.g.,
income, poverty, children.
- Version: Variables, often appearing in pairs, which represent different
aspects of the same question, e.g., pairs of variables (or groups) which are
adjusted/unadjusted for inflation, season, or similar factors, pairs of variables
with/without missing data imputed, and versions of the same basic question.
- Iteration: Questions that appear in different sections of the data
file measuring a common subject in different ways, e.g., a set of
variables which report the progression of respondent income over the life
course.
- Analysis: Variables combined into the same index, e.g., the
components of a calculation, such as the numerator and the denominator of an
economic statistic.
- Pragmatic: A variable group without shared properties.
- Record: Variables from a single record in a hierarchical file.
- File: Variables from a single file in a multifile study.
- Randomized: Variables generated by CAI surveys produced by one or more
random number variables together with a response variable, e.g., random
variable X which could equal 1 or 2 (at random) which in turn would control
whether Q.23 is worded "men" or "women", e.g., would you favor helping
[men/women] laid off from a factory obtain training for a new job?
- Other: Variables which do not fit easily into any of the categories
listed above, e.g., a group of variables whose documentation is in
another language.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, type, var, varGrp, name, sdatrefs,
methrefs, pubrefs, access
- Contains Elements: Variable Group Label,
Variable Group Text,
Variable Group Definition,
Variable Group Universe,
Variable Group Notes
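The way the "var" reference lets one variable belong to several groups can be sketched in Python as follows. This is a minimal illustration, not part of the DTD: the wrapper element name `dataDscr`, the IDs, the labels, and the helper `groups_by_variable` are all assumptions made for the example, and the "var" attribute is assumed to hold a space-separated list of variable IDs.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: all variable groups appear before the first variable,
# in keeping with the sequencing noted in the Remarks above.
DATA_DSCR = """
<dataDscr>
  <varGrp ID='VG1' type='subject' var='V1 V2'><labl>Demographics</labl></varGrp>
  <varGrp ID='VG2' type='subject' var='V2 V3'><labl>Income</labl></varGrp>
  <var ID='V1' name='SEX'/>
  <var ID='V2' name='HHINC'/>
  <var ID='V3' name='WAGES'/>
</dataDscr>
"""

def groups_by_variable(xml_text):
    """Map each referenced variable ID to the labels of the groups that list it."""
    root = ET.fromstring(xml_text)
    membership = {}
    for grp in root.findall("varGrp"):
        # The "var" attribute holds the IDs of all constituent variables.
        for var_id in (grp.get("var") or "").split():
            membership.setdefault(var_id, []).append(grp.findtext("labl"))
    return membership

membership = groups_by_variable(DATA_DSCR)
```

In this sketch variable V2 is a member of both groups, yet neither group element overlaps the other in the markup, which is the point of referencing variables by ID rather than nesting them.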
- Variable Group Label
- <labl> 4.1.1 (Generic element A.2)
- Description: A short description of the
variable group. A "level" attribute is included to permit coding of
the level to which the label applies, i.e., the study level, the file
level (if different from study), the record group, the variable group,
or the variable level. A "vendor" attribute is provided to allow for
specification of different labels for use with different vendors'
software.
- Examples:
<varGrp><labl>Study Procedure
Information</labl></varGrp>
<varGrp><labl>Political Involvement
and National Goals</labl></varGrp>
<varGrp><labl level='record'>Household Variable Section
</labl></varGrp>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, level, vendor
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Variable Group Text
- <txt> 4.1.2 (Generic element A.3)
- Description: Lengthier description of variable group.
A "level" attribute is included to
permit coding of the level to which the text applies, i.e., the study
level, the file level (if different from study), the record group,
the variable group, or the variable level.
- Example:
<varGrp type='subject'><txt>The following five variables refer
to respondent attitudes toward national environmental
policies: air pollution, urban sprawl, noise abatement,
carbon dioxide emissions, and nuclear waste.</txt></varGrp>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, level
- Contains: #PCDATA, Link to other element(s) within the codebook,
reference to a table. (Optional
references to individual variable IDs within the text.)
- Variable Group Definition
- <defntn> 4.1.3
- Description: Rationale for why the variables are grouped in
this way.
- Example:
<varGrp><defntn>The following eight variables
were only asked in Ghana.</defntn></varGrp>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Variable Group Universe
- <universe> 4.1.4 (Reference element 2.2.3.7)
- Description: The group of persons or other elements that are the
object of the variable group and to which any analytic results refer.
Age, nationality, and residence commonly help to delineate a given
universe, but any of a number of factors may be involved, such as sex,
race, income, veteran status, criminal convictions, etc. The universe
may consist of elements other than persons, such as housing units,
court cases, deaths, countries, etc. In general, it should be possible
to tell from the description of the universe whether a given
individual or element (hypothetical or real) is a member of the
population under study. A "level" attribute is included to
permit coding of the level to which universe applies, i.e., the study
level, the file level (if different from study), the record group, the
variable group, or the variable level.
The "clusion" attribute provides for specification of groups included (I) in
or excluded (E) from the universe.
- Remarks: If all the variables described in the data documentation relate
to the same population, e.g., the same set of survey respondents, this
element and its complement at the variable level (Variable Universe
4.2.12) would be unnecessary. In this case, universe can be fully described
at the level of the study (2.2.3.7).
- Examples:
<varGrp><universe clusion='I'>Individuals 15-19 years of age.
</universe></varGrp>
<varGrp><universe clusion='E'>Individuals younger than 15 and
older than 19 years of age.</universe></varGrp>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, level, clusion
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Variable Group Notes
- <notes> 4.1.5 (Generic element A.4)
- Description: Used to indicate additional information about the
variable group. "Notes" sections appear in several places in
the DTD. The attributes for notes permit a controlled vocabulary to be
developed (type and subject), the level of the DTD to which the note
refers to be identified (study, file, variable, etc.), and the author
of the note to be indicated (resp).
- Examples:
<varGrp><notes>This variable group was created for the purpose of
combining all derived variables.</notes></varGrp>
<varGrp><notes source='archive' resp='John Data'>This variable
group and all other variable groups in this data file were organized
according to a schema developed by the ad hoc advisory committee.
</notes></varGrp>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other element(s) within the codebook,
reference to a table.
- Variable
- <var> 4.2
- Description: This element describes all of the features of a single
variable in a social science data file. This element includes
the following attributes:
- The attribute "name" is a unique ID for
the variable. Following the rules of many statistical analysis systems
such as SAS and SPSS, names are usually up to eight
characters long.
- "Wgt" indicates whether the variable is a weight.
- "Wgt-var" is a reference to the weight variable for this variable.
- "Qstn" is a reference to the question ID for the variable.
- "Files" is the IDREF identifying the file(s) to which the variable belongs.
- "Vendor" is the origin of the proprietary format and includes SAS,
SPSS, ANSI, and ISO.
- "Dcml" refers to the number of decimal places in the variable.
- "Intrvl" (interval) type options are discrete or continuous.
- "Rectype" refers to the record type to which the variable belongs.
- The "sdatrefs" are summary data description references which
record the ID values of all elements within the summary data
description section of the Study Description
which might apply to the variable. These elements
include: time period covered, date of collection, nation or
country, geographic coverage, geographic unit, unit of
analysis, universe, and kind of data.
- The "methrefs" are methodology and processing references
which record the ID values of all elements within the study
methodology and processing section of the Study Description
which might apply to the
variable. These elements include information on data collection
and data appraisal (e.g., sampling, sources, weighting, data
cleaning, response rates, and sampling error estimates).
- The "pubrefs" attribute provides a link to publication/citation references
and records the ID values of all citation elements within Section 2.5
or Section 5.0 that pertain to this variable.
- "Access" records the ID values of all elements in Section 2.4
of the document that describe access conditions for this variable.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, name, wgt, wgt-var, qstn, files,
vendor, dcml, intrvl, rectype, sdatrefs, methrefs, pubrefs, access
- Contains Elements: Location,
Label,
Imputation,
Security,
Embargo,
Response Unit,
Analysis Unit,
Question,
Range of Valid Data Values,
Range of Invalid Data Values,
Undocumented Codes,
Universe,
Total Responses,
Summary Statistics,
Variable Text,
Standard Categories,
Category Group,
Category,
Coding Instructions,
Version (of Variable) Statement,
Concept,
Derivation,
Variable Format,
Notes
- Location
- <location> 4.2.1
- Description: This is an empty element
containing only the attributes listed below. Attributes include
"StartPos" (starting position of variable), "EndPos" (ending position
of variable), "width" (number of columns the variable occupies),
"RecSegNo" (the record segment number, deck or card number the
variable is located on), and "fileid" (an IDREF link to the fileDscr
element for the file that this location is within).
Remarks: The fileid is necessary
in cases where the same variable may be coded in two different files,
e.g., a logical record length type file and a card image type file.
Note that if there is no width or
ending position, then the starting position should be the ordinal
position in the file, and the file would be described as
free-format.
- Examples:
<var><location StartPos='55'
EndPos='57' width='3' RecSegNo='1' fileid='CARD-IMAGE'
></location></var>
<var files='File2'><location StartPos='25'
EndPos='25' width='1' RecSegNo='A'></location></var>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, StartPos, EndPos, width, RecSegNo,
fileid
- Empty element.
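The location attributes can be used to pull a variable out of a fixed-width record, as in the Python sketch below. It assumes that "StartPos" and "EndPos" are 1-based, inclusive column positions, which is a common convention but is not stated explicitly here; the sample record and the helper name `column_slice` are hypothetical.

```python
import xml.etree.ElementTree as ET

LOCATION = (
    "<var><location StartPos='55' EndPos='57' width='3' RecSegNo='1'>"
    "</location></var>"
)

def column_slice(var_xml):
    """Convert StartPos/EndPos (assumed 1-based, inclusive) to a Python slice."""
    loc = ET.fromstring(var_xml).find("location")
    start = int(loc.get("StartPos"))
    end = int(loc.get("EndPos"))
    width = int(loc.get("width"))
    # Under the assumed convention, width should equal EndPos - StartPos + 1.
    if end - start + 1 != width:
        raise ValueError("width does not match StartPos/EndPos")
    return slice(start - 1, end)

# Hypothetical 80-column record with the value "042" in columns 55-57.
record = " " * 54 + "042" + " " * 23
value = record[column_slice(LOCATION)]
```

The width check is a design choice for the sketch: since the three attributes are redundant when all are present, a mismatch usually signals a mark-up error worth catching early.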
- Variable Label
- <labl> 4.2.2 (Generic element A.2)
- Description: A descriptive phrase which defines the variable.
The length of this phrase may depend on the statistical analysis
system used (e.g., some versions of SAS permit 40-character labels,
while some versions of SPSS permit 120 characters). A "level" attribute
is included to permit coding of the level to which label applies,
i.e., the study level, the file level (if different from study), the
record group, the variable group, or the variable level. A "vendor" attribute
is provided to allow for specification of different labels for use with
different vendors' software.
- Remarks: Whenever possible, this element should be used instead of
4.2.15 (Variable Text, 'txt' ) in order to facilitate the creation of
statistical analysis software labels.
- Example:
<var><labl>Why No Holiday-No Money</labl></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, level, vendor
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Imputation
- <imputation> 4.2.3
- Description: According to the
Statistical Terminology glossary maintained by the National Science
Foundation, this is "the process by which one estimates missing values
for items that a survey respondent failed to provide," and if
applicable in this context, it refers to the type of procedure used.
- Example:
<var><imputation>This variable contains values that
were derived by substitution.</imputation></var>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Security
- <security> 4.2.4
- Description: Provides information regarding levels of access
to the variable, e.g., public, subscriber, need to know. The ISO standard for dates
(YYYY-MM-DD) is recommended for use with the date attribute.
- Example:
<var><security date='1998-05-10'> This variable has
been recoded for reasons
of confidentiality. Users should contact the archive for
information on obtaining access.</security></var>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, date
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Embargo
- <embargo> 4.2.5
- Description: Provides information on variables which are
not currently available because of policies established by the
principal investigators and/or data producers. The ISO standard for
dates (YYYY-MM-DD) is recommended for use with the date attribute. An
"event" attribute is provided to specify "notBefore" or "notAfter"
("notBefore" is the default). A "format" attribute is provided to
ensure that this information will be machine-processable and specifies
a format for the embargo element.
The format attribute could be used to specify other
conventions for the way that information within the embargo element is
set out, if there were agreed-upon, commonly used conventions for
encoding embargo information created in the future.
- Example:
<var><embargo event='notBefore'
date='2001-09-30'>The data associated with this variable will not
become available until September 30, 2001, because of embargo
provisions established by the data producers.
</embargo></var>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, date, event, format
- Contains: #PCDATA, Link to other element(s) within the codebook.
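One way the "date" and "event" attributes of the embargo element might be machine-processed is sketched below in Python. The interpretation is an assumption: "notBefore" is read as "not available before the date" and "notAfter" as "not available after the date". The helper name `is_available` is hypothetical.

```python
from datetime import date
import xml.etree.ElementTree as ET

EMBARGO = (
    "<var><embargo event='notBefore' date='2001-09-30'>The data associated "
    "with this variable will not become available until September 30, 2001."
    "</embargo></var>"
)

def is_available(var_xml, today):
    """Return True if the variable is outside its embargo on the given day."""
    emb = ET.fromstring(var_xml).find("embargo")
    if emb is None:
        return True  # no embargo recorded for this variable
    limit = date.fromisoformat(emb.get("date"))  # ISO format YYYY-MM-DD
    # ElementTree does not apply DTD defaults, so "notBefore" is supplied here.
    event = emb.get("event", "notBefore")
    if event == "notBefore":
        return today >= limit
    return today <= limit  # assumed reading of "notAfter"

before = is_available(EMBARGO, date(2001, 9, 29))
on_day = is_available(EMBARGO, date(2001, 9, 30))
```

Using the recommended ISO date format is what makes this kind of check straightforward; free-text dates in the attribute would need additional parsing.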
- Response Unit
- <respUnit> 4.2.6
- Description: Provides information regarding who provided
the information contained within the variable, e.g., respondent,
proxy, interviewer.
- Example:
<var><respUnit> Respondent
</respUnit></var>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Analysis Unit
- <anlysUnt> 4.2.7
- Description: Provides information regarding whom or what
the variable describes.
- Example:
<var><anlysUnt> This variable reports election returns at the
constituency level.
</anlysUnt></var>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Question
- <qstn> 4.2.8
- Description: The question element may have
mixed content. The element itself may contain
text for the question, with the subelements being used to provide further
information about the question. Alternatively, the question element may be
empty and only the subelements used. The element has a unique question ID
attribute which can be used to link a variable with other variables
where the same question has been asked. This would allow searching
for all variables that share the same question ID perhaps because the
question was asked several times in a panel design.
The attributes for this element include:
- a "qstn" ID, a unique identifier for the question
- "Var", a reference to IDs of all variables relating to question
- "seqNo", the sequence number of the question, and
- "sdatrefs", summary data description references which
record the ID values of all elements within the summary data
description section of the Study Description
which might apply to the question. These elements
include: time period covered, date of collection, nation or
country, geographic coverage, geographic unit, unit of
analysis, universe, and kind of data.
- Example:
<var><qstn ID='Q125'>When you get together with your
friends, would you say you discuss political matters
frequently, occasionally, or never?</qstn></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, qstn, var, seqNo, sdatrefs
- May include mixed #PCDATA content, Link to other element(s)
within the codebook.
- Contains Elements: Pre-Question Text,
Literal Question,
Post-Question Text,
Forward Progression,
Back Flow,
Interviewer Instructions
- Pre-Question Text
- <preQTxt> 4.2.8.1
- Description: Text describing a set of conditions under which
a question might be asked.
- Example:
<var><qstn><preQTxt>For those who did not go
away on a holiday of four days or more in 1985...
</preQTxt></qstn></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Literal Question
- <qstnLit> 4.2.8.2
- Description: Text of the actual, literal question asked.
- Example:
<var><qstn><qstnLit>Why didn't you go away in
1985?</qstnLit></qstn></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Post-Question Text
- <postQTxt> 4.2.8.3
- Description: Text describing what occurs after the literal
question has been asked.
- Example:
<var><qstn><postQTxt>The next set of questions
will ask about your financial situation.</postQTxt>
</qstn></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Forward Progression
- <forward> 4.2.8.4
- Description: Contains a reference to
IDs of possible following questions. The "qstn" IDREF may be used to
specify the IDs.
- Example:
<var><qstn><forward qstn='Q120 Q121 Q122 Q123 Q124'>
If yes, please ask
questions 120-124.</forward></qstn></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, qstn
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Back Flow
- <backward> 4.2.8.5
- Description: Contains a reference to
IDs of possible preceding questions. The "qstn" IDREF may be used to
specify the IDs.
- Examples:
<var><qstn><backward qstn='Q12 Q13 Q14 Q15'>For responses
on a similar topic, see questions 12-15.</backward></qstn>
</var>
<var><qstn><backward qstn='Q143'>
</backward></qstn>
</var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, qstn
- Contains: #PCDATA, Link to other element(s) within the codebook.
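The "qstn" attribute on Forward Progression and Back Flow holds a whitespace-separated list of question IDs. The following is a minimal sketch, not part of the DTD, of how a processing script might read those lists with Python's standard library. The fragment is modeled on the examples above; the ID 'Q119' and the helper name question_refs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the <forward> and <backward> examples above.
# The ID 'Q119' is invented for illustration.
FRAGMENT = """
<var>
  <qstn ID='Q119'>
    <forward qstn='Q120 Q121 Q122 Q123 Q124'>If yes, please ask questions 120-124.</forward>
    <backward qstn='Q12 Q13 Q14 Q15'>For responses on a similar topic, see questions 12-15.</backward>
  </qstn>
</var>
"""


def question_refs(qstn, tag):
    """Collect the IDs named in the 'qstn' attribute of every <tag> child."""
    ids = []
    for child in qstn.findall(tag):
        # An IDREFS value is a whitespace-separated list of IDs.
        ids.extend(child.get("qstn", "").split())
    return ids


qstn = ET.fromstring(FRAGMENT.strip()).find("qstn")
following = question_refs(qstn, "forward")
preceding = question_refs(qstn, "backward")
print("Possible following questions:", following)
print("Possible preceding questions:", preceding)
```

Because both elements are repeatable, the helper gathers IDs from every occurrence rather than only the first.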
- Interviewer Instructions
- <ivuInstr> 4.2.8.6
- Description: Specific instructions to the individual conducting an
interview.
- Example:
<var><qstn><ivuInstr> Please prompt the
respondent if they are reticent to answer this question.
</ivuInstr></qstn></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Range of Valid Data Values
- <valrng> 4.2.9
- Description: Values for a particular
variable that represent legitimate responses.
- Example:
<valrng>
<range UNITS='INT' maxExclusive='95' min='05' max='80'>
</range>
<key>
05 (PSU) Parti Socialiste Unifie et extreme gauche (Lutte Ouvriere)
[United Socialists and extreme left (Workers Struggle)]
50 Les Verts [Green Party]
80 (FN) Front National et extreme droite [National Front and extreme right]
95 Would vote blank
</key>
</valrng>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements: Variable Range,
Variable Item,
Range Key,
Notes
- Variable Range
- <range> 4.2.9 (Generic element A.8)
- Description: This is the actual
range. The "UNITS" attribute of Range permits the specification of
integer/real numbers. The "min" and "max" attributes specify values
which are considered part of the range. The "minExclusive" and "maxExclusive"
attributes specify values which are not considered part of the range.
For example, x < 1 or 10 <= x < 20
would be expressed as <range maxExclusive='1' /><range
min='10' maxExclusive='20' />. This is an empty element consisting
only of its attributes.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, UNITS, min, minExclusive, max,
maxExclusive
- Empty element.
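The inclusive and exclusive bounds described above can be checked mechanically. The sketch below is one possible reading, not a normative algorithm: it assumes that all bounds present on a single Variable Range must hold together, and that a value is accepted if it falls in any one of the Range elements. The function names in_range and is_valid are hypothetical, and Variable Item elements are ignored for brevity.

```python
import xml.etree.ElementTree as ET


def in_range(value, rng):
    """Return True if value satisfies every bound present on a <range> element."""
    checks = {
        "min": lambda v, b: v >= b,           # inclusive lower bound
        "minExclusive": lambda v, b: v > b,   # exclusive lower bound
        "max": lambda v, b: v <= b,           # inclusive upper bound
        "maxExclusive": lambda v, b: v < b,   # exclusive upper bound
    }
    for name, satisfied in checks.items():
        raw = rng.get(name)
        if raw is not None and not satisfied(value, float(raw)):
            return False
    return True


def is_valid(value, container):
    """Accept a value that falls in any <range> child of the container."""
    return any(in_range(value, r) for r in container.findall("range"))


# The two ranges from the description above: x < 1 or 10 <= x < 20.
valrng = ET.fromstring(
    "<valrng><range maxExclusive='1'/><range min='10' maxExclusive='20'/></valrng>"
)

for candidate in (0, 1, 10, 15.5, 20):
    print(candidate, is_valid(candidate, valrng))
```

The same helper could be applied to a Range of Invalid Data Values, since it uses the same Variable Range element.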
- Variable Item
- <item> 4.2.9 (Generic element A.9)
- Description: The counterpart to Range; used to encode individual
values. This is an empty element consisting only of its attributes.
The "UNITS" attribute of Range permits the specification of integer/real
numbers.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, UNITS, VALUE
- Empty element.
- Range Key
- <key> 4.2.9 (Generic element A.10)
- Description: This element permits a listing of the category
values and labels. While this information is coded separately in the
Category element, there may be some value in having this information
in proximity to the range of valid and invalid values.
A table is permissible in this element.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook,
reference to a table.
- Notes
- <notes> 4.2.9 (Generic element A.4)
- Description: Used to indicate additional information regarding
the variable range. "Notes" sections appear in
several places in the DTD. The attributes for notes permit a
controlled vocabulary to be developed (type and subject), the level of
the DTD to which the note refers to be identified (study, file,
variable, etc.), and the author of the note to be indicated
(resp).
- Example:
<valrng><notes subject='political party'>Starting with
Euro-Barometer 2 the coding of this variable
has been standardized following an approximate ordering of
each country's political parties along a "left" to "right"
continuum in the first digit of the codes. Parties coded
01-39 are generally considered on the "left", those coded
40-49 in the "center", and those coded 60-89 on the "right"
of the political spectrum. Parties coded 50-59 cannot be
readily located in the traditional meaning of "left" and
"right". The second digit of the codes is not significant
to the "left-right" ordering. Codes 90-99 contain the
response "other party" and various missing data responses.
Users may modify these codings or part of these codings
in order to suit their specific needs.
</notes></valrng>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other element(s) within the codebook,
reference to a table.
- Range of Invalid Data Values
- <invalrng> 4.2.10
- Description: Values for a particular variable
that represent missing data, not applicable responses, etc.
- Example:
<invalrng>
<range UNITS='INT' minExclusive='0' min='98' max='99'>
</range>
<key>
0 No answer
98 DK
99 Inappropriate
</key>
</invalrng>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements: Variable Range,
Variable Item,
Range Key,
Notes
- Variable Range
- <range> 4.2.10 (Generic element A.8)
- Description: This is the actual
range. The "UNITS" attribute of Range permits the specification of
integer/real numbers. For example, x < 1 or 10 <= x < 20
would be expressed as <range maxExclusive='1' /><range
min='10' maxExclusive='20' />.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, UNITS, min, minExclusive, max,
maxExclusive
- Empty element.
- Variable Item
- <item> 4.2.10 (Generic element A.9)
- Description: The counterpart to Range; used to encode individual
values. This is an empty element consisting only of its attributes.
The "UNITS" attribute of Range permits the specification of integer/real
numbers.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, UNITS, VALUE
- Empty element.
- Range Key
- <key> 4.2.10 (Generic element A.10)
- Description: This element permits a listing of the category
values and labels. While this information is coded separately in the
Category element, there may be some value in having this information
in proximity to the range of valid and invalid values.
A table is permissible in this element.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook,
reference to a table.
- Notes
- <notes> 4.2.10 (Generic element A.4)
- Description: Used to indicate additional information regarding
the variable range. "Notes" sections appear in
several places in the DTD. The attributes for notes permit a
controlled vocabulary to be developed (type and subject), the level of
the DTD to which the note refers to be identified (study, file,
variable, etc.), and the author of the note to be indicated
(resp).
- Example:
<invalrng><notes>Codes 90-99 contain the
response "other party" and various missing data responses.
</notes></invalrng>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other element(s) within the codebook,
reference to a table.
- Undocumented Codes
- <undocCod> 4.2.11
- Description: Values whose meaning is unknown.
- Example:
<var><undocCod>Responses for categories 9 and 10 are
unavailable.</undocCod></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Universe
- <universe> 4.2.12 (Reference element 2.2.3.7)
- Description: The group of persons or other elements that are the
object of the variable and to which any analytic results refer.
Age, nationality, and residence commonly help to delineate a given
universe, but any of a number of factors may be involved, such as sex,
race, income, veteran status, criminal convictions, etc. The universe
may consist of elements other than persons, such as housing units,
court cases, deaths, countries, etc. In general, it should be possible
to tell from the description of the universe whether a given
individual or element (hypothetical or real) is a member of the
population under study. A "level" attribute is included to
permit coding of the level to which universe applies, i.e., the study
level, the file level (if different from study), the record group,
the variable group, or the variable level.
The "clusion" attribute provides for specification of groups included (I) in
or excluded (E) from the universe.
- Examples:
<var><universe clusion='I'>Individuals
15-19 years of age.
</universe></var>
<var><universe clusion='E'>Individuals younger than 15 and
older than 19 years of age.</universe></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, level, clusion
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Total Responses
- <TotlResp> 4.2.13
- Description: The number of responses to this variable.
This element might be used if the number of responses does not match added case
counts. It may also be used to sum the frequencies for variable categories.
- Example:
<var><TotlResp>There are only 725 responses to this
question since it was not asked in Tanzania.</TotlResp></var>
- Optional
- Not repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Summary Statistics
- <sumStat> 4.2.14
- Description: One or more statistical
measures which describe the responses to a particular variable and may
include one or more standard summaries, e.g., minimum and
maximum values, etc. This element includes the following attributes:
- "wgtd" indicates whether the statistic is weighted or not.
- "weight" gives the name of the weight variable, if one is used.
- "type" (statistic type) can denote mean, median, mode, valid
cases, invalid cases, minimum, maximum, or standard deviation.
- Examples:
<var><sumStat type='min'>0</sumStat></var>
<var><sumStat type='max'>9</sumStat></var>
<var><sumStat type='median'>4</sumStat></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, wgtd, weight, type
- Contains: #PCDATA, Link to other element(s) within the codebook.
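As an illustration of how Summary Statistics elements might be produced from raw responses, the following sketch computes a few unweighted statistics and writes one element per statistic. The response values are invented, the helper name add_sum_stats is hypothetical, and the "type" value 'mean' is assumed by analogy with the 'min', 'max', and 'median' values used in the examples above.

```python
import statistics
import xml.etree.ElementTree as ET


def add_sum_stats(var, values):
    """Append unweighted min, max, mean, and median <sumStat> children to var."""
    stats = {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),      # 'mean' token is an assumption
        "median": statistics.median(values),
    }
    for stat_type, result in stats.items():
        element = ET.SubElement(var, "sumStat", {"type": stat_type})
        element.text = str(result)
    return var


# Hypothetical responses for a single variable.
var = add_sum_stats(ET.Element("var"), [0, 2, 4, 5, 9])
print(ET.tostring(var, encoding="unicode"))
```

The sketch leaves the "wgtd" and "weight" attributes unset; a weighted variant would need to set both.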
- Variable Text
- <txt> 4.2.15 (Generic element A.3)
- Description: An extended description, beyond that
provided in Variable Name and
Label, of the variable. A "level" attribute is included to
permit coding of the level to which the text applies, i.e., the study
level, the file level (if different from study), the record group, the
variable group, or the variable level.
- Example:
<var><txt>Support for European Economic Community
Index - constructed from Q. 246 and Q. 248.</txt></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, level
- Contains: #PCDATA, Link to other element(s) within the codebook,
reference to a table.
- Standard Categories
- <stdCatgry> 4.2.16
- Description: Standard category group used in a variable,
like industry codes,
employment codes, or social class codes. The attribute of "date" is
provided to indicate the version of the code in place at the time of the
study. The attribute of "URI" is provided to indicate a URN or URL that can
be used to obtain the electronic form of the category group.
- Example:
<var><stdCatgry date='1981' source='producer' >Census
of Population, Classified Index of Industries and Occupations
</stdCatgry></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, date, URI
- Contains: #PCDATA, Link to other element(s) within the codebook
- Category Group
- <catgryGrp> 4.2.17
- Description: A description of response categories that might
be grouped together. The attribute "missing" indicates whether
this category group
contains missing data or not. The attribute "missType" is used to specify
the type of missing data, e.g., inap., don't know, no answer, etc.
A controlled vocabulary for "missType" will be developed in the future.
The "catgry" attribute permits specification of constituent categories in
the group. The "catGrp" attribute is used to indicate all the
subsidiary category groups which may nest underneath the current
category group, thereby permitting the encoding of hierarchical structures
of category groups.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, missing, missType, catgry, catGrp
- Contains Elements: Category Group Label,
Category Group Text
- Category Group Label
- <labl> 4.2.17.1 (Generic element A.2)
- Description: A short description of the category group. A "level"
attribute is included to permit coding of the level to which the label
applies, i.e., the study level, the file level (if different from
study), the record group, the variable group, or the variable
level. A "vendor" attribute is provided to allow for specification of
different labels for use with different vendors' software.
- Example:
<var><catgryGrp missing='N' catgry='supervisors, farm workers;
farm workers; marine life cultivation workers; nursery workers; animal
caretakers, except farm; timber cutting and logging occupations; hunters
and trappers'
catGrp='Farm occupations, except managerial'
><labl>Other Agricultural and Related Occupations
</labl></catgryGrp></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, level, vendor
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Category Group Text
- <txt> 4.2.17.2 (Generic element A.3)
- Description: A fuller description of the category group.
A "level" attribute is included to
permit coding of the level to which the text applies, i.e., the study
level, the file level (if different from study), the record group,
the variable group, or the variable level.
- Example:
<var><catgryGrp><txt>When the respondent
indicated his political party preference, his response
was coded on a scale of 1-99 with parties with a left-wing
orientation coded on the low end of the scale and parties
with a right-wing orientation coded on the high end of
the scale. Categories 90-99 were reserved for
miscellaneous responses.</txt></catgryGrp></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, level
- Contains: #PCDATA, Link to other element(s) within the codebook,
reference to a table.
- Category
- <catgry> 4.2.18
- Description: A description of a particular response.
The attribute "missing" indicates whether this category
contains missing data or not. The attribute "missType" is used to specify
the type of missing data, e.g., inap., don't know, no answer, etc. A
controlled vocabulary for "missType" will be developed in the future.
The attribute "country" allows for the denotation of country-specific
category values. Users should employ the ISO 3166 standard for the designation
of country codes.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, missing, missType, country
- Contains Elements: Category Value,
Category Label,
Category Text,
Category Statistic
- Category Value
- <catValu> 4.2.18.1
- Description: The explicit response.
- Example:
<var><catgryGrp><catgry missing='Y'
missType='inap'><catValu>9
</catValu> </catgry></catgryGrp></var>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Category Label
- <labl> 4.2.18.2 (Generic element A.2)
- Description: A short description of the response.
A "level" attribute is included to
permit coding of the level to which the text applies, i.e., the study
level, the file level (if different from study), the
record group, the variable group,
or the variable level. A "vendor" attribute is
provided to allow for specification of different labels for use with
different vendors' software.
- Remarks: Whenever possible this element should be used instead of
4.2.18.3 (Category Text, 'txt' ) in order to facilitate the creation of
statistical analysis software labels.
- Examples:
<var><catgryGrp><catgry><labl>Better</labl>
</catgry></catgryGrp></var>
<var><catgryGrp><catgry><labl>About the
same</labl>
</catgry></catgryGrp></var>
<var><catgryGrp><catgry><labl>Inap.</labl>
</catgry></catgryGrp></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, level, vendor
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Category Text
- <txt> 4.2.18.3 (Generic element A.3)
- Description: A fuller description of the response or an elaboration on the
response. A "level" attribute is included to
permit coding of the level to which the text applies, i.e., the study
level, the file level (if different from study), the record group,
the variable group,
or the variable level.
- Example:
<var><catgryGrp><catgry><txt>Inap.,
question not asked in Ireland, Northern Ireland,
and Luxembourg.</txt></catgry></catgryGrp></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, level
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Category Statistic
- <catStat> 4.2.18.4
- Description: May include frequencies, percentages, or
crosstabulation results which define the category;
often appears in a table. The attribute "type" refers to "frequency",
"percent", or "crosstab". The URI attribute can be used to link to a
table.
- Example:
<var><catgryGrp><catgry><catStat type='freq'>256
</catStat></catgry></catgryGrp></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, type, URI
- Contains: #PCDATA, Link to other element(s) within the codebook,
reference to a table.
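The Category sub-elements above (Category Value, Category Label, Category Statistic) can be generated together from a list of coded responses. The following is a sketch under stated assumptions: the responses and labels are invented, the helper name build_categories is hypothetical, and each Category is appended to whatever parent element the caller passes in. The 'freq' value for "type" follows the Category Statistic example above.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def build_categories(parent, responses, labels):
    """Append one <catgry> per distinct response value, with its frequency."""
    counts = Counter(responses)
    for value in sorted(counts):
        catgry = ET.SubElement(parent, "catgry")
        ET.SubElement(catgry, "catValu").text = str(value)
        if value in labels:
            # A short label is preferred over a longer <txt> where possible.
            ET.SubElement(catgry, "labl").text = labels[value]
        stat = ET.SubElement(catgry, "catStat", {"type": "freq"})
        stat.text = str(counts[value])
    return parent


# Hypothetical coded responses and labels.
var = build_categories(
    ET.Element("var"),
    responses=[1, 1, 2, 9],
    labels={1: "Better", 2: "About the same", 9: "Inap."},
)
print(ET.tostring(var, encoding="unicode"))
```

A fuller version might also set the "missing" and "missType" attributes for codes such as 9.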
- Coding Instructions
- <codInstr> 4.2.19
- Description: Any special instructions to
those who converted information
from one form to another for a particular variable. This
might include the reordering of numeric information into
another form or the conversion of textual information into
numeric information.
- Example:
<var><codInstr>Use the standard classification
tables to present responses to the question: What is your
occupation? into numeric codes.</codInstr></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA or a table.
- Version (of Variable) Statement
- <verStmt> 4.2.20 (Generic element A.6.6)
- Description: Version statement for the variable, if it has
undergone changes.
- Optional
- Repeatable
- Attributes: ID, xml:lang, source
- Contains Elements: Version,
Version Responsibility Statement,
Notes
- Version
- <version> 4.2.20 (Generic element A.6.6.1)
- Description: Also known as release or edition.
If there have been substantive changes in the variable since its
creation, this statement should be used. The ISO
standard for dates (YYYY-MM-DD) is recommended for use with the date
attribute.
- Example:
<var><verStmt><version type='version'
date='1999-01-25'>Second version of V25</version></verStmt>
</var>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type (release, version, edition), date
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Version Responsibility Statement
- <verResp> 4.2.20 (Generic element A.6.6.2)
- Description: Used to indicate the
organization or person responsible for the version of the variable.
- Example:
<var><verStmt><verResp>Zentralarchiv
fuer Empirische Sozialforschung</verResp></verStmt></var>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, affiliation
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Notes
- <notes> 4.2.20 (Generic element A.4)
- Description: Used to indicate additional
information regarding the version or the version responsibility
statement, in particular to indicate what makes a new version of a
variable different from its predecessor. "Notes" sections appear in
several places in the DTD. The attributes for notes permit a
controlled vocabulary to be developed (type and subject), the level of
the DTD to which the note refers to be identified (study, file,
variable, etc.), and the author of the note to be indicated
(resp).
- Example:
<var><verStmt><notes>The labels for
categories 01 and 02 for this variable
were inadvertently switched in the first version of this variable and have
now been corrected.</notes></verStmt></var>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other element(s) within the codebook,
reference to a table.
- Concept
- <concept> 4.2.21
- Description: The general subject to which
this variable may be seen as pertaining.
This element serves the same purpose as the keywords and topic classification
elements, but at the variable level. The "vocab" attribute is provided to
indicate the controlled vocabulary, if any, used in the element, e.g.,
LCSH (Library of Congress Subject Headings), MeSH (Medical Subject Headings),
etc. The "vocabURI" attribute specifies the location for the
full controlled vocabulary.
- Remarks: The actual category reference should be included in the general
text.
- Examples:
<var><concept>Income</concept></var>
<var><concept vocab='LCSH' vocabURI=
'http://lcweb.loc.gov/catdir/cpso/lcco/lcco.html' source='archive'
>SF: 311-312 draft horses</concept></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, vocab, vocabURI
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Derivation
- <derivation> 4.2.22
- Description: Used only in the case of a derived
variable, this element provides
both a description of how the derivation was performed and the command used
to generate the derived variable, as well as a specification of the other
variables in the study used to generate the derivation. The "var" attribute
provides the ID values of the other variables in the study used to
generate this derived variable.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, var
- Contains Elements: Derivation Description,
Derivation Command
- Derivation Description
- <drvdesc> 4.2.22.1
- Description: A textual description of the way in which this
variable was derived to display to users.
- Example:
<var><derivation><drvdesc>
VAR215.01 "Outcome of first pregnancy" (1988 NSFG=VAR611 PREGOUT1)
If R has never been pregnant (VAR203 PREGNUM EQ 0) then OUTCOM01 is
blank/inapplicable.
Else, OUTCOM01 is transferred from VAR225 OUTCOME for R's 1st pregnancy.
</drvdesc></derivation></var>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Derivation Command
- <drvcmd> 4.2.22.2
- Description: The actual command used to generate the
derived variable. The
"syntax" attribute is used to indicate the command language employed (e.g.,
SPSS, SAS, Fortran, etc.)
- Example:
<var><derivation><drvcmd syntax='SPSS'
>RECODE V1 TO V3 (0=1) (1=0) (2=-1) INTO DEFENSE WELFARE HEALTH.
</drvcmd></derivation></var>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, syntax
- Contains: #PCDATA, Link to other element(s) within the codebook.
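Because the "var" attribute of Derivation lists the IDs of the source variables, a script can check that every listed ID matches a variable in the same document. The sketch below shows one way to do this. The wrapper element name dataDscr, the variable IDs, and the helper name unresolved_sources are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; <dataDscr> is assumed as the wrapper element and
# the IDs are invented. The command text is taken from the example above.
FRAGMENT = """
<dataDscr>
  <var ID='V1'/><var ID='V2'/><var ID='V3'/>
  <var ID='DEFENSE'>
    <derivation var='V1 V2 V3'>
      <drvcmd syntax='SPSS'>RECODE V1 TO V3 (0=1) (1=0) (2=-1) INTO DEFENSE WELFARE HEALTH.</drvcmd>
    </derivation>
  </var>
</dataDscr>
"""


def unresolved_sources(root):
    """Map each derived variable ID to the source IDs that match no <var>."""
    known = {v.get("ID") for v in root.iter("var")}
    missing = {}
    for var in root.iter("var"):
        derivation = var.find("derivation")
        if derivation is None:
            continue  # not a derived variable
        refs = derivation.get("var", "").split()
        missing[var.get("ID")] = [r for r in refs if r not in known]
    return missing


root = ET.fromstring(FRAGMENT.strip())
print(unresolved_sources(root))
```

An empty list for a derived variable means that all of its source variables were found.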
- Variable Format
- <varFormat> 4.2.23
- Description: The technical format of the variable in question.
Attributes for this element include: "type," which signifies if the variable
is character or numeric; "formatname," which in some cases may provide
the name of the particular, proprietary format actually used; "schema,"
which identifies the vendor or standards body which defined the format among
a list which includes SAS, SPSS, IBM, ANSI, ISO, XML-data or other;
"category," which describes what kind of data the format represents and
includes date, time, currency, or "other" conceptual possibilities; and
"URI," which supplies a network identifier for the format definition.
- Examples:
<var><varFormat type='numeric' schema='SAS' formatname='DATEw'
category='date'
>The number in this variable is stored in the form 'ddmmmyy' in SAS format.
</varFormat></var>
<var>
<varFormat type='numeric' formatname='date.iso8601' schema='XML-Data'
category='date' URI='http://www.w3.org/TR/1998/NOTE-XML-data/'>
19541022
</varFormat>
</var>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, formatname, schema, category, URI
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Notes
- <notes> 4.2.24 (Generic element A.4)
- Description: Used to indicate additional information
regarding the variable. "Notes" sections appear in
several places in the DTD. The attributes for notes permit a
controlled vocabulary to be developed (type and subject), the level of
the DTD to which the note refers to be identified (study, file,
variable, etc.), and the author of the note to be indicated
(resp).
- Example:
<var><notes>This variable was created by recoding location
of residence to Census regions.</notes></var>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other element(s) within the codebook,
reference to a table.
Other Study-Related Materials
Document
|
|---Document Description
|---Study Description
|---Data Files Description
|---Variables Description
|---OTHER STUDY-RELATED MATERIALS
- This section allows for the inclusion of other materials that are
related to the study as identified and labeled by the DTD users
(encoders). The materials may be entered as PCDATA (ASCII text)
directly into the document (through use of the "txt" element).
This section may also serve as a
"container" for other machine-readable materials such as data
definition statements by providing a brief description of the
study-related materials accompanied by the attributes "type" and
"level" defining the material further. The "URI" attribute may be used
to indicate the location of the other
study-related materials.
- Other Study-Related Materials may include: questionnaires, coding
notes, SPSS/SAS/STATA setups (and others), user manuals, continuity
guides, sample computer software programs, glossaries of terms,
interviewer/project instructions, maps, database schema, data
dictionaries, show cards, coding information, interview schedules,
missing values information, frequency files, variable maps, etc.
- Note that Section 2.5, Other Study Description Materials, should be
used for materials that are primarily descriptions of the content and
use of the study, such as appendices, sampling information, weighting
details, methodological and technical details, publications based upon
the study content, related studies or collections of studies, etc.
This section, 5.0 Other Study-Related Materials, is intended to
include or to link to materials used in the production of the study or useful in the
analysis of the study.
- Other Study-Related Materials
- <otherMat> 5.0 (Generic element A.1)
- Description: Other materials related to the study.
- Example:
<otherMat type='SAS data definition statements' level='study'
URI='http://www.icpsr.umich.edu'><labl>SAS Data Definition Statements for
ICPSR 6837</labl></otherMat>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, type, level, URI
- Contains Elements:
- Label
- Text
- Notes
- Table
- Citation
- Label
- <labl> 5.1 (Generic element A.2)
- Description: Short description of the
other material. A "level" attribute is included to permit coding of
the level to which the label applies, i.e., the study level, the file
level (if different from study), the record group, the variable group,
or the variable level. A "vendor" attribute is provided to allow for
specification of different labels for use with different vendors'
software.
- Example:
<otherMat type='SAS data definition statements' level='study'
URI='http://www.icpsr.umich.edu'><labl>SAS Data Definition Statements for
ICPSR 6837</labl></otherMat>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, level, vendor
- Contains: #PCDATA, Link to other element(s) within the codebook.
- Text
- <txt> 5.2 (Generic element A.3)
- Description: Lengthier description of other material.
A "level" attribute is included to
permit coding of the level to which the text applies, i.e., the study
level, the file level (if different from study), the record group,
the variable group,
or the variable level.
- Example:
<otherMat URI="http://www.icpsr.umich.edu/.."><txt>This is a PDF version of the original questionnaire
provided by the principal investigator.</txt></otherMat>
<otherMat><txt>Glossary of Terms. Below are terms that may
prove useful in working with the technical documentation for this study..
</txt></otherMat>
- Optional
- Repeatable
- Attributes: ID, xml:lang, source, level
- Contains: #PCDATA, Link to other element(s) within the codebook,
reference to a table.
- Notes
- <notes> 5.3 (Generic element A.4)
- Description: Used to indicate additional
information about the other material. "Notes" sections appear in several places in
the DTD. The attributes for notes permit a controlled vocabulary to be
developed (type and subject), the level of the DTD to which the note
refers to be identified (study, file, variable, etc.), and the author
of the note to be indicated (resp).
- Example:
<otherMat><txt>This is a PDF version of the original questionnaire
provided by the principal investigator.</txt>
<notes>Users should be aware that this questionnaire was modified
during the CAI process.</notes></otherMat>
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, type, subject, level, resp
- Contains: #PCDATA, Link to other element(s) within
the codebook, reference to a table.
- Table
- <table> 5.4
- Description: Tables may be inserted in Section 5. In XML editor software, the
table capability will be activated in element 5.0. Machine-readable
frequency tables, for example, could be appended to the DDI document in
this section.
- Citation
- <citation> 5.5 (Generic element A.6)
- Description: The citation for the other material.
This element encodes the bibliographic
information describing the other material, including title
information, statement of responsibility, production and distribution
information, series and version information, text of a preferred
bibliographic citation, and notes (if any). It uses generic element
A.6, found at the end of the DTD. A MARCURI attribute is provided to link
to the MARC record for this citation.
- Optional
- Not Repeatable
- Attributes: ID, xml:lang, source, MARCURI
- Contains Elements: The full tree for the citation
element is omitted for reasons of space. See Section 2.1, Citation of
Study.
- Other Study-Related Materials
- <otherMat> 5.6
- Description: Other materials related to the study. Note: This element (5.6) is
recursively defined, with the same structure as Other Study-Related Materials (5.0) above.
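Since Other Study-Related Materials may contain further Other Study-Related Materials, a reader of the document needs a recursive traversal to list everything. The following is a minimal sketch of such a traversal. The fragment, its "type" values, and the helper name walk_other_mat are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested fragment; the type values and labels are invented.
FRAGMENT = """
<otherMat type='questionnaire' level='study'>
  <labl>Original questionnaire</labl>
  <otherMat type='show cards' level='study'>
    <labl>Show cards used with the questionnaire</labl>
  </otherMat>
</otherMat>
"""


def walk_other_mat(node, depth=0):
    """Yield (depth, type, label) for an <otherMat> and each nested <otherMat>."""
    label = (node.findtext("labl") or "").strip()
    yield depth, node.get("type"), label
    for child in node.findall("otherMat"):
        yield from walk_other_mat(child, depth + 1)


rows = list(walk_other_mat(ET.fromstring(FRAGMENT.strip())))
for depth, mat_type, label in rows:
    print("  " * depth + f"[{mat_type}] {label}")
```

The depth value records how far each item is nested, which a display layer could use for indentation.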