The
first column is text from the actual DTD file. The second column
contains general explanation, with more details provided through the hyperlinks. |
<?xml
encoding="UTF-8"?>
|
All
XML documents should start with this line, the XML declaration.
It tells the client application (in this case, the RDL
Data Viewer), what type of document it is, and what version of XML.
"encoding=UTF-8"
means that this document uses the 8-bit Unicode Transformation Format
(of which ASCII is a subset). Different encodings have been developed
to permit documents to be created in different character sets
(such as Chinese). Currently, RDL supports only UTF-8.
|
<!--*****************
RDL1.DTD
Basic DTD for RDL Data Viewer
Version: Version 1.0.01
Author: Russell T. Davis
Last Update: Sunday, October 25, 1999
******************* -->
|
Comments
in XML begin with "<!--" and end with "-->"
|
<!--
The root element: a whole portfolio of data is an "RDLdoc"
--> |
|
<!ELEMENT
rdldoc (rdldoc_header, line_item_set)> |
An
"rdldoc" consists of two objects: a header and a collection
of data. The header contains overall document metadata
(who prepared it, when it was made, etc.). The "line item
set" roughly corresponds to the concept of a table in a
database. It is a collection of "line items" (rows).
"rdldoc"
is the root element. In the RDL document, all elements
descend from this in a hierarchical fashion. The structure
of this tree is what is being defined in this DTD. At
the first level, an rdldoc tree has two branches: rdldoc_header
and line_item_set.
|
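As a rough illustration of this two-branch tree, the following Python sketch parses a made-up document with the standard library. The element names come from the DTD; the sample values and the comma-separated layout of the data_x and data_y text are assumptions made only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal document. Element names follow the DTD;
# the values (and their comma-separated layout) are invented.
SAMPLE = """<rdldoc>
  <rdldoc_header/>
  <line_item_set>
    <data_x>1997, 1998, 1999</data_x>
    <line_item>
      <data_y>10, 12, 15</data_y>
    </line_item>
  </line_item_set>
</rdldoc>"""

root = ET.fromstring(SAMPLE)

# The first level of the tree: the two branches of the root element.
branches = [child.tag for child in root]
print(root.tag, branches)

# Each line_item is one "row" of the line_item_set.
rows = root.findall("./line_item_set/line_item")
print(len(rows))
```

Note that ElementTree checks only that the document is well-formed; it does not validate it against the DTD.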
<!--
rdlDOC_HEADER -->
<!--
Information about the rdldoc. An rdldoc consists of an rdldoc_header
and a line_item_set. All of the line items in the line_item_set
share a common data structure.
-->
|
|
<!ELEMENT
rdldoc_header (data_source?, formatting_source?, rdldoc_source?,
license_terms?, linkset?)> |
The
header has five optional branches, each describing some aspect
of the source. Note that even if you leave off all of these
sub-elements, you still have some header information contained
in the "attributes" of rdldoc_header. (See the ATTLIST below). |
<!ATTLIST
rdldoc_header
|
Every
element in a structured document can have two types of information
attached to it: sub-elements and attributes.
There are no hard-and-fast rules regarding whether to put information
in attributes or sub-elements; in general, RDL puts
commonly used, global metadata about the element in attributes,
and leaves distinct concepts to sub-elements.
|
<!ELEMENT
data_source (contact_info+)> |
|
<!ELEMENT
formatting_source (contact_info+)> |
|
<!ELEMENT
rdldoc_source (contact_info+)> |
|
<!ELEMENT
license_terms (contact_info?, linkset?)> |
|
<!ATTLIST
license_terms
|
|
<!ELEMENT
contact_info (#PCDATA)> |
This
element can be used by the target application to create an email
letter, or update a contact list, or populate a database of information sources.
The same structure is used for all contact info sub-elements of
the data_source, rdldoc_source, and formatting_source elements,
so the application that created the document only has to create
one structure for contact info.
|
<!ATTLIST
contact_info
|
|
<!ELEMENT
linkset (link*)> |
A
linkset is a collection of hyperlinks. These hyperlinks
may point to either HTML files or RDL files. The individual
link elements hold the actual links and their attributes (below). |
<!ATTLIST
linkset
|
These
attributes designate an HTML or RDL page where the
hyperlinks may be found. This is useful where you don't want
to list all of the hyperlinks in the data document itself. |
<!ELEMENT
link (#PCDATA) > |
The
text portion of this element (that is, whatever text appears
between the start and end tags) is optional.
If the title and href attributes are filled out, they contain
the basic information. If the title attribute is not present,
the text component of the link can be used. |
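A receiving application might apply this fallback rule roughly as follows. This is a sketch only: the function name link_label and the sample link elements are hypothetical, and the title and href attributes are used exactly as named in the explanation above.

```python
import xml.etree.ElementTree as ET


def link_label(link: ET.Element) -> str:
    """Return the title attribute if present, else the element's text."""
    title = link.get("title")
    if title:
        return title
    # Fall back to the text between the start and end tags.
    return (link.text or "").strip()


# Invented sample links for illustration.
with_title = ET.fromstring(
    '<link title="Annual report" href="http://www.e-numerate.com">ignored</link>'
)
text_only = ET.fromstring(
    '<link href="http://www.e-numerate.com">Quarterly data</link>'
)

print(link_label(with_title))  # Annual report
print(link_label(text_only))   # Quarterly data
```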
<!ATTLIST
link
|
Hyperlinks
in RDL follow the XLink standards. |
<!--
LINE_ITEM_SET -->
<!-- Information about the collection of line items --> |
|
<!ELEMENT
line_item_set (data_x, li_class_set?, linkset?,
line_item+) > |
A
line_item_set is a collection of line items. It corresponds
roughly to a "table" in a traditional database,
where the line items are rows. There is one data_x element in
a line item set; it corresponds roughly to a list of the field
names in a traditional database. |
<!ATTLIST
line_item_set
|
|
<!ELEMENT
data_x (#PCDATA) > |
|
<!ATTLIST
data_x
|
|
<!ELEMENT
li_class_set (li_class+)> |
|
<!ELEMENT
li_class (#PCDATA)> |
|
<!ATTLIST
li_class
|
|
<!--
LINE_ITEM -->
<!-- Information about the Line Item --> |
|
<!ELEMENT
line_item (data_x?, data_y, linkset?, note_set?) > |
|
<!ATTLIST
line_item
|
|
<!ELEMENT
data_y (#PCDATA)> |
|
<!ELEMENT
analysis (linkset?)> |
|
<!ELEMENT
note_set (note+)> |
|
<!ELEMENT
note (#PCDATA)> |
|
<!ATTLIST
note
note_type CDATA #IMPLIED > |
|
XML |
eXtensible
Markup Language ("XML").
|
RDL |
RDL
is a fully compliant implementation of a markup language that
conforms to the XML version 1.0 specification.
|
DTD |
Document
Type Definition. A DTD is a text file which provides a "template"
for the structure of XML documents (of which RDL is a
type). The DTD document specifies the structure of the target
XML document by defining elements and their relationship to
each other.
An element is delimited by tags written inside "<" and ">"
angle brackets. The first word inside the angle brackets
is the element name. An element begins with a start tag
and ends with an end tag. The start tag carries the name
and may carry several attributes (e.g., Color="blue").
The end tag has a "/" character before the name (e.g.,
"</bold>"). In between the tags there
is usually some form of text. In HTML, this is the text that shows
up on your screen in a browser.
|
encoding="UTF-8" |
Designates
the text encoding.
|
rdldoc_ID |
A
unique identifier for this document. In almost all cases, this
should be the fully qualified filename or URL for this file.
(You can leave off the protocol. That is, the rdldoc_ID can
be "www.e-numerate.com" rather than "http://www.e-numerate.com")
|
doc_title |
The
title that will appear at the top of reports, view windows for
the document, etc. Should be a short (less than 100 characters)
description of the document's data.
|
timestamp |
Generated
by the application that created the RDL document. The
timestamp is in the form YYYY.MMDDHHMMSS. Note that it
can apply to either the time that the document was created or
the time the data was accessed for creation of the document.
|
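The YYYY.MMDDHHMMSS form maps directly onto a strptime pattern. The sketch below shows one way an application could read and write such stamps in Python; the function names are hypothetical, and the sample value is invented.

```python
from datetime import datetime

# strptime/strftime pattern for the YYYY.MMDDHHMMSS form described above.
RDL_TIMESTAMP_FORMAT = "%Y.%m%d%H%M%S"


def parse_rdl_timestamp(stamp: str) -> datetime:
    """Convert a stamp such as '1999.1025143000' to a datetime."""
    return datetime.strptime(stamp, RDL_TIMESTAMP_FORMAT)


def format_rdl_timestamp(moment: datetime) -> str:
    """Convert a datetime back to the YYYY.MMDDHHMMSS form."""
    return moment.strftime(RDL_TIMESTAMP_FORMAT)


parsed = parse_rdl_timestamp("1999.1025143000")
print(parsed.isoformat())            # 1999-10-25T14:30:00
print(format_rdl_timestamp(parsed))  # 1999.1025143000
```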
version |
A
string (less than 255 characters) defined by the publisher of
the document. Version naming policies are up to the creator
of the document. Typical (and suggested) values are of the form
"N.N.N.N".
|
expiration |
The
date and time that the data should no longer be relied on. Generally,
this is the time that the next update is expected to be released.
The expiration stamp is in the form YYYY.MMDDHHMMSS
|
freq_of_update |
Designates
the frequency with which the data is updated. Choices are:
Year, Quarter, Month, Week, Day, Hour, Minute, Second. This
is used by applications which would like to schedule updates to data.
|
num_lineitems |
An
integer describing the number of line items in the attached
line_item_set. Note that this is optional (the receiving
application can, after all, count). It is useful as a checksum, however.
|
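A receiving application could use the declared count as a sanity check along these lines. This is a sketch: the function name is hypothetical, the declared value is passed in as a plain string so that no assumption is made about which element carries the num_lineitems attribute, and the sample line_item_set is invented.

```python
import xml.etree.ElementTree as ET


def line_item_count_matches(line_item_set: ET.Element, declared: str) -> bool:
    """Compare a declared num_lineitems value with the actual count."""
    actual = len(line_item_set.findall("line_item"))
    return actual == int(declared)


# Invented sample with two line items.
sample_set = ET.fromstring(
    "<line_item_set>"
    "<data_x>1998, 1999</data_x>"
    "<line_item><data_y>1, 2</data_y></line_item>"
    "<line_item><data_y>3, 4</data_y></line_item>"
    "</line_item_set>"
)

print(line_item_count_matches(sample_set, "2"))  # True
print(line_item_count_matches(sample_set, "3"))  # False
```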
num_datapoints |
As
with num_lineitems, this is optional, but useful for checking
to make sure the line_item_set has not been accidentally changed or corrupted.
|
x_indexes |
Numerator
Lite uses this attribute to select the three data fields
to use as representative data fields in the TreePanel reports. "x_indexes"
is a comma-delimited string of three integers, each of which
is an index to a selected field. Note that the indexes key off
the END of the list of fields. So, for example, to show
the last three fields in the tree, use x_indexes = "-3,-2,-1".
Indexes based on the end were chosen because most people reading
a timeseries will want to see the most recent data.
|
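Indexes that key off the end of the list behave like negative indexes in Python, so the selection can be sketched in a few lines. The function name select_fields and the sample field list are hypothetical.

```python
def select_fields(fields: list[str], x_indexes: str) -> list[str]:
    """Pick fields using a comma-delimited string of end-based indexes."""
    positions = [int(part) for part in x_indexes.split(",")]
    # Negative positions count back from the end: -1 is the last field.
    return [fields[pos] for pos in positions]


years = ["1995", "1996", "1997", "1998", "1999"]
print(select_fields(years, "-3,-2,-1"))  # ['1997', '1998', '1999']
```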
first_li_withdata |
An
integer index that identifies the line item that is to be displayed
on the chart when the document is loaded in the RDL Data Viewer.
|
copyright_cite |
This
is the string that will appear on reports, etc. regarding ownership
of the particular data set in the RDL data document. A
typical example would be "Copyright 1999, e-Numerate Solutions
Inc. All Rights Reserved."
|
holder |
Full
legal name of the owner of the copyright. e.g., "RDL,
Inc."
|
license_type |
Typical
license types would be "None - Proprietary and Confidential",
"Public Domain", "Free Use, Rights Reserved by
Owner", "Pay per use", and so forth.
|
warranty |
Most
data preparers will not provide any warranty for their data
sets or the data documents that contain data. Rather,
the warranty item will be a limitation of liability on the part
of the owner. Generally, this attribute will therefore
be "No warranty is provided for this data document."
|
disclaimer |
Most
software and data providers will disclaim any liability for
improper use of the data, or any responsibility for any use
whatsoever. Typical disclaimer: "The provider of this information
makes no representation or warranty of any type; the user accepts
full responsibility for any use of this document."
|
terms |
If
there are any payment terms, length of use, or other terms, this is the place to put the notice.
|
date |
Dates
are strings in the form of "YYYY.MMDD".
|
email |
Full
email address of the copyright owner.
|
state |
State
in the USA. Two letter postal abbreviation.
|
country |
Two
or three letter abbreviation for the country of copyright ownership.
This is important where countries have different copyright laws.
|
role |
What
role the party played in the creation of this document. Current
possibilities are: data_source, rdldoc_source, and formatting_source.
|
name |
Name
of person to contact at the contact organization.
|
company |
Company
name of the contact organization.
|
address |
Address
of person to contact at the contact organization.
|
city |
City
of person to contact at the contact organization.
|
comments |
Any
particular information about the contact that might be useful to the user.
|
behavior |
Reserved
for XPointer use.
|
content_role |
Reserved
for XPointer use.
|
content_title |
Reserved
for XPointer use.
|
role |
Reserved
for XPointer use.
|
title |
Reserved
for XPointer use. This is the string that appears in the
application as a hyperlink title. For example, in an HTML
browser it will appear as highlighted, underlined text.
|
show |
Reserved
for XPointer use.
|
actuate |
Reserved
for XPointer use.
|
line_item_set_type |
Currently,
the RDL Data Viewer recognizes the following types of
line item sets: "TimeSeries", "Category", and
"XY". The "type" in this context is
the characterization of the x axis values: do the values in
the line items represent a time series, a categorization
(sometimes called a crosstabulation), or merely an XY scatterplot?
|
time_period |
If
the line items represent a time series, the valid period lengths
are: "Year", "Quarter", "Month",
"Week", "Day", "Hour", "Minute", and "Second".
|
character_set |
Reserved
for future use.
|
missing_values |
Reserved
for future use.
|
null_values |
Reserved
for future use.
|
zero_values |
Reserved
for future use.
|
dates_values |
Reserved
for future use.
|
percentages |
Reserved
for future use.
|
x_title |
As
the data is displayed in a chart, what title is displayed on the x axis.
|
format |
A
string providing a template for the default representation of
the x axis values. The strings are those familiar from spreadsheet programs:
# - digit(s), zeros suppressed
0 - digit(s), zeros displayed
. - decimal point
, - separator
A - z, other characters - displayed literally
|
x_notes |
Any
footnotes regarding the x axis values.
|
x_desc |
Any
description regarding the x axis values.
|
x_prec |
Number
of decimal places for purposes of axis label display. Negative
values round to the left of the decimal point. For
example, a precision of "2" will display a number
as "8,254.43". That same number with a precision
of "-2" will be displayed as "8,300".
The underlying representation of the number will be the full value;
only the formatting and representation on the screen will
change. In the current RDL Data Viewer this is
used primarily for formatting the axis labels.
|
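The two cases in the x_prec description ("2" and "-2") can be reproduced with ordinary rounding followed by display formatting. The sketch below is one possible reading of the rule, not the RDL Data Viewer's actual code; the function name is hypothetical.

```python
def format_with_precision(value: float, prec: int) -> str:
    """Format a number for display; the stored value is left untouched."""
    # Negative precision rounds to the left of the decimal point.
    rounded = round(value, prec)
    # Never show a negative number of decimal places.
    decimals = max(prec, 0)
    return f"{rounded:,.{decimals}f}"


amount = 8254.43
print(format_with_precision(amount, 2))   # 8,254.43
print(format_with_precision(amount, -2))  # 8,300
print(amount)                             # 8254.43, the full value is unchanged
```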
x_unit |
All
numerical quantities have measurement units. "3"
is just a designator for "three of something" unless
you specify what that something is. That is, you could have
"3 cars", or "3 boxes", "3 dollars",
and so forth. The fundamental measurement dimension is
the "unit".
As noted below, the unit by itself may not be sufficient
to define the quantity. You may have "$ in thousands",
or "feet per second", or "Per capita income
($), adjusted for inflation, 1995 = 100". Obviously,
the whole concept of measurement can get complicated.
For RDL, measurements break down into the following
atomic pieces:
units * magnitude (modifier) measure * scale [adjustment]
Example: "$ in thousands per million people (inflation
adjusted)" breaks down as follows:
x_unit = "$" (which may be further qualified as,
say, "US$")
x_mag = "3" (i.e., thousands are 10 to the 3rd power)
x_mod = "/" ("per" is the same as dividing)
x_measure = "person" (the singular of "people";
the RDL Data Viewer will convert)
x_scale = "6" (millions are 10 to the 6th power)
x_adjustment = "inflation adjusted, 1995 = 100"
(Any special notes go here)
Obviously, following a standard vocabulary and spelling for
units, measures, etc. is critical.
|
x_mag |
The
"magnitude" of the quantity; this is the multiplier
found in the NUMERATOR of a quantity descriptor. For example,
in the descriptor "Yen in Billions", the magnitude
is "9" because a billion is 10 to the 9th power.
Magnitudes are expressed as numeric powers of 10 so that the application
that reads it can make rapid transformations, and also so
that the potential confusion of variant spellings and usages
(million, mille, MM, etc.) is avoided.
|
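Because the magnitude is a power of 10, converting between a displayed amount and its full value is a single multiplication. A minimal sketch, with hypothetical function names and invented sample amounts:

```python
def full_value(displayed: float, mag: str) -> float:
    """Expand a displayed amount using its magnitude (a power of 10)."""
    return displayed * 10 ** int(mag)


def rescale(displayed: float, from_mag: str, to_mag: str) -> float:
    """Re-express an amount at a different magnitude."""
    return displayed * 10.0 ** (int(from_mag) - int(to_mag))


# "Yen in Billions" has a magnitude of "9".
print(full_value(2.5, "9"))    # 2500000000.0
# The same amount expressed in millions (magnitude "6").
print(rescale(2.5, "9", "6"))  # 2500.0
```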
x_mod |
The
modifier is expressed as a string that is associated with the
division operator. For example, "per" in "$
per capita" means "$ amount / population".
Numerator Lite uses this attribute to construct y axis
labels and descriptors in reports when the user has made a
transformation to the descriptor and the y_axis_label attribute
is no longer appropriate.
|
x_measure |
The
measure can be thought of as the "units" in the denominator.
Example: "Miles per Hour" is the same as "miles
/ hour", where "miles" is the x_unit, and "hour"
is the measure. Note that measures can be associated with
multipliers just as units can be. Whereas the multiplier in
the numerator is called the "magnitude", in the denominator
it is called the "scale".
Numerator Lite uses this attribute to construct y axis
labels and descriptors in reports when the user has made a
transformation to the descriptor and the y_axis_label attribute
is no longer appropriate.
|
x_scale |
The
scale is the multiplier in the denominator of the descriptor.
It works the same as the magnitude: it is a string that expresses
the power of 10 that should be multiplied by the x_measure.
Numerator Lite uses this attribute to construct y axis
labels and descriptors in reports when the user has made a
transformation to the descriptor and the y_axis_label attribute
is no longer appropriate.
|
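Taken together, the magnitude multiplies the numerator and the scale multiplies the denominator. The sketch below reduces a displayed quantity to plain units per single measure. It is an illustration under that reading only; the function name and the sample amount are invented.

```python
def per_single_measure(displayed: float, mag: str, scale: str) -> float:
    """Reduce a displayed quantity to plain units per one measure."""
    numerator = displayed * 10 ** int(mag)  # e.g. thousands -> units
    denominator = 10 ** int(scale)          # e.g. one million people
    return numerator / denominator


# "4 ($ in thousands) per million people": magnitude "3", scale "6".
dollars_per_person = per_single_measure(4.0, "3", "6")
print(dollars_per_person)
```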
x_adjustment |
Any
string that provides a special qualifier to the descriptor.
|
x_links |
This
can be a comma-delimited string of URLs.
|
class_name |
This
is a string naming a "class" of data to which this x axis
can belong. This attribute is used in advanced features of RDL
such as macro transformations.
|
parent_class |
A
string designating the parent class. This attribute is
used in advanced features of RDL such as macro transformations.
|
li_ID |
A
unique ID number for the line_item element. All line_item
elements in the line_item_set are numbered from 0 to n (where
n is the number of line_item elements). It must be unique and in order.
|
li_legend |
A
string describing the line item. In the RDL Data Viewer,
the legend appears in the leftmost column of the tree views,
in the chart legend, and in other places where the line item
must be identified in plain language. It does not need to be unique.
|
li_title |
A
string defining the general subject of the line item. In
the RDL Data Viewer, this is used as the title of the
chart, and as titles in reports. Typically, titles are all the
same for line items that are grouped together, but there are
no requirements. The title should merely be selected on
the basis of how clear it will make a chart to the user.
|
li_cat |
Category. Not
currently used in the chart or tree views of the RDL Data
Viewer, this is an internal designator (plain language) of
the subject matter of the line item or group of line items.
|
y_axis_title |
A
string (less than 50 characters) which will appear on the y
axis as the title of that axis. If the user applies a transformation
to any variable in the descriptor, this hard-coded y axis title
will be replaced by one that is generated by the RDL Data Viewer.
|
level |
When
the set of line items is presented in a tree view, it will be
possible to group them in a hierarchical tree, much as file
information is presented in a file directory list. The user
can expand and contract "nodes" of the tree to see
greater or lesser amounts of detail.
This "level" attribute designates the number of
indentations this particular line item should have relative
to the root. So, for example, if a line item has a level
"1", it will appear as a child of the root node.
If it has a "2" level, it will appear as a child
of the most recent "1" level, and so forth.
|
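The rule "child of the most recent item one level up" can be implemented with a simple stack. The sketch below builds a nested tree from (legend, level) pairs in document order. The function name, the dictionary layout, and the sample legends are all hypothetical, and levels are assumed to start at 1.

```python
def build_tree(items: list[tuple[str, int]]) -> dict:
    """Nest line items under the most recent item with a lower level."""
    root = {"legend": "root", "level": 0, "children": []}
    stack = [root]  # path from the root to the most recent node
    for legend, level in items:
        if level < 1:
            raise ValueError("levels are assumed to start at 1")
        node = {"legend": legend, "level": level, "children": []}
        # Drop nodes at the same or a deeper level; what remains on
        # top of the stack is the parent of the new node.
        while stack[-1]["level"] >= level:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root


tree = build_tree([
    ("Assets", 1), ("Cash", 2), ("Inventory", 2),
    ("Liabilities", 1), ("Debt", 2),
])
print([child["legend"] for child in tree["children"]])
# ['Assets', 'Liabilities']
```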
relation |
The
relation of this line item to its "parent" node in
a hierarchical tree listing. (The parent node is the most
recent node that is one level above the current line item.) By
default, all line items have a relationship of "ChildStyle"
to their parents, but there are other relation attribute values:
CompPlus, CompMinus, CompTimes, CompDivide and so forth. For
a complete listing and description of these, see the documentation
for your RDL formatting application.
The different relation attributes are designed to allow the data
publisher to designate different icons that appear to the
left of the line item in the TreeView. These icons give
visual clues to the user as to how each line item relates to its parent.
|
li_notes |
Any
string may be placed here to show footnotes. Generally,
if the source of this line item is different from the overall
data_source of the document, you will want to note that here.
|
li_desc |
Any
string that provides additional description regarding the line
item. These descriptions tend to be less formal than the footnotes,
as they appear in fewer reports.
|
li_prec |
Number
of decimal places for purposes of axis label display. Negative
values round to the left of the decimal point. For
example, a precision of "2" will display a number
as "8,254.43". That same number with a precision of
"-2" will be displayed as "8,300".
The underlying representation of the number will be the full value;
only the formatting and representation on the screen will
change. In the current RDL Data Viewer this is used
primarily for formatting the axis labels.
|
li_unit |
All
numerical quantities have measurement units. "3"
is just a designator for "three of something" unless
you specify what that something is. That is, you could
have "3 cars", or "3 boxes", "3 dollars",
and so forth. The fundamental measurement dimension is the "unit".
As noted below, the unit by itself may not be sufficient to define
the quantity. You may have "$ in thousands",
or "feet per second", or "Per capita income
($), adjusted for inflation, 1995 = 100". Obviously,
the whole concept of measurement can get complicated.
For RDL, measurements break down into the following atomic pieces:
units * magnitude (modifier) measure * scale [adjustment]
Example:
"$ in thousands per million people (inflation adjusted)"
breaks down as follows:
li_unit = "$" (which may be further qualified as, say, "US$")
li_mag = "3" (i.e., thousands are 10 to the 3rd power)
li_mod = "/" ("per" is the same as dividing)
li_measure = "person" (the singular of "people";
the RDL Data Viewer will convert)
li_scale = "6" (millions are 10 to the 6th power)
li_adjustment = "inflation adjusted, 1995 = 100"
(Any special notes go here)
Obviously, following a standard vocabulary and spelling for units, measures,
etc. is critical.
|
li_mag |
The
"magnitude" of the quantity; this is the multiplier
found in the NUMERATOR of a quantity descriptor. For example,
in the descriptor "Yen in Billions", the magnitude
is "9" because a billion is 10 to the 9th power.
Magnitudes are expressed as numeric powers of 10 so that the application
that reads it can make rapid transformations, and also so
that the potential confusion of variant spellings and usages
(million, mille, MM, etc.) is avoided.
|
li_mod |
The
modifier is expressed as a string that is associated with the
division operator. For example, "per" in "$
per capita" means "$ amount / population".
Numerator Lite uses this attribute to construct y axis labels and
descriptors in reports when the user has made a transformation
to the descriptor and the y_axis_label attribute is no longer appropriate.
|
li_measure |
The
measure can be thought of as the "units" in the denominator.
Example: "Miles per Hour" is the same as "miles
/ hour", where "miles" is the li_unit, and "hour"
is the measure. Note that measures can be associated with
multipliers just as units can be. Whereas the multiplier
in the numerator is called the "magnitude", in the
denominator it is called the "scale".
Numerator Lite uses this attribute to construct y axis labels and
descriptors in reports when the user has made a transformation
to the descriptor and the y_axis_label attribute is no longer appropriate.
|
li_scale |
The
scale is the multiplier in the denominator of the descriptor.
It works the same as the magnitude: it is a string that expresses
the power of 10 that should be multiplied by the li_measure.
Numerator Lite uses this attribute to construct y axis labels and
descriptors in reports when the user has made a transformation
to the descriptor and the y_axis_label attribute is no longer appropriate.
|
li_adjustment |
Any
string that provides a special qualifier to the descriptor.
|
li_aggregation |
Occasionally,
the user will want to "aggregate" or "deaggregate"
data based on differing x axis transformations. This attribute
explains to the RDL Data Viewer how to handle this particular
line item when such transformations are being attempted.
Example: A line_item_set presents bank account information; each line
item is a time series and presents quarterly data, and the
user may wish to see the data on an annual basis. For
some line items, that is a matter of simply summing up four
quarters' worth of data (e.g., deposits), but for other
line items (e.g., closing balance), you only want to show
the last quarter's value.
Currently accepted values are: "sum", "average",
"minimum", "maximum", "first",
"last", "none".
|
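The quarterly-to-annual case in the li_aggregation description can be sketched as a lookup from the attribute value to a reducing function. This is an illustration only: the function and table names are hypothetical, the sample figures are invented, and treating "none" as "leave the values untouched" is an assumption.

```python
# One reducing function per accepted li_aggregation value.
AGGREGATORS = {
    "sum": sum,
    "average": lambda values: sum(values) / len(values),
    "minimum": min,
    "maximum": max,
    "first": lambda values: values[0],
    "last": lambda values: values[-1],
}


def aggregate(values: list[float], method: str, group_size: int = 4) -> list[float]:
    """Collapse consecutive groups of values (e.g. 4 quarters per year)."""
    if method == "none":
        # Assumption: "none" means the line item is left as it is.
        return list(values)
    reduce_group = AGGREGATORS[method]
    return [
        reduce_group(values[start:start + group_size])
        for start in range(0, len(values), group_size)
    ]


deposits = [100, 120, 90, 110, 130, 140, 150, 160]           # two years, quarterly
closing_balance = [500, 520, 510, 530, 560, 590, 610, 640]

print(aggregate(deposits, "sum"))          # [420, 580]
print(aggregate(closing_balance, "last"))  # [530, 640]
```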
xlink:form |
Under
the XLink specification, hyperlinks may be "simple"
links or "extended" links. Simple hyperlinks
are the familiar "jump" links of HTML browsers: clicking
on that link will close the current page and open the target page.
|
href |
The
standard string for a URL (e.g., http://www.e-numerate.com).
|
"simple"
links |
Traditional
"jump" hyperlinks. Clicking on this link in the browser
window will close the current page and open the target page.
|
"extended"
links |
Reserved
for future use.
|