Data Dictionary
Last Modified: 7/19/99 11:59 AM
Contents:
1. Model DTD and tag description.
Note: the model below assumes that dictionary tags
are defined elsewhere. Variables are referred to by name.
<!ELEMENT data-dictionary ((categorical | ordinal | continuous)+)>
<!ELEMENT categorical (category+)>
<!ATTLIST categorical
    name CDATA #REQUIRED >
<!ELEMENT category EMPTY>
<!ATTLIST category
    value         CDATA #REQUIRED
    display-value CDATA #IMPLIED
    proportion    CDATA #IMPLIED
    missing       (true | false) "false" >
<!ELEMENT ordinal (order+)>
<!ATTLIST ordinal
    name CDATA #REQUIRED >
<!ELEMENT order EMPTY>
<!ATTLIST order
    value         CDATA #REQUIRED
    display-value CDATA #IMPLIED
    rank          CDATA #REQUIRED
    proportion    CDATA #IMPLIED
    missing       (true | false) "false" >
<!-- The predicates indicate the values that represent missing values -->
<!ELEMENT continuous ((%predicates;)*)>
<!ATTLIST continuous
    name                 CDATA #REQUIRED
    minimum              CDATA #IMPLIED
    maximum              CDATA #IMPLIED
    mean                 CDATA #IMPLIED
    median               CDATA #IMPLIED
    standard-deviation   CDATA #IMPLIED
    inter-quartile-range CDATA #IMPLIED >
data-dictionary - the container whose contents define the complete set of
fields referenced by any model in the file. Any field referenced anywhere in
the PMML file must be declared here, as a categorical, ordinal, or continuous
element.
categorical - one of the three data types that a field may have. The field is
identified by its required name attribute and contains one or more category
elements.
category - one of the values that a categorical field may take.
When exporting a PMML file, make an entry for each category that was
found in the training data. If a value represents missing data, set its
missing attribute to true (the default is false).
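As an illustration of the categorical and category tags, a fragment that conforms to the DTD above might look like the following sketch. The field name, values, and display values are hypothetical, and writing proportion as a fraction between 0 and 1 is an assumption, since this section does not specify its format.

```xml
<!-- Hypothetical categorical field; all names and values are illustrative. -->
<categorical name="marital-status">
  <category value="S" display-value="Single" proportion="0.42"/>
  <category value="M" display-value="Married" proportion="0.53"/>
  <!-- "?" is assumed to be the code the training data used for missing entries. -->
  <category value="?" display-value="Unknown" proportion="0.05" missing="true"/>
</categorical>
```

The first two entries omit the missing attribute, so they take its DTD default of false.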
ordinal - one of the three data types that a field may have. The field is
identified by its required name attribute and contains one or more order
elements.
order - one of the values that an ordinal field may take. Each entry carries
a required rank attribute. When exporting a PMML file, make an entry for each
ordinal value that was found in the training data. If a value represents
missing data, set its missing attribute to true (the default is false).
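An ordinal field could be written along the following lines. This is only a sketch: the field name and values are hypothetical, and numbering ranks from 1 upward and giving the missing-value entry a rank of 0 are assumptions, because this section states only that rank is required.

```xml
<!-- Hypothetical ordinal field; rank numbering is an assumed convention. -->
<ordinal name="education-level">
  <order value="HS" display-value="High school" rank="1" proportion="0.48"/>
  <order value="BA" display-value="Bachelor's degree" rank="2" proportion="0.37"/>
  <order value="MA" display-value="Master's degree" rank="3" proportion="0.12"/>
  <!-- rank is #REQUIRED even on an entry that stands for missing data. -->
  <order value="NA" display-value="Not recorded" rank="0" proportion="0.03" missing="true"/>
</ordinal>
```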
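continuous - one of the three data types that a field may have. The field is
identified by its required name attribute; the summary statistics (minimum,
maximum, mean, median, standard-deviation, inter-quartile-range) are optional.
To indicate which continuous values represent missing values, list a series
of predicates as child elements.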
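A continuous field with no missing-value predicates might be declared as in this sketch. The field name and statistics are invented for illustration. The predicate children are left out because their tags are defined elsewhere, and the content model allows zero of them.

```xml
<!-- Hypothetical continuous field; every statistic below is illustrative. -->
<continuous name="age"
            minimum="18"
            maximum="91"
            mean="41.3"
            median="39"
            standard-deviation="12.7"
            inter-quartile-range="17"/>
```

Only name is required; any of the summary attributes can be omitted when the exporting tool does not compute them.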