Cover Pages: SGML/XML: Elements versus attributes, April 1998

Elements versus Attributes

XML-DEV Discussion - 1998 and Following...

In April 1998, several messages were posted to XML-DEV relating to principles that might be used to decide whether an XML encoding should (best) use elements or attributes. Some of these are collected below, together with a few subsequent posts. See also: "When Should I Use Elements, and When Should I Use Attributes?" and the database section "Elements versus attributes - How Do I Decide?"

From owner-xml-dev@ic.ac.uk Mon Apr  6 18:00:54 1998
Date:     Mon, 6 Apr 1998 15:51:49 -0700 (PDT)
From:     Roy Tennant <rtennant@library.berkeley.edu>
To:       xml-dev@ic.ac.uk
Subject:  When is an attribute an attribute?

I've been trying to figure this out for a while with no success. It seems
to me that there are several quite different ways one can encode
information in XML. Are all of the following correct? When and why would
you choose one over another? Does it matter? Thank you for your indulgence
as I puzzle out what must surely be readily apparent to most of you.

Example 1:
---------

<BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>

Example 2:
---------

<BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>

Example 3:
---------

<BOOK>
   <TITLE>The Call of the Wild</TITLE>
   <AUTHOR>London, Jack</AUTHOR>
</BOOK>

Thanks,
Roy Tennant

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From owner-xml-dev@ic.ac.uk Mon Apr  6 19:47:41 1998
Date: Mon, 06 Apr 1998 19:35:04 -0500
From: len bullard <cbullard@hiwaay.net>
Subject: Re: When is an attribute an attribute?
Sender: owner-xml-dev@ic.ac.uk
To: Roy Tennant <rtennant@library.berkeley.edu>
Cc: xml-dev@ic.ac.uk
Reply-to: len bullard <cbullard@hiwaay.net>
Message-id: <352974B8.2964@hiwaay.net>
Organization: Blind Dillo
MIME-version: 1.0
X-Mailer: Mozilla 3.01 (Win95; I)
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7bit
Precedence: bulk
References: <Pine.OSF.3.96.980406154414.5362G-100000@library.berkeley.edu>
Status: RO

Roy Tennant wrote:
> 
> I've been trying to figure this out for a while with no success. It seems
> to me that there are several quite different ways one can encode
> information in XML. Are all of the following correct? 

Yes.

> When and why would
> you choose one over another? Does it matter? Thank you for your indulgence
> as I puzzle out what must surely be readily apparent to most of you.

Ok, a DTD really helps this sort of discussion along, but 
FWIW:

> Example 1:
> ---------
> 
> <BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>

Use empty elements and attributes for tag bags, basically, 
if the datum has no frequency and order requirements (only 
occurs once somewhere in the attribute list). 

NOTE:  I haven't looked to see if XML dropped the 
SGML restriction on repeated values in attlist 
decls.

> Example 2:
> ---------
> 
> <BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>

Use this if you don't care that the string inside the 
tags is only differentiated by the BOOK, that is, 
semantically, there is no difference between this 
and 

<BOOK AUTHOR="London, Jack">Love that Wolf!!</BOOK>

or IOW, your application has to know that is a title.

> Example 3:
> ---------
> 
> <BOOK>
>    <TITLE>The Call of the Wild</TITLE>
>    <AUTHOR>London, Jack</AUTHOR>
> </BOOK>

Use this when it is important to know there
is a title and author (i.e., this BOOK
HAS-A TITLE, HAS-A AUTHOR; the
string "The Call of the Wild" IS-A TITLE).
Given the element type declaration, you can tell which order
they should come in, whether there are multiple
authors, whether there are alternate titles, etc.
The semantic is application dependent.  For 
a linking semantic, you might be counting 
nodes inside the BOOK.  For rendering, 
you might be assigning the font value 
based on the context of the book element.

len
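
A minimal sketch of the distinction Len draws, added here for illustration and not part of his post. It uses Python's standard-library xml.etree.ElementTree; the helper name title_of is hypothetical, and Example 1 is written with the XML empty-element close "/>". The point to notice is the last branch: for Example 2, nothing in the markup says the text is a title, so the application has to supply that knowledge.

```python
import xml.etree.ElementTree as ET

EXAMPLE_1 = '<BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>'
EXAMPLE_2 = '<BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>'
EXAMPLE_3 = (
    "<BOOK>"
    "<TITLE>The Call of the Wild</TITLE>"
    "<AUTHOR>London, Jack</AUTHOR>"
    "</BOOK>"
)


def title_of(book_xml):
    """Return the title of a BOOK, whichever of the three encodings is used."""
    book = ET.fromstring(book_xml)
    if "TITLE" in book.attrib:
        # Example 1: the attribute name says the value is a title.
        return book.attrib["TITLE"]
    title = book.find("TITLE")
    if title is not None:
        # Example 3: the child element type says the value is a title.
        return title.text
    # Example 2: the markup does not say what the text is;
    # the application simply has to know that it is a title.
    return book.text


for doc in (EXAMPLE_1, EXAMPLE_2, EXAMPLE_3):
    print(title_of(doc))
```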

From owner-xml-dev@ic.ac.uk Mon Apr  6 19:57:15 1998
From: Jim Amsden <jamsden@us.ibm.com>
To: <xml-dev@ic.ac.uk>
Subject: Re: When is an attribute an attribute?
Message-ID: <5040100016970517000002L072*@MHS>
Date: Mon, 6 Apr 1998 20:50:31 -0400
MIME-Version: 1.0
Content-Type: text/plain
Sender: owner-xml-dev@ic.ac.uk
Precedence: bulk
Reply-To: Jim Amsden <jamsden@us.ibm.com>
Status: RO

I think it's best to treat this as an object modeling problem first, and then
an XML representation. The distinction between attribute and content element
then becomes the distinction between an attribute and a containment
relationship with another object. Object attributes are atomic, referentially
transparent characteristics of an object that have no identity of their own.
Generally this corresponds to primitive data types, but this can be somewhat
arbitrary too (e.g., Strings, Date, etc.). Taking a more logical view, an
attribute names some characteristic of an object that models part of its
internal state, and is not considered an object in its own right. That is, no
other objects have relationships to an attribute of an object, but rather to
the object itself.

So if the thing you want to capture has internal structure of its own, or can
be referenced through a link, or can be contained in more than one element,
then it's an element; otherwise it's probably an attribute. Note that attributes
have a number of advantages over content elements:

1. they can have names that indicate the role the value plays in the element.
Element contents have content names, but there is no way to say what role the
content plays in any particular element that contains it.

2. attributes can have default values.

3. attributes have (minimal) data types

4. attributes take up less space as there is no need for an end tag

5. attributes are easier to access in DOM.

There are also some disadvantages:

1. attributes aren't as convenient for large values, or binary entities.

2. values containing quotes can be a bother.

3. attributes can't contain other elements. This isn't really a disadvantage,
but part of what it means to be atomic.

4. white space can't be ignored in an attribute.

My recommendation is to use attributes unless you can't, and certainly use them
to avoid mixed data content in elements whenever possible. The idea is to
encapsulate as much as you can in an individual object but not too much. Use
the principles of data normalization, they work fine here too.
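
Two of the disadvantages listed above can be sketched in a few lines of Python. This is an illustration added to the thread, not part of Jim's post, and it assumes only the standard library: xml.sax.saxutils.quoteattr for writing attribute values that contain quotes, and xml.etree.ElementTree to show that an attribute value cannot hold markup. The sample values are made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Disadvantage 2: a value containing quotes needs care when written out.
# quoteattr() chooses a delimiter and escapes whatever still clashes with it.
only_double = quoteattr('London, "Jack"')
both_kinds = quoteattr('''Jack "Wolf" O'Brien''')
print(only_double)
print(both_kinds)

# Disadvantage 3: attribute values are atomic. A "<" inside an attribute
# value is not well-formed, so nested elements are rejected by the parser.
try:
    ET.fromstring('<BOOK AUTHOR="<LAST>London</LAST>"/>')
    nested_markup_accepted = True
except ET.ParseError:
    nested_markup_accepted = False
print(nested_markup_accepted)
```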

From owner-xml-dev@ic.ac.uk Mon Apr  6 20:21:04 1998
Date: Tue, 07 Apr 1998 11:12:38 +1000
From: Rick Jelliffe <ricko@allette.com.au>
Subject: Re: When is an attribute an attribute?
Sender: owner-xml-dev@ic.ac.uk
To: xml-dev@ic.ac.uk
Reply-to: Rick Jelliffe <ricko@allette.com.au>
Message-id: <005101bd61c2$7e02bbe0$a30b4ccb@NT.JELLIFFE.COM.AU>

From: Jim Amsden <jamsden@us.ibm.com>

> I think it's best to treat this as an object modeling problem first, and
> then an XML representation

Without going against object modeling or any other view, you should first
be aware of any constraints in XML (Len's comment) and in your immediate
software.

If your editing software makes attributes easy, then use attributes. If your
rendering/draft software does not support attributes well, use elements.
You can do a simple translation of your DTD and document to convert
from one form to another anyway. A DTD does not need to be set in stone.

A markup language has to reconcile data modeling needs and human factors,
with the latter being the most important.

Rick Jelliffe
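
The "simple translation" from one form to the other can be sketched for the document side as follows. This is an illustration added to the thread, not Rick's own code: the function name attributes_to_elements is hypothetical, it assumes Python's standard-library xml.etree.ElementTree, and it leaves the matching DTD rewrite aside.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(elem):
    """Move every attribute of elem and its descendants into child elements."""
    # Snapshot the tree first so the children added below are not revisited.
    for node in list(elem.iter()):
        # Attributes are unordered, so sort by name to get a stable result.
        for position, (name, value) in enumerate(sorted(node.attrib.items())):
            child = ET.Element(name)
            child.text = value
            # New children go in front of any existing child elements;
            # existing character data of the node stays where it was.
            node.insert(position, child)
        node.attrib.clear()
    return elem


book = ET.fromstring('<BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>')
converted = ET.tostring(attributes_to_elements(book), encoding="unicode")
print(converted)
```

Going the other way is only safe for child elements that hold plain text and occur at most once, which is the constraint Len mentions.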

From owner-xml-dev@ic.ac.uk Mon Apr  6 22:29:52 1998

Date: Mon, 6 Apr 1998 20:20:26 -0700
Message-Id: <199804070320.UAA00399@unready.microstar.com>
From: David Megginson <ak117@freenet.carleton.ca>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
To: xml-dev@ic.ac.uk
Subject: When is an attribute an attribute?
In-Reply-To: <Pine.OSF.3.96.980406154414.5362G-100000@library.berkeley.edu>
References: <Pine.OSF.3.96.980406154414.5362G-100000@library.berkeley.edu>

Roy Tennant writes:

 > I've been trying to figure this out for a while with no success. It
 > seems to me that there are several quite different ways one can
 > encode information in XML. Are all of the following correct? When
 > and why would you choose one over another? Does it matter? Thank
 > you for your indulgence as I puzzle out what must surely be readily
 > apparent to most of you.

It's not self-evident, and everyone has their own strongly-held
opinions.  Database people are tempted to force everything into
attributes, because attributes are (slightly) typed while character
data is not.  Generally, though, you need to consider the following:

- attribute values are harder to search for in search engines
- attribute values often don't appear on the screen in editing tools
  (you have to open a special dialog or popup to see them)
- attribute values can have no substructure
- attribute values can be slightly more awkward to access in
  processing APIs
- attributes are unordered, so there is no standard way to specify
  that one attribute's value should precede the other's (there is no
  guarantee that an API will give you the attributes in the same order
  that you specified them)

My rule is to use attributes in markup just as I would use footnotes
or endnotes in a book -- to provide extra information that is not part
of the main content, but that is useful to know about it.  By this
rule, all of your examples are correct, but under different
circumstances.

 > Example 1:
 > ---------
 > 
 > <BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>

In this case, all that really matters is that there's a book there.
An XML document author might see <BOOK> in the main editing window,
but get the attribute values in a pop-up only by clicking the mouse.
It's not essential to know the book's title or author, and it is
unlikely that anyone would want to search for it.

Yes: insurance company list of property to be replaced; customs list of
     objects declared at border
No:  online bookstore; library catalogue

 > Example 2:
 > ---------
 > 
 > <BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>

In this one, the title matters but the author is just extra
information.  You'd probably use this for encoding a title inline,
where the title will be printed as part of the paragraph (possibly in
italics), but the author's name would appear only in a separate index
or popup.

  <PARA>I enjoyed the book <BOOK AUTHOR="London, Jack">The Call of the
  Wild</BOOK>.</PARA>

 > Example 3:
 > ---------
 > 
 > <BOOK>
 >    <TITLE>The Call of the Wild</TITLE>
 >    <AUTHOR>London, Jack</AUTHOR>
 > </BOOK>

In this one, both the title and author are important -- you'd use this
for the citation line of a quotation, in a bibliography, at an online
bookstore, or in a library catalogue.

I hope this helps.

All the best,

David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/
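
The "footnote" rule can be seen in how a processor treats the inline example. The sketch below is an illustration added to the thread, not part of David's post; it assumes Python's standard-library xml.etree.ElementTree. The running text is assembled from character data alone, while the author only appears when the attribute is asked for explicitly.

```python
import xml.etree.ElementTree as ET

para = ET.fromstring(
    '<PARA>I enjoyed the book <BOOK AUTHOR="London, Jack">The Call of the '
    "Wild</BOOK>.</PARA>"
)

# The running text comes from character data only; attribute values are
# not part of it and have to be requested separately.
running_text = "".join(para.itertext())
authors = [book.get("AUTHOR") for book in para.iter("BOOK")]

print(running_text)
print(authors)
```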

From owner-xml-dev@ic.ac.uk Mon Apr  6 18:22:19 1998
Date: Mon, 06 Apr 1998 16:07:23 -0700
From: "Daniel B. Austin" <daniela@cnet.com>
Subject: Re: When is an attribute an attribute?
In-reply-to: <Pine.OSF.3.96.980406154414.5362G-100000@library.berkeley.edu>
Sender: owner-xml-dev@ic.ac.uk
X-Sender: daniela@cnet5.cnet.com
To: Roy Tennant <rtennant@library.berkeley.edu>
Cc: xml-dev@ic.ac.uk
Reply-to: "Daniel B. Austin" <daniela@cnet.com>
Message-id: <199804062312.QAA10316@central.cnet.com>

Hi,

All three of your examples below are well-formed. The decision as to whether
properties of document objects are to be encoded as attributes or as element
content is up to you; there is no clear-cut answer. (You might note that your
example #2 below does not provide as much information as examples #1 & 3,
because it does not specify that the <BOOK> element's content is a title...it
could be anything.)

Here are some considerations that may inform your decisions regarding
attributes and elements:

a) Does the document property relate to the structure of the document? If yes,
   then an element would provide better use.
b) Are your target documents going to be large in terms of file size? If so,
   an attribute might be a better choice.
c) Is the processor/display device you are using better or faster at parsing
   one or the other?
d) Does the property apply to many elements in your document? I.e., in
   book.xml the title might only show up once, or once at the bottom of each
   page.
e) Does the author find it easier to add an element or an attribute, or does
   it matter?

In general I would make the case that properties that are used often and are
non-structural in nature would be best defined as attributes and others as
elements.

Regards,

D-
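
Consideration (b) is easy to measure for a given record. The sketch below is an illustration added to the thread, not part of Daniel's post: it compares the UTF-8 size of the attribute form and the element form of the same book, written without any indentation so that only the markup itself is counted. The variable names are made up for the example.

```python
ATTRIBUTE_FORM = '<BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>'
ELEMENT_FORM = (
    "<BOOK>"
    "<TITLE>The Call of the Wild</TITLE>"
    "<AUTHOR>London, Jack</AUTHOR>"
    "</BOOK>"
)

# Each property costs a name plus =" " as an attribute, but a start tag
# and an end tag as an element, so the element form repeats every name.
attribute_bytes = len(ATTRIBUTE_FORM.encode("utf-8"))
element_bytes = len(ELEMENT_FORM.encode("utf-8"))
print(attribute_bytes, element_bytes, element_bytes - attribute_bytes)
```

The saving grows with the length of the property names and the number of records, and matters less once the files are compressed in transit.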

At 03:51 PM 4/6/98 -0700, you wrote:
>I've been trying to figure this out for a while with no success. It seems
>to me that there are several quite different ways one can encode
>information in XML. Are all of the following correct? When and why would
>you choose one over another? Does it matter? Thank you for your indulgence
>as I puzzle out what must surely be readily apparent to most of you.
>
>Example 1:
>---------
>
><BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>
>
>Example 2:
>---------
>
><BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>
>
>Example 3:
>---------
>
><BOOK>
>   <TITLE>The Call of the Wild</TITLE>
>   <AUTHOR>London, Jack</AUTHOR>
></BOOK>
>
>Thanks,
>Roy Tennant

Daniel Austin 	daniela@cnet.com
Director of Development, Corporate Creative Services 
CNET: The Computer Network (415) 395-7800 x1438
"To change the old into the new, and the shapes of things to come..."

From owner-xml-dev@ic.ac.uk Mon Apr  6 18:24:59 1998
Date: Mon, 6 Apr 1998 19:12:09 -0400
To: Roy Tennant <rtennant@library.berkeley.edu>
From: Murray Maloney <murray@muzmo.com>
Subject: Re: When is an attribute an attribute?
Cc: xml-dev@ic.ac.uk
In-Reply-To: <Pine.OSF.3.96.980406154414.5362G-100000@library.berkeley.edu>

There is no real "right" way to encode something 
like your example. Any of the examples that you
have offered is just as likely as the other, and
an application that works with any of them is just
as likely to succeed in meeting its objectives.

However, if you wanted to distinguish between a 
family and given name, and maybe add an honorific
or an accreditation, you might want to use an element
with subelements for the author. Using a comma in
the name requires a second-level parse. An advantage
of using nested subelements is that you can avoid 
a second level parse.

Otherwise, as I said, there is no "right" answer.
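
The "second-level parse" can be made concrete with a short sketch, added here for illustration and not part of Murray's post. It assumes Python's standard-library xml.etree.ElementTree; the subelement names FAMILY and GIVEN are hypothetical.

```python
import xml.etree.ElementTree as ET

FLAT = '<BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>'
NESTED = (
    "<BOOK>"
    "<TITLE>The Call of the Wild</TITLE>"
    "<AUTHOR><FAMILY>London</FAMILY><GIVEN>Jack</GIVEN></AUTHOR>"
    "</BOOK>"
)

# Flat string: the application runs its own second-level parse and must
# agree with the author of the data on the "family, given" convention.
family_flat, given_flat = [
    part.strip() for part in ET.fromstring(FLAT).get("AUTHOR").split(",", 1)
]

# Nested subelements: the XML parser has already separated the parts.
author = ET.fromstring(NESTED).find("AUTHOR")
family_nested = author.findtext("FAMILY")
given_nested = author.findtext("GIVEN")

print(family_flat, given_flat)
print(family_nested, given_nested)
```

The flat form breaks as soon as a name contains a comma of its own or an honorific is added, which is the case for subelements that Murray describes.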

At 06:51 PM 4/6/98 -0400, Roy Tennant wrote:
>I've been trying to figure this out for a while with no success. It seems
>to me that there are several quite different ways one can encode
>information in XML. Are all of the following correct? When and why would
>you choose one over another? Does it matter? Thank you for your indulgence
>as I puzzle out what must surely be readily apparent to most of you.
>
>Example 1:
>---------
>
><BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>
>
>Example 2:
>---------
>
><BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>
>
>Example 3:
>---------
>
><BOOK>
>   <TITLE>The Call of the Wild</TITLE>
>   <AUTHOR>London, Jack</AUTHOR>
></BOOK>
>
>Thanks,
>Roy Tennant

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Murray Maloney                      Email: murray@muzmo.com
Technical Director                  Phone: (905) 509-9120
Veo Systems 				Fax:   (905) 509-8637
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
		Make a Tax-Deductible Donation  	
		Yuri Rubinsky Insight Foundation
		http://www.yuri.org/donate.html

From owner-xml-dev@ic.ac.uk Tue Apr  7 09:28:42 1998
Message-ID: <352A367E.B4FCEF1C@infinet.com>
Date: Tue, 07 Apr 1998 10:21:50 -0400
From: Tyler Baker <tyler@infinet.com>
X-Mailer: Mozilla 4.04 [en] (WinNT; U)
MIME-Version: 1.0
To: xml-dev@ic.ac.uk
Subject: Re: When is an attribute an attribute?
References: <3.0.1.32.19980406191209.0070f5b0@pop.uunet.ca>

I asked the same question about 4 months ago concerning using attributes vs.
elements on this list and got some interesting answers.  In that time I have found
that for modeling objects a few principles come to mind.

If you are modeling an object which will never change at all (like a Rectangle)
then you would be best to do something like this:

<RECTANGLE x="0" y="0" width="0" height="0"/>

The rationale for this approach over using elements is that in most XML processors
you will get all of the attribute values at once that are necessary for generally
immutable objects like Rectangles.  In a particular application of mine I found
that I would call setBounds() in java.awt.Component 4 times using the element
approach vs. only once with the attributes approach.

If you are representing something whose type may evolve over time like a user
profile in a database, then the element approach I feel works better in the long
run...

Tyler
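
A minimal sketch of the attribute approach for an immutable object, added here for illustration and not Tyler's own code (his application was in Java). It assumes Python's standard-library xml.etree.ElementTree and dataclasses; the Rectangle class, the helper name, and the sample dimensions are made up. Because all four values arrive together with the start tag, the object can be built in a single step.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Rectangle:
    """An immutable value object: all four fields are set at construction."""

    x: int
    y: int
    width: int
    height: int


def rectangle_from_attributes(xml_text):
    """Build the object in one step from the attribute set of the start tag."""
    attrs = ET.fromstring(xml_text).attrib
    return Rectangle(*(int(attrs[name]) for name in ("x", "y", "width", "height")))


rect = rectangle_from_attributes('<RECTANGLE x="0" y="0" width="20" height="10"/>')
print(rect)
```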

From owner-xml-dev@ic.ac.uk Tue Apr  7 12:41:56 1998
Date: Tue, 07 Apr 1998 10:35:38 -0700
From: Rich Koehler <RKoehler@able-inc.com>
Subject: RE: When is an attribute an attribute?
Sender: owner-xml-dev@ic.ac.uk
To: xml-dev@ic.ac.uk
Reply-to: Rich Koehler <RKoehler@able-inc.com>
Message-id: <30511AC98761D111976F0060082DE6E90A3265@cascade.able-inc.com>
MIME-version: 1.0
X-Mailer: Internet Mail Service (5.0.1458.49)

I've become fond of the method that Tim Bray used to distinguish between
elements and attributes in his discussion of MCF
(http://www.textuality.com/mcf/MCF-tutorial.html).  He writes, "...when
the property has a simple value like a string, we put that in the
content of the element; when the property's value is another object, we
put a pointer to it in an attribute value and leave the element
describing the property empty."

This allows the creation of a directed linked graph, where objects refer
to other objects, and the links can have attributes of their own.  In
your case it might look like this:

<BOOK ID="The Call of the Wild">
   <AUTHOR UNIT="Jack London"/>
</BOOK>

Which allows you to define something like this:

<PERSON ID="Jack London">
   <FIRST>Jack</FIRST>
   <LAST>London</LAST>
   <PHONE>(206) 555-3423</PHONE>
   <WORK UNIT="The Call of the Wild"/>
   <WORK UNIT="Love those Wolves"/>
</PERSON>

Where the ID attributes are unique tokens for each object, and the UNIT
attributes point to other objects.  In this case we see that Jack London
is a PERSON, who in the context of the book "The Call of the Wild" is an
AUTHOR.  Jack may appear in other objects, in other contexts, like:

<STORE ID="Wal-Mart">
   <CUSTOMER UNIT="Jack London"/>
....

I think RDF will eventually address this.  Anyway, that's my personal
preference.

Rich
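
The pointer scheme can be followed by a program with a simple index. The sketch below is an illustration added to the thread, not part of Rich's post; it assumes Python's standard-library xml.etree.ElementTree, and the wrapper element CATALOG and the helper links_from are hypothetical. No DTD is involved, so ID and UNIT are treated as plain strings (values containing spaces would not be valid for an attribute declared with type ID).

```python
import xml.etree.ElementTree as ET

# A wrapper element (hypothetical name CATALOG) holds the objects so that
# they form a single well-formed document.
CATALOG = """
<CATALOG>
  <BOOK ID="The Call of the Wild">
    <AUTHOR UNIT="Jack London"/>
  </BOOK>
  <PERSON ID="Jack London">
    <FIRST>Jack</FIRST>
    <LAST>London</LAST>
    <WORK UNIT="The Call of the Wild"/>
  </PERSON>
</CATALOG>
""".strip()

root = ET.fromstring(CATALOG)

# Index every object by its ID so that UNIT pointers can be followed.
objects_by_id = {elem.get("ID"): elem for elem in root.iter() if elem.get("ID")}


def links_from(obj):
    """Return (role, target element) pairs for the pointers inside obj."""
    return [
        (child.tag, objects_by_id.get(child.get("UNIT")))
        for child in obj
        if child.get("UNIT") is not None
    ]


for role, target in links_from(objects_by_id["The Call of the Wild"]):
    print(role, "->", target.tag, target.findtext("LAST"))
```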

From owner-xml-dev@ic.ac.uk Tue Apr  7 23:48:44 1998
From: "Rick Jelliffe" <ricko@allette.com.au>
To: <xml-dev@ic.ac.uk>
Subject: Re: When is an attribute an attribute?
Date: Wed, 8 Apr 1998 14:41:36 +1000

-----Original Message-----
From: len bullard <cbullard@hiwaay.net>

> The funniest thing I've seen lately is a statement
> on the Microsoft XML site that XML gets rid of
> committees who design DTDs in favor of a
> more "organic" approach.  Lots of luck. ;-)

:-)

My book <plug type="shameless">The XML & SGML Cookbook</plug>, due out next
month, looks at this issue. In particular it gives some basic patterns and
considerations that can be used for "rapid prototyping" a document type.

Most document types require some rethought after deployment. Very few people
actually have much of an idea of what their data contains. Anyway, when you
start actually using markup systems you will want to make maximal use of the
particular tools you have bought. So even if a DTD was created without any
consideration of the software to be used, there is often good reason to
enhance the DTD to make best use of the particular capabilities of the
applications (and to overcome flaws that turn up).

DTDs made by committees often tend to be rather kitchen-sinkish. But this is
better dealt with by dividing them into separate DTDs (especially for front
and backmatter), which are more manageable, or by introducing "training-wheel"
DTDs which won't scare people off, rather than by saying they are
over-engineered.

Documents and publications are much more complicated than people want to
accept: sometimes the only way is for people to learn by being given a simple
DTD and then having issues in their documents prove to them that a larger DTD
is actually what they require.

"Organic" is an attractive word. Being able to make ad hoc changes to DTDs is
great if you are processing them, or if you have a family of documents which
are similar but not exactly the same type. SGML systems have suffered in the
past because DTD alteration was often a large-scale exercise for gurus. XML is
doing good things in making this more difficult.

But the idea that XML markup declarations are inherently inflexible, while
declaration-less XML allows more "organic" development, is spurious.

One trick SGML people use (this is adapted from Travis and Waldt's book) is to
make explicit element types for unaccounted-for elements. This gives you
somewhere to park important data in the absence of DTD elements. This kind of
flexibility is available in any DTD: you don't need to abandon XML markup
declarations to get it. For example, the following declaration is a good basis
for such an element type:

<!ELEMENT new ANY>
<!-- "class" is the name the user might suggest for this element type
     if in a DTD.  "HTMLform" is the nearest HTML element type, to
     help rendering. -->
<!ATTLIST new
    id        ID     #IMPLIED
    class     CDATA  #REQUIRED
    HTMLform  CDATA  #IMPLIED
    comment   CDATA  #IMPLIED>
...
<new class="dog" HTMLform="em">Rover</new>

(Check out the HTML span and div elements too.)

Rick Jelliffe
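
One way the parked data might be used later, sketched as an illustration and not taken from Rick's post: tally the "class" values on <new> elements to see which suggested element types occur often enough to earn a place in the next revision of the DTD. The sketch assumes Python's standard-library xml.etree.ElementTree; the surrounding element names doc and p and the sample content are made up.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# A hypothetical document body; only the <new> element comes from the
# declaration above, the element names "doc" and "p" are assumptions.
DOCUMENT = """
<doc>
  <p>My <new class="dog" HTMLform="em">Rover</new> chased
     <new class="dog" HTMLform="em">Fido</new> past the
     <new class="shop" HTMLform="span">bakery</new>.</p>
</doc>
""".strip()

root = ET.fromstring(DOCUMENT)

# Count how often each suggested element type name occurs.
suggested_types = Counter(elem.get("class") for elem in root.iter("new"))
for name, count in suggested_types.most_common():
    print(name, count)
```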

From owner-xml-dev@ic.ac.uk Tue Apr  7 10:44:21 1998
Date: Tue, 07 Apr 1998 11:24:39 -0400
From: Murray Maloney <murray@muzmo.com>
Subject: Re: When is an attribute an attribute?

At 10:21 AM 4/7/98 -0400, Tyler Baker wrote:
>If you are modeling an object which will never change at all 
>(like a Rectangle) then you would be best to do something like this:
>
><RECTANGLE x="0" y="0" width="0" height="0"/>
>
This is a very good example of when attributes are optimal.
In this case, the attributes are object properties,
rather than children of the object.

Even so, a RECTANGLE element could use containment
to better advantage for cases where there are many, 
possibly disjoint name/value pairs or collections.

<RECTANGLE>
	<ORIGIN><X>0</X><Y>0</Y></ORIGIN>
	<SIZE><DX>7in</DX><DY>9in</DY></SIZE>
	<LABEL>My Pretty Rectangle</LABEL>
	<IMAGE>floral.jpeg</IMAGE>
	<BACKGROUND>gold</BACKGROUND>
	<FOREGROUND>blue</FOREGROUND>
	<FORM><SUBMIT/></FORM>
	<ETC>...</ETC>
</RECTANGLE>

From owner-xml-dev@ic.ac.uk Tue Apr  7 18:56:26 1998
Message-ID: <352AB962.7CF4@hiwaay.net>
Date: Tue, 07 Apr 1998 18:40:18 -0500
From: len bullard <cbullard@hiwaay.net>
Organization: Blind Dillo
X-Mailer: Mozilla 3.01 (Win95; I)
MIME-Version: 1.0
To: Rich Koehler <RKoehler@able-inc.com>
CC: xml-dev@ic.ac.uk
Subject: Re: When is an attribute an attribute?
References: <30511AC98761D111976F0060082DE6E90A3265@cascade.able-inc.com>

Rich Koehler wrote:
> 
> I've become fond of the method that Tim Bray used to distinguish between
> elements and attributes in his discussion of MCF
> (http://www.textuality.com/mcf/MCF-tutorial.html).  He writes, "...when
> the property has a simple value like a string, we put that in the
> content of the element; when the property's value is another object, we
> put a pointer to it in an attribute value and leave the element
> describing the property empty."

Neat!  As others have pointed out, much depends not 
on the abstraction of the modeling technique, but on 
the method to be applied to the markup (ie, the application).  
If I want a tracking system for the person, 
the pointer techniques are good.  If 
I want to render a title or find all titles, then the 
explicit element declaration is good.

BTW:  All of this is why DTDs have worked well 
for so many years.  They are a contract between 
implementors and systems.

The funniest thing I've seen lately is a statement 
on the Microsoft XML site that XML gets rid of 
committees who design DTDs in favor of a 
more "organic" approach.  Lots of luck. ;-)

len

Date: Mon, 24 Aug 1998 22:26:53 -0500
From: Paul Prescod <papresco@technologist.com>
To: xml-dev@ic.ac.uk

Samuel R. Blackburn wrote:
> 
>  My rule of thumb is attributes contain data that is unique to that
>  particular object.

Do I understand you correctly if I paraphrase your post as "attributes are
only useful for unique identifiers?" That's as good a "rule of thumb" as
any other, I guess, but it is a much more radical one than any I have
heard before. It seems just as arbitrary as any other rule of thumb I've
heard.

 Paul Prescod  - http://itrc.uwaterloo.ca/~papresco

Date:    Mon, 24 Aug 1998 18:05:03 -0500 (CDT)
From:    Robin Cover <robin@acadcomp.sil.org>
To:      xml-dev@ic.ac.uk
Subject: Re: Newbie Q

Frank Blau <fblau@nina.snohomish.wa.gov> wrote:

> Is there a formal rule for the use of Attributes vs Elements?

Some hints are recorded in the documents with these URLs:

   http://xml.coverpages.org/elementAttr9804.html
   http://xml.coverpages.org/elementsAndAttrs.html

These documents are referenced from the dedicated section:

 http://xml.coverpages.org/topics.html#elementsAndAttrs
   Elements versus attributes - How do I decide?

> that Attributes are
> best used to communicate information to the browser/application, and
> Elements are best used for actual Data. Is this a valid assumption?

I do not think this assumption has any basis whatever in the XML 1.0
specification, and it certainly has no basis in the parent standard,
ISO 8879.  There is some basis in HTML browser behavior, but that is
(in my opinion) a Bad Thing, and not to be perpetuated as a standard
agreement.  It is dangerous for all the same reasons as in "HTML":
the industry got stuck with hard-coded application processing semantics.
XML encoding itself should be used with the semantic opacity that the
specification implies, in my judgment; styles and other (separate)
processing specifications should determine how/whether certain (character)
data in an XML document is acted upon (displayed, suppressed, etc.).

The distinction between "information [for the browser/application]"
and "actual Data" is specious, at least from the perspective of
XML itself.  Why?  Because what you think is your "metadata" (today)
will become your data tomorrow; your metadata is someone else's data
even today.  "Metadata" does not map conceptually to attribute - for
many reasons, the most obvious of which is that some metadata is highly
complex, and cannot be structured using SGML/XML attributes (flat
strings).  And, of course, we know that XML is now being designed
for use in all kinds of ('Internet') applications which do not involve
any "browser" or human viewing of the data in its tagged format;
DTDs are generated from database schemas, and tagged data is passed
between applications (then discarded) without being seen by humans.

My 2 cents.  The documents referenced above contain excellent
summaries of considerations that might be taken into account when
deciding how to model your data in a markup representation.

-robin

From:   "James Tauber" <jtauber@jtauber.com>
To:     "Frank Blau" <fblau@nina.snohomish.wa.gov>, <xml-dev@ic.ac.uk>
Subject: Re: Newbie Q
Date:   Tue, 25 Aug 1998 18:22:37 +0800

>Is there a formal rule for the use of Attributes vs Elements? The
>assumption I am going on (per The XML Primer) is that Attributes are
>best used to communicate information to the browser/application, and
>Elements are best used for actual Data. Is this a valid assumption?
>
>In an EDI transaction, I was going to put the Header and Trailer
>information in attributes, with the actual Detail Segments as
>Elements...
>
>Any thoughts?

Definitely see Robin Cover's page on this [cited by another responder].

It seems to me that the issue of Attributes vs Elements becomes trickier as the
thing you are marking up becomes more data-like and less document-like.

The values of attributes are technically markup rather than content (at least
by my reading of the spec) so the clearer the distinction is between what
should be content and what should be markup, the clearer the attribute vs
element issue is.

This isn't too bad when you are marking up already existing content but it
gets progressively worse as the markup language is used less and less for
'marking up' and more and more for other things.

As my paper at SGML/XML Asia Pacific (or is it XML Asia now?) will discuss,
one is always drawing the line between content and markup on an application
by application basis. Content generally contains (or perhaps *is*) markup
that's not XML. Consider spaces between words. They are a form of (non-XML)
markup. They are also a presentational style. In some applications (such as
corpus linguistics) word boundaries are marked up and it is a stylesheet
issue to display the spaces.
The moral is that even an important distinction like markup vs content vs
presentation depends on the application.

James

From owner-xml-dev@ic.ac.uk Tue Aug 25 12:49:07 1998
From: Dean Roddey <roddey@us.ibm.com>
To: <xml-dev@ic.ac.uk>
Subject: Re: Newbie Q
Date: Tue, 25 Aug 1998 13:47:46 -0400

> Is there a formal rule for the use of Attributes vs Elements? The
> assumption I am going on (per The XML Primer) is that Attributes are
> best used to communicate information to the browser/application, and
> Elements are best used for actual Data. Is this a valid assumption?
> 
> In an EDI transaction, I was going to put the Header and Trailer
> information in attributes, with the actual Detail Segments as
> Elements...
> 
> Any thoughts?

There are also practical issues I guess. If you need to validate the
document with a DTD, there is much more control available to control the
content of subelements. You can say that the parent element's content
model allows this or that, or this, that or the other,
etc...

With attributes, there is less flexibility. Each attribute is either
there or not. There is no way to indicate that if you provide this
one, you can't provide that one, or if these two are present, then
you can't have that one, or if this one is present, then that
one has to also be present, and so on (at least there is no way
that I know of :-)

Beyond that I think it's purely an issue of what works best for what
you want to do. And there is sometimes an issue of readability and
writeability if the documents are ever dealt with by actual humans
<gasp> :-) Attributes with large values are kind of funky
(IMHO) to make very readable, so I'd make any property that could
have large values an element if all other things were equal (which
of course they often aren't.)

Overall, I guess the best rule of thumb is that attributes should hold
stuff that either you need to get to fast without wanting to iterate
the sub elements, or which provide 'control information' as you
indicated, or which need to have ID/IDREF semantics enforced, or
that you might want to provide implicit defaults for, and probably some
others that I can't think of :-)
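
[Editorial illustration, not part of the original post: a minimal sketch
of the "get to it without iterating the sub elements" point, using Roy
Tennant's BOOK examples from the start of this thread. Python's
xml.dom.minidom is assumed here only as a convenient DOM; the helper
names title_from_attribute and title_from_child are hypothetical.]

```python
from xml.dom import minidom

# Roy Tennant's Example 1 (attributes) and Example 3 (child elements).
ATTR_FORM = '<BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>'
ELEM_FORM = (
    "<BOOK>"
    "<TITLE>The Call of the Wild</TITLE>"
    "<AUTHOR>London, Jack</AUTHOR>"
    "</BOOK>"
)


def title_from_attribute(xml_text):
    """Read the title directly off the BOOK element node."""
    book = minidom.parseString(xml_text).documentElement
    return book.getAttribute("TITLE")


def title_from_child(xml_text):
    """Walk BOOK's children to find TITLE, then gather its text nodes."""
    book = minidom.parseString(xml_text).documentElement
    for node in book.childNodes:
        if node.nodeType == node.ELEMENT_NODE and node.tagName == "TITLE":
            return "".join(
                child.data
                for child in node.childNodes
                if child.nodeType == child.TEXT_NODE
            )
    return None


print(title_from_attribute(ATTR_FORM))
print(title_from_child(ELEM_FORM))
```

[Both calls return the same string; the difference is only in how much
tree walking the caller has to write.]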

Dean Roddey

From     owner-xml-dev@ic.ac.uk Tue Aug 25 23:22:02 1998
Date:    Wed, 26 Aug 1998 10:12:51 +0700
From:    James Clark <jjc@jclark.com>
To:      xml-dev@ic.ac.uk
Subject: Re: Newbie Q

Dean Roddey wrote:

>>I do not think this assumption has any basis whatever in the XML 1.0
>>specification, and it certainly has no basis in the parent standard,
>>ISO 8879.  There is some basis in HTML browser behavior, but that is
>>(in my opinion) a Bad Thing, and not to be perpetuated as a standard
>>agreement.  It is dangerous for all the same reasons as in "HTML":
>>the industry got stuck with hard-coded application processing semantics.
>>XML encoding itself should be used with the semantic opacity that the
>>specification implies, in my judgment; styles and other (separate)
>>processing specifications should determine how/whether certain
>>(character) data in an XML document is acted upon (displayed, suppressed,
>> etc.).

> So does anyone have any opinions on whether something like XSL will
> be more convenient to deal with attributes than elements?  Are the
> semantics of XSL such that one would be more easily and compactly
> notated than the other?

We've been trying in XSL to make both equally convenient.

James [Clark]

Date: Tue, 20 Apr 1999 13:33:26 -0700
From: Andrew Layman <andrewl@microsoft.com>
To: "xml-dev Mailing List (E-mail)" <xml-dev@ic.ac.uk>
Subject: Use of Tags

Regarding use of elements versus attributes, Andy Dent wrote "The path that
Microsoft seem to be following with XML-Data is to use elements ... My
single biggest problem with this is the reuse of elements within other
elements - you can't define an element with local 'scope'. What happens when
Amount is an i2 in one context and a float in another?"

At http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html you'll find
a description of a style of using XML in which attributes play a major role,
specifically to avoid the problem you mention with local scope.  

This particular style is designed for representing graphs of typed objects
in named relations using currently-available tools and technology. If
Microsoft's advocacy of this seems less than dogmatic, it is because other
contexts may reasonably call for other styles.

Best wishes,

Andrew Layman
Architect
Microsoft

Date: Mon, 26 Jun 2000 12:36:57 -0400
From: Kevin Williams <Kevin.Williams@ULTRAPRISE.COM>
Reply-To: General discussion of Extensible Markup Language
     <XML-L@LISTSERV.HEANET.IE>
To: XML-L@LISTSERV.HEANET.IE
Subject: Re: ELEMENTS vs ATTRIBUTES, which is prefered and why is prefered ?

> I may not be able to explain this well but I also saw no use
> for attributes until I had to script against the DOM.  In your example, if
> you find the element node <note> the date is available without searching for child
> elements.  Because of this, attributes make my searches so
> much easier.

This is probably the single biggest theological issue in the design of XML
structures. Early on, I reached the same conclusion Lynda has - namely, that
accessing attributes with the DOM or SAX is a heck of a lot easier than
accessing text elements. Performance-wise, it doesn't seem to make that much
of a difference - at least on the MS and IBM parsers - but simpler code is
IMO a good thing. There are some other advantages to using attributes for
data points that make them appealing to me:

- Attributes are unordered. If I'm building an XML document, and I have a
serial stream containing my data points that I'm converting to XML, it's
nice to be able to add the points in an ad-hoc way rather than having to add
them to the document in the order specified by the DTD. There's no right
answer to a question like, "Does name come before or after SSN?"

- Using attributes for data points disambiguates structure and information.
Code is much cleaner when using attributes for data points - attributes
always contain data points, and elements always contain structure. Contrast
this with the use of elements for data points, when element handling
routines must check to see what the children of an element consist of to
determine whether an element contains a data point or further structure.

- When extracting information from an XML document to store to an RDBMS, or
vice-versa, using attributes for data points forms a very clean mapping
between the systems - attributes always correspond to columns, while
elements always correspond to tables. This makes code to import and export
data between RDBMS systems and XML documents easy to write and very
flexible.

- Using attributes for data points results in a drastically smaller document
representing the same information - as much as 30% to 40% smaller, depending
on the mix of structure and information in your document.
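
[Editorial illustration, not part of the original post: a rough sketch of
the "attributes correspond to columns, elements correspond to tables"
mapping and of the size difference described above. The customer
documents, the sample values, and the load_rows helper are all
hypothetical; SQLite is assumed only because it ships with Python. The
sketch assumes every row element carries at least one attribute and that
the input is trusted, since table and column names are taken straight
from the markup.]

```python
import sqlite3
import xml.etree.ElementTree as ET

# The same two records, once with attributes and once with child elements.
# Note that the attribute order differs between the two customers.
ATTR_DOC = (
    "<customers>"
    '<customer name="Ann Example" ssn="000-00-0001"/>'
    '<customer ssn="000-00-0002" name="Bob Example"/>'
    "</customers>"
)
ELEM_DOC = (
    "<customers>"
    "<customer><name>Ann Example</name><ssn>000-00-0001</ssn></customer>"
    "<customer><name>Bob Example</name><ssn>000-00-0002</ssn></customer>"
    "</customers>"
)


def load_rows(conn, xml_text):
    """Store each child element as a row in a table named after its tag."""
    for elem in ET.fromstring(xml_text):
        columns = sorted(elem.attrib)  # attributes are unordered in the source
        col_list = ", ".join(f'"{c}"' for c in columns)
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{elem.tag}" ({col_list})')
        marks = ", ".join("?" for _ in columns)
        conn.execute(
            f'INSERT INTO "{elem.tag}" ({col_list}) VALUES ({marks})',
            [elem.attrib[c] for c in columns],
        )


conn = sqlite3.connect(":memory:")
load_rows(conn, ATTR_DOC)
rows = conn.execute("SELECT name, ssn FROM customer ORDER BY ssn").fetchall()
print(rows)

# Compare the serialized sizes of the two encodings of the same data.
print(len(ATTR_DOC), len(ELEM_DOC))
```

[The actual saving depends on the document; the two lengths printed here
describe only this toy example.]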

Note that these comments only apply when the XML structures are used to hold
*data* - for XML being used to mark up text, an element-only model works
much better.

Unfortunately, a lot of others in the industry - bigger wheels than me -
disagree vehemently. For example, MS's BizTalk guidelines specify that all
information in Biztalk-compliant structures should be represented by
text-only elements. But for internal usages, or usages where custom XML is
being developed for a fixed-scope effort, I prefer to use attributes.

- Kevin

Kevin Williams
XML Architect, Ultraprise Corporation
Co-author: _Professional XML_ (Wrox Press)
Co-author: _ASP 3.0 Programmer's Reference_ (Wrox Press)
Co-author: _Professional VB XML_ (Wrox Press)

