Elements versus Attributes
XML-DEV Discussion - 1998 and Following...
In April 1998, several messages were posted to XML-DEV relating to principles that might be used to decide whether an XML encoding should (best) use elements or attributes. Some of these are collected below, together with a few subsequent posts. See also: "When Should I Use Elements, and When Should I Use Attributes?" and the database section "Elements versus attributes - How Do I Decide?"
From owner-xml-dev@ic.ac.uk Mon Apr 6 18:00:54 1998
Date: Mon, 6 Apr 1998 15:51:49 -0700 (PDT)
From: Roy Tennant <rtennant@library.berkeley.edu>
To: xml-dev@ic.ac.uk
Subject: When is an attribute an attribute?

I've been trying to figure this out for a while with no success. It seems to me that there are several quite different ways one can encode information in XML. Are all of the following correct? When and why would you choose one over another? Does it matter? Thank you for your indulgence as I puzzle out what must surely be readily apparent to most of you.

Example 1:
---------

<BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>

Example 2:
---------

<BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>

Example 3:
---------

<BOOK>
  <TITLE>The Call of the Wild</TITLE>
  <AUTHOR>London, Jack</AUTHOR>
</BOOK>

Thanks,
Roy Tennant

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From owner-xml-dev@ic.ac.uk Mon Apr 6 19:47:41 1998
Date: Mon, 06 Apr 1998 19:35:04 -0500
From: len bullard <cbullard@hiwaay.net>
Subject: Re: When is an attribute an attribute?
To: Roy Tennant <rtennant@library.berkeley.edu>
Cc: xml-dev@ic.ac.uk

Roy Tennant wrote:
>
> I've been trying to figure this out for a while with no success.
> It seems to me that there are several quite different ways one can encode
> information in XML. Are all of the following correct?

Yes.

> When and why would you choose one over another? Does it matter? Thank you
> for your indulgence as I puzzle out what must surely be readily apparent to
> most of you.

Ok, a DTD really helps this sort of discussion along, but FWIW:

> Example 1:
> ---------
>
> <BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>

Use empty elements and attributes for tag bags, basically, if the datum has no frequency and order requirements (only occurs once somewhere in the attribute list). NOTE: I haven't looked to see if XML dropped the SGML restriction on repeated values in attlist decls.

> Example 2:
> ---------
>
> <BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>

Use this if you don't care that the string inside the tags is only differentiated by the BOOK, that is, semantically, there is no difference between this and

<BOOK AUTHOR="London, Jack">Love that Wolf!!</BOOK>

or IOW, your application has to know that is a title.

> Example 3:
> ---------
>
> <BOOK>
>   <TITLE>The Call of the Wild</TITLE>
>   <AUTHOR>London, Jack</AUTHOR>
> </BOOK>

Use this when it is important to know there is a title and author (i.e., this BOOK HAS-A TITLE, HAS-A AUTHOR; the string, The CALL of the WILD IS-A TITLE). Given the element type declaration, you can tell which order they should come in, are there multiple authors, are there alternate titles, etc.

The semantic is application dependent. For a linking semantic, you might be counting nodes inside the BOOK. For rendering, you might be assigning the font value based on the context of the book element.

len

From owner-xml-dev@ic.ac.uk Mon Apr 6 19:57:15 1998
From: Jim Amsden <jamsden@us.ibm.com>
To: <xml-dev@ic.ac.uk>
Subject: Re: When is an attribute an attribute?
Date: Mon, 6 Apr 1998 20:50:31 -0400

I think it's best to treat this as an object modeling problem first, and then an XML representation. The distinction between attribute and content element then becomes the distinction between an attribute and a containment relationship with another object. Object attributes are atomic, referentially transparent characteristics of an object that have no identity of their own. Generally this corresponds to primitive data types, but this can be somewhat arbitrary too (e.g., Strings, Date, etc.). Taking a more logical view, an attribute names some characteristic of an object that models part of its internal state, and is not considered an object in its own right. That is, no other objects have relationships to an attribute of an object, but rather to the object itself.

So if the thing you want to capture has internal structure of its own, or can be referenced through a link, or can be contained in more than one element, then it's an element, otherwise it's probably an attribute.

Note that attributes have a number of advantages over content elements:

1. they can have names that indicate the role the value plays in the element. Element contents have content names, but there is no way to say what role the content plays in any particular element that contains it.
2. attributes can have default values.
3. attributes have (minimal) data types
4. attributes take up less space as there is no need for an end tag
5. attributes are easier to access in DOM.

There are also some disadvantages:

1. attributes aren't as convenient for large values, or binary entities.
2. values containing quotes can be a bother.
3. attributes can't contain other elements. This isn't really a disadvantage, but part of what it means to be atomic.
4. white space can't be ignored in an attribute.

My recommendation is to use attributes unless you can't, and certainly use them to avoid mixed data content in elements whenever possible. The idea is to encapsulate as much as you can in an individual object but not too much. Use the principles of data normalization, they work fine here too.

From owner-xml-dev@ic.ac.uk Mon Apr 6 20:21:04 1998
Date: Tue, 07 Apr 1998 11:12:38 +1000
From: Rick Jelliffe <ricko@allette.com.au>
Subject: Re: When is an attribute an attribute?
To: xml-dev@ic.ac.uk

From: Jim Amsden <jamsden@us.ibm.com>
> I think it's best to treat this as an object modeling problem first, and then
> an XML representation

Without going against object modeling or any other view, you should first be aware of any constraints in XML (Len's comment) and in your immediate software. If your editing software makes attributes easy, then use attributes. If your rendering/draft software does not support attributes well, use elements.
You can do a simple translation of your DTD and document to convert from one form to another anyway. A DTD does not need to be set in stone.

A markup language has to reconcile data modeling needs and human factors, with the latter being the most important.

Rick Jelliffe

From owner-xml-dev@ic.ac.uk Mon Apr 6 22:29:52 1998
Date: Mon, 6 Apr 1998 20:20:26 -0700
From: David Megginson <ak117@freenet.carleton.ca>
To: xml-dev@ic.ac.uk
Subject: When is an attribute an attribute?

Roy Tennant writes:

> I've been trying to figure this out for a while with no success. It
> seems to me that there are several quite different ways one can
> encode information in XML. Are all of the following correct? When
> and why would you choose one over another? Does it matter? Thank
> you for your indulgence as I puzzle out what must surely be readily
> apparent to most of you.

It's not self-evident, and everyone has their own strongly-held opinions. Database people are tempted to force everything into attributes, because attributes are (slightly) typed while character data is not.
Generally, though, you need to consider the following:

- attribute values are harder to search for in search engines
- attribute values often don't appear on the screen in editing tools (you have to open a special dialog or popup to see them)
- attribute values can have no substructure
- attribute values can be slightly more awkward to access in processing APIs
- attributes are unordered, so there is no standard way to specify that one attribute's value should precede the other's (there is no guarantee that an API will give you the attributes in the same order that you specified them)

My rule is to use attributes in markup just as I would use footnotes or endnotes in a book -- to provide extra information that is not part of the main content, but that is useful to know about it. By this rule, all of your examples are correct, but under different circumstances.

> Example 1:
> ---------
>
> <BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>

In this case, all that really matters is that there's a book there. An XML document author might see <BOOK> in the main editing window, but get the attribute values in a pop-up only by clicking the mouse. It's not essential to know the book's title or author, and it is unlikely that anyone would want to search for it.

Yes: insurance company list of property to be replaced; customs list of objects declared at border
No: online bookstore; library catalogue

> Example 2:
> ---------
>
> <BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>

In this one, the title matters but the author is just extra information. You'd probably use this for encoding a title inline, where the title will be printed as part of the paragraph (possibly in italics), but the author's name would appear only in a separate index or popup.

  <PARA>I enjoyed the book <BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>.</PARA>

> Example 3:
> ---------
>
> <BOOK>
>   <TITLE>The Call of the Wild</TITLE>
>   <AUTHOR>London, Jack</AUTHOR>
> </BOOK>

In this one, both the title and author are important -- you'd use this for the citation line of a quotation, in a bibliography, at an online bookstore, or in a library catalogue.

I hope this helps.

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From owner-xml-dev@ic.ac.uk Mon Apr 6 18:22:19 1998
Date: Mon, 06 Apr 1998 16:07:23 -0700
From: "Daniel B. Austin" <daniela@cnet.com>
Subject: Re: When is an attribute an attribute?
To: Roy Tennant <rtennant@library.berkeley.edu>
Cc: xml-dev@ic.ac.uk

Hi,

All three of your examples below are well-formed. The decision as to whether properties of document objects are to be encoded as attributes or as element content is up to you; there is no clear cut answer. (You might note that your example #2 below does not provide as much information as examples #1 & 3, because it does not specify that the <BOOK> element's content is a title...it could be anything.)

Here are some considerations that may inform your decisions regarding attributes and elements:

a) does the document property relate to the structure of the document?
   If yes then an element would provide better use.
b) are your target documents going to be large in terms of file size? If so, an attribute might be a better choice.
c) is the processor/display device you are using better or faster at parsing one or the other?
d) does the property apply to many elements in your document? i.e., in book.xml the title might only show up once, or once at the bottom of each page.
e) Does the author find it easier to add an element or an attribute or does it matter?

In general I would make the case that properties that are used often and are non-structural in nature would be best defined as attributes and others as elements.

Regards,

D-

At 03:51 PM 4/6/98 -0700, you wrote:
>I've been trying to figure this out for a while with no success. It seems
>to me that there are several quite different ways one can encode
>information in XML. Are all of the following correct? When and why would
>you choose one over another? Does it matter? Thank you for your indulgence
>as I puzzle out what must surely be readily apparent to most of you.
>
>Example 1:
>---------
>
><BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>
>
>Example 2:
>---------
>
><BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>
>
>Example 3:
>---------
>
><BOOK>
>  <TITLE>The Call of the Wild</TITLE>
>  <AUTHOR>London, Jack</AUTHOR>
></BOOK>
>
>Thanks,
>Roy Tennant

Daniel Austin                   daniela@cnet.com
Director of Development, Corporate Creative Services
CNET: The Computer Network      (415) 395-7800 x1438
"To change the old into the new, and the shapes of things to come..."

From owner-xml-dev@ic.ac.uk Mon Apr 6 18:24:59 1998
Date: Mon, 6 Apr 1998 19:12:09 -0400
To: Roy Tennant <rtennant@library.berkeley.edu>
From: Murray Maloney <murray@muzmo.com>
Subject: Re: When is an attribute an attribute?
Cc: xml-dev@ic.ac.uk

There is no real "right" way to encode something like your example. Any of the examples that you have offered is just as likely as the other, and an application that works with any of them is just as likely to succeed in meeting its objectives.

However, if you wanted to distinguish between a family and given name, and maybe add an honorific or an accreditation, you might want to use an element with subelements for the author. Using a comma in the name requires a second-level parse. An advantage of using nested subelements is that you can avoid a second level parse.

Otherwise, as I said, there is no "right" answer.

At 06:51 PM 4/6/98 -0400, Roy Tennant wrote:
>I've been trying to figure this out for a while with no success. It seems
>to me that there are several quite different ways one can encode
>information in XML. Are all of the following correct? When and why would
>you choose one over another? Does it matter? Thank you for your indulgence
>as I puzzle out what must surely be readily apparent to most of you.
>
>Example 1:
>---------
>
><BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>
>
>Example 2:
>---------
>
><BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>
>
>Example 3:
>---------
>
><BOOK>
>  <TITLE>The Call of the Wild</TITLE>
>  <AUTHOR>London, Jack</AUTHOR>
></BOOK>
>
>Thanks,
>Roy Tennant

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Murray Maloney                Email: murray@muzmo.com
Technical Director            Phone: (905) 509-9120
Veo Systems                   Fax:   (905) 509-8637
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Make a Tax-Deductible Donation
Yuri Rubinsky Insight Foundation
http://www.yuri.org/donate.html

From owner-xml-dev@ic.ac.uk Tue Apr 7 09:28:42 1998
Date: Tue, 07 Apr 1998 10:21:50 -0400
From: Tyler Baker <tyler@infinet.com>
To: xml-dev@ic.ac.uk
Subject: Re: When is an attribute an attribute?

I asked the same question about 4 months ago concerning using attributes vs. elements on this list and got some interesting answers. In that time I have found that for modeling objects a few principles come to mind.
If you are modeling an object which will never change at all (like a Rectangle) then you would be best to do something like this:

<RECTANGLE x="0" y="0" width="0" height="0"/>

The rationale for this approach over using elements is that in most XML processors you will get all of the attribute values at once that are necessary for generally immutable objects like Rectangles. In a particular application of mine I found that I would call setBounds() in java.awt.Component 4 times using the element approach vs. only once with the attributes approach.

If you are representing something whose type may evolve over time like a user profile in a database, then the element approach I feel works better in the long run...

Tyler

From owner-xml-dev@ic.ac.uk Tue Apr 7 12:41:56 1998
Date: Tue, 07 Apr 1998 10:35:38 -0700
From: Rich Koehler <RKoehler@able-inc.com>
Subject: RE: When is an attribute an attribute?
To: xml-dev@ic.ac.uk

I've become fond of the method that Tim Bray used to distinguish between elements and attributes in his discussion of MCF (http://www.textuality.com/mcf/MCF-tutorial.html). He writes, "...when the property has a simple value like a string, we put that in the content of the element; when the property's value is another object, we put a pointer to it in an attribute value and leave the element describing the property empty."
This allows the creation of a directed linked graph, where objects refer to other objects, and the links can have attributes of their own. In your case it might look like this:

<BOOK ID="The Call of the Wild">
  <AUTHOR UNIT="Jack London"/>
</BOOK>

Which allows you to define something like this:

<PERSON ID="Jack London">
  <FIRST>Jack</FIRST>
  <LAST>London</LAST>
  <PHONE>(206) 555-3423</PHONE>
  <WORK UNIT="The Call of the Wild"/>
  <WORK UNIT="Love those Wolves"/>
</PERSON>

Where the ID attributes are unique tokens for each object, and the UNIT attributes point to other objects. In this case we see that Jack London is a PERSON, who in the context of the book "The Call of the Wild" is an AUTHOR. Jack may appear in other objects, in other contexts, like:

<STORE ID="Wal-Mart">
  <CUSTOMER UNIT="Jack London"/>
  ....

I think RDF will eventually address this. Anyway, that's my personal preference.

Rich

From owner-xml-dev@ic.ac.uk Tue Apr 7 23:48:44 1998
From: "Rick Jelliffe" <ricko@allette.com.au>
To: <xml-dev@ic.ac.uk>
Subject: Re: When is an attribute an attribute?
Date: Wed, 8 Apr 1998 14:41:36 +1000

-----Original Message-----
From: len bullard <cbullard@hiwaay.net>

> The funniest thing I've seen lately is a statement
> on the Microsoft XML site that XML gets rid of
> committees who design DTDs in favor of a
> more "organic" approach. Lots of luck. ;-)

:-) My book <plug type="shameless">The XML & SGML Cookbook</plug>, due out next month, looks at this issue. In particular it gives some basic patterns and considerations that can be used for "rapid prototyping" a document type.
Most document types require some rethought after deployment. Very few people actually have much of an idea of what their data contains. Anyway, when you start actually using markup systems you will want to make maximal use of the particular tools you have bought. So even if a DTD was created without any consideration of the software to be used, there is often good reason to enhance the DTD to make best use of the particular capabilities of the applications (and to overcome flaws that turn up).

DTDs made by committees often tend to be rather kitchen-sinkish. But this is better dealt with by dividing them into separate DTDs (especially for front and backmatter), which are more manageable, or by introducing "training-wheel" DTDs which won't scare people off, rather than by saying they are over-engineered. Documents and publications are much more complicated than people want to accept: sometimes the only way is for people to learn by being given a simple DTD and then having issues in their documents prove to them that a larger DTD is actually what they require.

"Organic" is an attractive word. Being able to make ad hoc changes to DTDs is great if you are processing them, or if you have a family of documents which are similar but not exactly the same type. SGML systems have suffered in the past because DTD alteration was often a large-scale exercise for gurus. XML is doing good things in making this more difficult. But the idea that XML markup declarations are inherently inflexible, while declaration-less XML allows more "organic" development is spurious.

One trick SGML people use (this is adapted from Travis and Waldt's book) is to make explicit element types for unaccounted-for elements. This gives you somewhere to park important data in the absence of DTD elements. This kind of flexibility is available in any DTD: you don't need to abandon XML markup declarations to get it.
For example, the following declaration is a good basis for such an element type:

<!ELEMENT new ANY >
<!-- "class" is the name the user might suggest for this
     element type if in a DTD. "HTMLform" is the nearest
     HTML element type, to help rendering. -->
<!ATTLIST new
    id       ID    #IMPLIED
    class    CDATA #REQUIRED
    HTMLform CDATA #IMPLIED
    comment  CDATA #IMPLIED>
...
<new class="dog" HTMLform="em">Rover</new>

(Check out the HTML span and div elements too.)

Rick Jelliffe

From owner-xml-dev@ic.ac.uk Tue Apr 7 10:44:21 1998
Date: Tue, 07 Apr 1998 11:24:39 -0400
From: Murray Maloney <murray@muzmo.com>
Subject: Re: When is an attribute an attribute?

At 10:21 AM 4/7/98 -0400, Tyler Baker wrote:
>If you are modeling an object which will never change at all
>(like a Rectangle) then you would be best to do something like this:
>
><RECTANGLE x="0" y="0" width="0" height="0"/>
>

This is a very good example of when attributes are optimal. In this case, the attributes are object properties, rather than children of the object.

Even so, a RECTANGLE element could use containment to better advantage for cases where there are many, possibly disjoint name/value pairs or collections.

<RECTANGLE>
  <ORIGIN><X>0</X><Y>0</Y></ORIGIN>
  <SIZE><DX>7in</DX><DY>9in</DY></SIZE>
  <LABEL>My Pretty Rectangle</LABEL>
  <IMAGE>floral.jpeg</IMAGE>
  <BACKGROUND>gold</BACKGROUND>
  <FOREGROUND>blue</FOREGROUND>
  <FORM><SUBMIT/></FORM>
  <ETC>...</ETC>
</RECTANGLE>

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Murray Maloney                Email: murray@muzmo.com
Technical Director            Phone: (905) 509-9120
Veo Systems                   Fax:   (905) 509-8637
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

From owner-xml-dev@ic.ac.uk Tue Apr 7 18:56:26 1998
Date: Tue, 07 Apr 1998 18:40:18 -0500
From: len bullard <cbullard@hiwaay.net>
To: Rich Koehler <RKoehler@able-inc.com>
CC: xml-dev@ic.ac.uk
Subject: Re: When is an attribute an attribute?

Rich Koehler wrote:
>
> I've become fond of the method that Tim Bray used to distinguish between
> elements and attributes in his discussion of MCF
> (http://www.textuality.com/mcf/MCF-tutorial.html). He writes, "...when
> the property has a simple value like a string, we put that in the
> content of the element; when the property's value is another object, we
> put a pointer to it in an attribute value and leave the element
> describing the property empty."

Neat!
As others have pointed out, much depends not on the abstraction of the modeling technique, but on the method to be applied to the markup (ie, the application). If I want a tracking system for the person, the pointer techniques are good. If I want to render a title or find all titles, then the explicit element declaration is good.

BTW: All of this is why DTDs have worked well for so many years. They are a contract between implementors and systems. The funniest thing I've seen lately is a statement on the Microsoft XML site that XML gets rid of committees who design DTDs in favor of a more "organic" approach. Lots of luck. ;-)

len
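
As a minimal sketch of the three encodings debated in this thread, the Python below reads the title and author out of each form and shows the "simple translation" from attributes to child elements that Rick Jelliffe mentions. It assumes Python's standard xml.etree.ElementTree module, writes Example 1 with a well-formed "/>", and uses a hypothetical helper name (attributes_to_elements); none of this code comes from the original posters.

```python
import xml.etree.ElementTree as ET

# Roy Tennant's three encodings (Example 1 written with a well-formed "/>").
EXAMPLES = {
    "attributes": '<BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>',
    "mixed": '<BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>',
    "elements": "<BOOK><TITLE>The Call of the Wild</TITLE>"
                "<AUTHOR>London, Jack</AUTHOR></BOOK>",
}


def title_and_author(book):
    """Return (title, author) from any of the three encodings."""
    # Attribute form: the value is looked up by name on the element itself.
    title = book.get("TITLE")
    author = book.get("AUTHOR")
    # Element form: the value is the text of a named child element.
    if title is None:
        title = book.findtext("TITLE")
    if author is None:
        author = book.findtext("AUTHOR")
    # Mixed form: the title is unlabeled character data, so the
    # application has to "know" that BOOK content is a title.
    if title is None:
        title = (book.text or "").strip()
    return title, author


def attributes_to_elements(book):
    """Hypothetical helper: rewrite every attribute as a child element."""
    converted = ET.Element(book.tag)
    converted.text = book.text
    for name, value in book.attrib.items():
        ET.SubElement(converted, name).text = value
    converted.extend(list(book))
    return converted


for label, source in EXAMPLES.items():
    print(label, title_and_author(ET.fromstring(source)))
```

The fallback in title_and_author for the mixed form is the point len bullard makes about Example 2: nothing in the markup says the character data is a title, so that knowledge lives in the application.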

Date: Mon, 24 Aug 1998 22:26:53 -0500
From: Paul Prescod <papresco@technologist.com>
To: xml-dev@ic.ac.uk

Samuel R. Blackburn wrote:
>
> My rule of thumb is attributes contain data that is unique to that
> particular object.

Do I understand you correctly if I paraphrase your post as "attributes are only useful for unique identifiers?" That's as good a "rule of thumb" as any other, I guess, but it is a much more radical one than any I have heard before. It seems just as arbitrary as any other rule of thumb I've heard.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

Date: Mon, 24 Aug 1998 18:05:03 -0500 (CDT)
From: Robin Cover <robin@acadcomp.sil.org>
To: xml-dev@ic.ac.uk
Subject: Re: Newbie Q

Frank Blau <fblau@nina.snohomish.wa.gov> wrote:

> Is there a formal rule for the use of Attributes vs Elements?

Some hints are recorded in the documents with these URLs:

  http://xml.coverpages.org/elementAttr9804.html
  http://xml.coverpages.org/elementsAndAttrs.html

These documents are referenced from the dedicated section:

  http://xml.coverpages.org/topics.html#elementsAndAttrs
  Elements versus attributes - How do I decide?

> that Attributes are
> best used to communicate information to the browser/application, and
> Elements are best used for actual Data. Is this a valid assumption?

I do not think this assumption has any basis whatever in the XML 1.0 specification, and it certainly has no basis in the parent standard, ISO 8879. There is some basis in HTML browser behavior, but that is (in my opinion) a Bad Thing, and not to be perpetuated as a standard agreement. It is dangerous for all the same reasons as in "HTML": the industry got stuck with hard-coded application processing semantics. XML encoding itself should be used with the semantic opacity that the specification implies, in my judgment; styles and other (separate) processing specifications should determine how/whether certain (character) data in an XML document is acted upon (displayed, suppressed, etc.).

The distinction between "information [for the browser/application]" and "actual Data" is specious, at least from the perspective of XML itself. Why? because what you think is your "metadata" (today) will become your data tomorrow; your metadata is someone else's data even today. "Metadata" does not map conceptually to attribute - for many reasons, the most obvious of which is that some metadata is highly complex, and cannot be structured using SGML/XML attributes (flat strings).
And, of course, we know that XML is now being designed for use in all kinds of ('Internet') applications which do not involve any "browser" or human viewing of the data in its tagged format; DTDs are generated from database schemas, and tagged data is passed between applications (then discarded) without being seen by humans.

My 2 cents. The documents referenced above contain excellent summaries of considerations that might be taken into account when deciding how to model your data in a markup representation.

-robin
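
The remark that attributes are flat strings can be illustrated with a small sketch, again assuming Python's standard xml.etree.ElementTree. It revisits the AUTHOR="London, Jack" value from the April thread, where Murray Maloney noted that a comma inside a name forces a second-level parse. FAMILY and GIVEN are hypothetical element names chosen for illustration, and the comma convention is an assumption about the data.

```python
import xml.etree.ElementTree as ET

# Attribute form: the author's name is one flat string.
flat = ET.fromstring('<BOOK AUTHOR="London, Jack"/>')

# Element form: the same information with explicit substructure.
# FAMILY and GIVEN are hypothetical element names used for illustration.
nested = ET.fromstring(
    "<BOOK><AUTHOR><FAMILY>London</FAMILY><GIVEN>Jack</GIVEN></AUTHOR></BOOK>"
)


def author_from_attribute(book):
    """Second-level parse: split the flat string on a comma convention."""
    family, _, given = book.get("AUTHOR").partition(",")
    return family.strip(), given.strip()


def author_from_elements(book):
    """No second-level parse: the XML parser has already done the work."""
    author = book.find("AUTHOR")
    return author.findtext("FAMILY"), author.findtext("GIVEN")


print(author_from_attribute(flat))
print(author_from_elements(nested))
```

Both calls yield the same pair of names, but only the element form makes the structure visible to a generic XML tool; the attribute form depends on every consumer agreeing on the comma convention.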
From: "James Tauber" <jtauber@jtauber.com>
To: "Frank Blau" <fblau@nina.snohomish.wa.gov>, <xml-dev@ic.ac.uk>
Subject: Re: Newbie Q
Date: Tue, 25 Aug 1998 18:22:37 +0800

>Is there a formal rule for the use of Attributes vs Elements? The
>assumption I am going on (per The XML Primer) is that Attributes are
>best used to communicate information to the browser/application, and
>Elements are best used for actual Data. Is this a valid assumption?
>
>In an EDI transaction, I was going to put the Header and Trailer
>information in attributes, with the actual Detail Segments as
>Elements...
>
>Any thoughts?

Definitely see Robin Cover's page on this [cited by another responder].

It seems to me that the issue of Attributes vs Elements becomes trickier as the thing you are marking up becomes more data-like and less document-like. The values of attributes are technically markup rather than content (at least by my reading of the spec), so the clearer the distinction is between what should be content and what should be markup, the clearer the attribute vs element issue is. This isn't too bad when you are marking up already existing content, but it gets progressively worse as the markup language is used less and less for 'marking up' and more and more for other things.

As my paper at SGML/XML Asia Pacific (or is it XML Asia now?) will discuss, one is always drawing the line between content and markup on an application by application basis. Content generally contains (or perhaps *is*) markup that's not XML. Consider spaces between words. They are a form of (non-XML) markup. They are also a presentational style. In some applications (such as corpus linguistics) word boundaries are marked up and it is a stylesheet issue to display the spaces.

The moral is that even an important distinction like markup vs content vs presentation depends on the application.

James
From owner-xml-dev@ic.ac.uk Tue Aug 25 12:49:07 1998
From: Dean Roddey <roddey@us.ibm.com>
To: <xml-dev@ic.ac.uk>
Subject: Re: Newbie Q
Date: Tue, 25 Aug 1998 13:47:46 -0400

> Is there a formal rule for the use of Attributes vs Elements? The
> assumption I am going on (per The XML Primer) is that Attributes are
> best used to communicate information to the browser/application, and
> Elements are best used for actual Data. Is this a valid assumption?
>
> In an EDI transaction, I was going to put the Header and Trailer
> information in attributes, with the actual Detail Segments as
> Elements...
>
> Any thoughts?

There are also practical issues I guess. If you need to validate the document with a DTD, there is much more control available over the content of subelements. You can say that the parent element's content model allows this or that, or this, that or the other, etc... With attributes, there is less flexibility. Each attribute is either there or not. There is no way to indicate that if you provide this one, you can't provide that one, or if these two are present, then you can't have that one, or if this one is present, then that one has to also be present, and so on (at least there is no way that I know of :-)

Beyond that I think it's purely an issue of what works best for what you want to do. And there is sometimes an issue of readability and writeability if the documents are ever dealt with by actual humans <gasp> :-) Attributes with large values are kind of funky (IMHO) to make very readable, so I'd make any property that could have large values an element if all other things were equal (which of course they often aren't.)
Overall, I guess the best rule of thumb is that attributes should hold stuff that you either need to get to fast without iterating over the subelements, or which provides 'control information' as you indicated, or which needs to have ID/IDREF semantics enforced, or that you might want to provide implicit defaults for, and probably some others that I can't think of :-)

Dean Roddey
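
[Illustration] The point above about attribute co-occurrence can be made concrete with a short sketch. A DTD attribute-list declaration can mark each attribute as required or optional, but it cannot state that two attributes exclude each other, so a rule of that kind ends up in application code. The `payment` element and its `card` and `cheque` attributes are hypothetical names invented for this example; if the two were child elements instead, the same rule could be written directly in the content model as (card | cheque).

```python
import xml.etree.ElementTree as ET

# Hypothetical rule: every <payment> must carry exactly one of these attributes.
EXCLUSIVE = ("card", "cheque")


def exclusive_attr_errors(xml_text):
    """Return one message per <payment> element that breaks the rule."""
    root = ET.fromstring(xml_text)
    errors = []
    for index, payment in enumerate(root.iter("payment"), start=1):
        present = [name for name in EXCLUSIVE if name in payment.attrib]
        if len(present) != 1:
            errors.append(
                f"payment #{index}: expected exactly one of {EXCLUSIVE}, "
                f"found {present}"
            )
    return errors


# Made-up sample data: the second and third <payment> elements break the rule.
SAMPLE = """<order>
  <payment card="1234"/>
  <payment card="1234" cheque="5678"/>
  <payment/>
</order>"""

if __name__ == "__main__":
    for message in exclusive_attr_errors(SAMPLE):
        print(message)
```

The check runs after parsing, which is the cost being described: a validating parser would reject a bad element content model on its own, while an attribute rule like this one needs hand-written code.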
From owner-xml-dev@ic.ac.uk Tue Aug 25 23:22:02 1998
Date: Wed, 26 Aug 1998 10:12:51 +0700
From: James Clark <jjc@jclark.com>
To: xml-dev@ic.ac.uk
Subject: Re: Newbie Q

Dean Roddey wrote:

>>I do not think this assumption has any basis whatever in the XML 1.0
>>specification, and it certainly has no basis in the parent standard,
>>ISO 8879. There is some basis in HTML browser behavior, but that is
>>(in my opinion) a Bad Thing, and not to be perpetuated as a standard
>>agreement. It is dangerous for all the same reasons as in "HTML":
>>the industry got stuck with hard-coded application processing semantics.
>>XML encoding itself should be used with the semantic opacity that the
>>specification implies, in my judgment; styles and other (separate)
>>processing specifications should determine how/whether certain
>>(character) data in an XML document is acted upon (displayed, suppressed,
>> etc.).

> So does anyone have any opinions on whether something like XSL will
> be more convenient to deal with attributes than elements? Are the
> semantics of XSL such that one would be more easily and compactly
> notated than the other?

We've been trying in XSL to make both equally convenient.

James [Clark]
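
[Illustration] As a rough picture of what "equally convenient" can look like in a path-based selection language, the sketch below uses the limited XPath subset in Python's xml.etree.ElementTree. That choice is an assumption made for this example; it is not XSL itself. The BOOK markup follows Roy Tennant's examples 1 and 3, and the second book is made-up filler so that the selection has something to exclude.

```python
import xml.etree.ElementTree as ET

# Author and title held in attributes (Roy Tennant's example 1).
ATTRIBUTE_STYLE = """<LIBRARY>
  <BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>
  <BOOK TITLE="Moby-Dick" AUTHOR="Melville, Herman"/>
</LIBRARY>"""

# Author and title held in child elements (Roy Tennant's example 3).
ELEMENT_STYLE = """<LIBRARY>
  <BOOK><TITLE>The Call of the Wild</TITLE><AUTHOR>London, Jack</AUTHOR></BOOK>
  <BOOK><TITLE>Moby-Dick</TITLE><AUTHOR>Melville, Herman</AUTHOR></BOOK>
</LIBRARY>"""


def titles_by_author_attr(xml_text, author):
    """Select books whose AUTHOR attribute equals `author`."""
    root = ET.fromstring(xml_text)
    # Sketch only: `author` must not contain a single quote.
    books = root.findall(f"./BOOK[@AUTHOR='{author}']")
    return [book.get("TITLE") for book in books]


def titles_by_author_elem(xml_text, author):
    """Select books whose AUTHOR child element has text equal to `author`."""
    root = ET.fromstring(xml_text)
    books = root.findall(f"./BOOK[AUTHOR='{author}']")
    return [book.findtext("TITLE") for book in books]


if __name__ == "__main__":
    print(titles_by_author_attr(ATTRIBUTE_STYLE, "London, Jack"))
    print(titles_by_author_elem(ELEMENT_STYLE, "London, Jack"))
```

The two path expressions differ only by the `@` sign, so in this notation neither encoding is noticeably more compact to query than the other.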
Date: Tue, 20 Apr 1999 13:33:26 -0700
From: Andrew Layman <andrewl@microsoft.com>
To: "xml-dev Mailing List (E-mail)" <xml-dev@ic.ac.uk>
Subject: Use of Tags

Regarding use of elements versus attributes, Andy Dent wrote "The path that Microsoft seem to be following with XML-Data is to use elements ... My single biggest problem with this is the reuse of elements within other elements - you can't define an element with local 'scope'. What happens when Amount is an i2 in one context and a float in another?"

At http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html you'll find a description of a style of using XML in which attributes play a major role, specifically to avoid the problem you mention with local scope. This particular style is designed for representing graphs of typed objects in named relations using currently-available tools and technology.

If Microsoft's advocacy of this seems less than dogmatic, it is because other contexts may reasonably call for other styles.

Best wishes,
Andrew Layman
Architect
Microsoft
Date: Mon, 26 Jun 2000 12:36:57 -0400
From: Kevin Williams <Kevin.Williams@ULTRAPRISE.COM>
Reply-To: General discussion of Extensible Markup Language <XML-L@LISTSERV.HEANET.IE>
To: XML-L@LISTSERV.HEANET.IE
Subject: Re: ELEMENTS vs ATTRIBUTES, which is prefered and why is prefered ?

> I may not be able to explain this well but I also saw no use
> for attributes until I had to script against the DOM. In your example, if
> you find the element node <note> the date is available without searching for child
> elements. Because of this, attributes make my searches so
> much easier.

This is probably the single biggest theological issue in the design of XML structures. Early on, I reached the same conclusion Lynda has - namely, that accessing attributes with the DOM or SAX is a heck of a lot easier than accessing text elements. Performance-wise, it doesn't seem to make that much of a difference - at least on the MS and IBM parsers - but simpler code is IMO a good thing.

There are some other advantages to using attributes for data points that make them appealing to me:

- Attributes are unordered. If I'm building an XML document, and I have a serial stream containing my data points that I'm converting to XML, it's nice to be able to add the points in an ad-hoc way rather than having to add them to the document in the order specified by the DTD. There's no right answer to a question like, "Does name come before or after SSN?"

- Using attributes for data points disambiguates structure and information. Code is much cleaner when using attributes for data points - attributes always contain data points, and elements always contain structure. Contrast this with the use of elements for data points, when element handling routines must check to see what the children of an element consist of to determine whether an element contains a data point or further structure.
- When extracting information from an XML document to store to an RDBMS, or vice-versa, using attributes for data points forms a very clean mapping between the systems - attributes always correspond to columns, while elements always correspond to tables. This makes code to import and export data between RDBMS systems and XML documents easy to write and very flexible.

- Using attributes for data points results in a drastically smaller document representing the same information - as much as 30% to 40% smaller, depending on the mix of structure and information in your document.

Note that these comments only apply when the XML structures are used to hold *data* - for XML being used to mark up text, an element-only model works much better.

Unfortunately, a lot of others in the industry - bigger wheels than me - disagree vehemently. For example, MS's BizTalk guidelines specify that all information in BizTalk-compliant structures should be represented by text-only elements. But for internal usages, or usages where custom XML is being developed for a fixed-scope effort, I prefer to use attributes.

- Kevin

Kevin Williams
XML Architect, Ultraprise Corporation
Co-author: _Professional XML_ (Wrox Press)
Co-author: _ASP 3.0 Programmer's Reference_ (Wrox Press)
Co-author: _Professional VB XML_ (Wrox Press)
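
[Illustration] Two of the points above, simpler DOM access and the attribute-to-column mapping, can be sketched with Python's xml.dom.minidom. This is a minimal sketch under stated assumptions, not code from the thread: the `note`, `customer` and `order` documents, their attribute names, and all values are hypothetical, and `to_rows` is an invented helper that only collects (table, columns) pairs without touching a database.

```python
from xml.dom import minidom

# Hypothetical documents holding the same date in the two styles.
ATTRIBUTE_STYLE = '<note date="2000-06-26">Call the printer</note>'
ELEMENT_STYLE = "<note><date>2000-06-26</date><body>Call the printer</body></note>"

# Hypothetical data-style document: attributes hold every data point.
CUSTOMER = """<customer name="A. Reader" ssn="000-00-0000">
  <order id="1" total="9.50"/>
  <order id="2" total="12.00"/>
</customer>"""


def date_from_attribute(note):
    """One call: the value sits on the element node itself."""
    return note.getAttribute("date")


def date_from_child(note):
    """Walk the children, find <date>, then join its text nodes."""
    for child in note.childNodes:
        if child.nodeType == child.ELEMENT_NODE and child.tagName == "date":
            return "".join(
                part.data
                for part in child.childNodes
                if part.nodeType == part.TEXT_NODE
            )
    return ""


def to_rows(node, rows=None):
    """Collect (table, columns) pairs: tag name -> table, attributes -> columns."""
    if rows is None:
        rows = []
    if node.nodeType == node.ELEMENT_NODE:
        rows.append((node.tagName, dict(node.attributes.items())))
        for child in node.childNodes:
            to_rows(child, rows)  # whitespace text nodes are skipped by the type check
    return rows


if __name__ == "__main__":
    print(date_from_attribute(minidom.parseString(ATTRIBUTE_STYLE).documentElement))
    print(date_from_child(minidom.parseString(ELEMENT_STYLE).documentElement))
    for table, columns in to_rows(minidom.parseString(CUSTOMER).documentElement):
        print(table, columns)
```

In this sketch `to_rows` never has to ask whether an element is a data point or further structure, which is the disambiguation argument made in the list above. It does not record parent-child relationships, which a real import would need as foreign keys.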