RDF: Megginson's Examples
Date: 24 Nov 1999 09:46:40 -0500
From: David Megginson <david@megginson.com>
To: "'xml-dev@ic.ac.uk'" <xml-dev@ic.ac.uk>
Subject: Re: RDF, again
Paul Prescod <paul@prescod.net> writes:

> The thing I find confusing about the RDF syntax is that the element
> type name can be either an RDF type name or an RDF property. XML
> makes no distinction and that's why I think that it is difficult to
> use for object oriented interchange. Your example doesn't run into
> that problem really because it only goes one level deep. But what
> does the RDF for this CSS-style object representation look like:
>
> person{
>   name: person-name{ first: "Paul"; last: "Prescod"};
>   address: snail-mail-address{
>     street: street-address{
>       number: 5936;
>       street: "Lovers Lane"
>     city: city( #!Dallas );
>     state: state( #!Texas )};
>   siblings: #sibling1 #sibling2 #sibling3}
>
> (curly braces for structures, parentheses for primitive types,
> concatenation for lists, semicolons for property separators)

Without minimization (except for rdf:type as the classname), you get
something like this, which is fully normalized and thus suitable for
B2B data exchange:

<!-- Example 1 of 3 (almost no minimization) -->
<megg:Person rdf:ID="sibling0">
  <megg:name rdf:resource="#id002"/>
  <megg:address rdf:resource="#id003"/>
  <megg:sibling rdf:resource="#sibling1"/>
  <megg:sibling rdf:resource="#sibling2"/>
  <megg:sibling rdf:resource="#sibling3"/>
</megg:Person>

<megg:PersonName rdf:ID="id002">
  <megg:firstName>Paul</megg:firstName>
  <megg:lastName>Prescod</megg:lastName>
</megg:PersonName>

<megg:SnailMailAddress rdf:ID="id003">
  <megg:street rdf:resource="#id005"/>
  <megg:city rdf:resource="http://www.places.org/us/tx/dallas/"/>
  <megg:state rdf:resource="http://www.places.org/us/tx/"/>
</megg:SnailMailAddress>

<megg:StreetAddress rdf:ID="id005">
  <megg:number>5936</megg:number>
  <megg:streetName>Lovers Lane</megg:streetName>
</megg:StreetAddress>

With a bit of minimization (and some denormalization), you get
something like this:

<!-- Example 2 of 3 (moderate minimization) -->
<megg:Person rdf:ID="sibling0">
  <megg:name>
    <megg:PersonName>
      <megg:firstName>Paul</megg:firstName>
      <megg:lastName>Prescod</megg:lastName>
    </megg:PersonName>
  </megg:name>
  <megg:address>
    <megg:SnailMailAddress>
      <megg:street>
        <megg:StreetAddress>
          <megg:number>5936</megg:number>
          <megg:streetName>Lovers Lane</megg:streetName>
        </megg:StreetAddress>
      </megg:street>
      <megg:city rdf:resource="http://www.places.org/us/tx/dallas/"/>
      <megg:state rdf:resource="http://www.places.org/us/tx/"/>
    </megg:SnailMailAddress>
  </megg:address>
  <megg:sibling rdf:resource="#sibling1"/>
  <megg:sibling rdf:resource="#sibling2"/>
  <megg:sibling rdf:resource="#sibling3"/>
</megg:Person>

With really ferocious minimization, you can get down to this (but you
lose some class names):

<!-- Example 3 of 3 (maximum minimization) -->
<megg:Person rdf:ID="sibling0">
  <megg:name firstName="Paul" lastName="Prescod"/>
  <megg:address rdf:parseType="Resource">
    <megg:street number="5936" streetName="Lovers Lane"/>
    <megg:city rdf:resource="http://www.places.org/us/tx/dallas/"/>
    <megg:state rdf:resource="http://www.places.org/us/tx/"/>
  </megg:address>
  <megg:sibling rdf:resource="#sibling1"/>
  <megg:sibling rdf:resource="#sibling2"/>
  <megg:sibling rdf:resource="#sibling3"/>
</megg:Person>

Certainly, it's easy for a person to read at this level, but it's
quite tricky to process; I'm surprised that several RDF processors
have actually come out.

The RDF committee had to make a difficult choice: to what extent
should they complicate the syntax to help get buy-in from the initial
implementors?

When the original SGML committee had to make that choice, strong
vested interests (especially in publishing, from what I've heard) were
able to force a horrendous complexity on the ISO 8879:1986 grammar.

When the XML committee had to make the same choice, they held the
line much better, but vested interests (especially in the SGML world)
were able to force them to include some kruft like notations and
external unparsed entities.

When the RDF committee had to make the same choice, strong vested
interests (especially in the HTML world) were able to force them to
include heavy optional minimization and some bizarre kruft like the
rdf:aboutEachPrefix attribute.

No International Standard *or* consortium spec is free from this kind
of horse trading -- it's just a fact of life. In the end, the kruft
in XML didn't hurt it all that much, while the syntactic kruft in SGML
pretty much did it in. The jury is still out on RDF syntax, but the
convoluted syntax puts them dangerously close to the edge.

> Note also that common XML usage puts datatypes in the schema or
> elsewhere. To recognize an integer as such you need the schema or some
> other external knowledge. Some DTDs have <int> elements but that's so
> ugly that it hasn't really "caught on." The XML world is very
> inconsistent in its thought about the appropriateness of dependence on
> the schema. I think that that dependence is slowly creeping back into
> vogue.

I think that it was their intention to do so, but they were waiting
for some general XML datatyping facility.

> Speaking on behalf of the devil, I'd say that in one week we could
> define (or just find) an S-expression-like language with none of these
> weaknesses and in less time we could write a parser for it. It could
> have "XML element" as a primary data type for embedded XML and could
> also be embedded IN XML.

Not only could we, but many of us have -- I've written quite a few
thousand lines of LISP in my life, and I know that it works fine for
representing data structures, but nobody uses it. XML also works
fine, and everyone uses it. So, let's get on to the interesting
stuff, and actually start doing something with information rather than
just marking it up.

All the best,

David

David Megginson david@megginson.com
http://www.megginson.com/
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
Prepared by Robin Cover for The SGML/XML Web Page archive.