This is a proposal for a stripped down syntax based on RDF, as an application of XML.
Many people have expressed an interest in having a highly simplified, but powerful, serialization of RDF in XML. To that end, I decided it would be beneficial to come up with a proposal, leaving behind as many of the current preconceptions as I can, that sets out a language which is as simple and intuitive to understand and use as possible.
The abstract syntax of this language is indeed very simple, consisting as it does of a set of only three elements and a handful of attributes. The elements are:-
<t>, denoting a triple;
<po>, denoting a predicate and object pair;
<o>, denoting an object.
Each of these elements has a range of attributes that associate a URI with a particular part of the content. The "id" attribute always identifies a URI Reference to be used in a particular way, depending on which element it is present on (see below). As in RDF, an empty id attribute corresponds to the URI of this document. The complement of the "id" attribute is the "qname" attribute, which references a QName.
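As a rough illustration of how a qname value might be turned into a URI, here is a minimal Python sketch. It assumes (as the examples in this document suggest) that the URI is the namespace URI concatenated with the local name, and that an unprefixed QName uses the default namespace. The function name resolve_qname and the namespace table are made up for the example.

```python
def resolve_qname(qname, namespaces):
    """Expand a QName such as "my:name" to a URI.

    `namespaces` maps prefixes to namespace URIs; the default
    namespace is stored under the empty-string key (an assumption
    of this sketch, not something the proposal spells out).
    """
    prefix, _, local = qname.rpartition(":")
    if prefix not in namespaces:
        raise KeyError("undeclared namespace prefix: %r" % prefix)
    return namespaces[prefix] + local


# Hypothetical namespace declarations, mirroring the examples in this document.
ns = {"": "http://example.org/#", "my": "http://myns.org/#"}
print(resolve_qname("my:name", ns))
print(resolve_qname("MyTerm", ns))
```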
When an id/qname attribute is present on an unnested <t> element, it gives the URI Reference of the subject. When one of these attributes is used on a <po> element, it gives the URI Reference of the predicate (of the parent <t> element). When one of these attributes is used on an <o> element, it gives the URI of the object (of the parent <t> element). When one of these attributes is used on a <t> element that is the child element of a <po> element, it becomes the object of that old triple, and the subject for the new one.
The things called "literals" in RDF are represented as data: URIs in BSWL. Literals may be referred to using a special bswl:literal attribute. This directly identifies the literal, not the thing that it is a lexical representation of. The content of this attribute must be IRI escaped by processors deriving a model from the content. For example:-
<bswl:o bswl:literal="my literal"/>
must become the URI:-
data:,my%20literal .
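The literal-to-URI step above can be sketched in Python roughly as follows. This assumes that plain percent-escaping of the UTF-8 bytes is what is wanted; the proposal does not say exactly which characters are left unescaped, so the choice made by urllib.parse.quote (letters, digits, and "_.-~" are kept) is an assumption. The function name is made up.

```python
from urllib.parse import quote


def literal_to_data_uri(literal):
    """Map the content of a bswl:literal attribute to a data: URI."""
    # safe="" means that nothing beyond the unreserved characters is
    # left unescaped; this is an assumption of the sketch.
    return "data:," + quote(literal, safe="")


print(literal_to_data_uri("my literal"))
```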
Content model. A <t> can only contain <po> element(s). A <po> can contain either the <t> or <o> elements. The <o> element is empty.
All elements may be abbreviated to an element QName instead of using the qname attribute. This is very useful, as we shall see later on.
A simple sample of this syntax (unabbreviated) is:-
<bswl:bswl xmlns:bswl="http://purl.org/net/bswl/2001#"
           xmlns:my="http://myns.org/#">
  <bswl:t bswl:id="http://infomesh.net/Sean#sbp">
    <bswl:po bswl:qname="my:name">
      <bswl:o bswl:literal="Sean B. Palmer"/>
    </bswl:po>
  </bswl:t>
</bswl:bswl>
This is basically equivalent to what might be represented in XML RDF, using the RDF M&S recommendation, as:-
<rdf:RDF xmlns="http://infomesh.net/Sean#"
         xmlns:my="http://myns.org/#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://infomesh.net/Sean#sbp">
    <my:name>Sean B. Palmer</my:name>
  </rdf:Description>
</rdf:RDF>
There are no typed nodes (do that using properties). There are no anonymous nodes (q.v. Anonymous Node: No More).
Multiple objects can be placed inside a predicate. The ordering is not important.
<bswl:bswl xmlns:bswl="http://purl.org/net/bswl/2001#"
           xmlns="http://example.org/#">
  <bswl:t bswl:qname="MyTerm">
    <bswl:po bswl:qname="myProp">
      <bswl:o bswl:qname="a"/>
      <bswl:o bswl:qname="b"/>
    </bswl:po>
  </bswl:t>
</bswl:bswl>
Corresponds to (in RDF M&S):-
<rdf:RDF xmlns="http://example.org/#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.org/#MyTerm">
    <myProp rdf:resource="http://example.org/#a"/>
    <myProp rdf:resource="http://example.org/#b"/>
  </rdf:Description>
</rdf:RDF>
And in Notation3:-
:MyTerm :myProp :a , :b .
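To make the mapping from the unabbreviated syntax to triples concrete, here is a small Python sketch using the standard xml.etree.ElementTree module. It is only an illustration under several assumptions: namespace declarations are collected into one flat table (scoping is ignored), only the id, qname, and literal attributes are handled, and an empty id is returned as-is rather than resolved to the document URI. The names node_uri, triples, and parse are made up.

```python
import io
import xml.etree.ElementTree as ET
from urllib.parse import quote

BSWL = "http://purl.org/net/bswl/2001#"


def node_uri(elem, ns):
    """Return the URI named by an element's id, qname, or literal attribute."""
    def attr(name):
        return elem.get("{%s}%s" % (BSWL, name))

    if attr("id") is not None:
        return attr("id")
    if attr("qname") is not None:
        prefix, _, local = attr("qname").rpartition(":")
        return ns[prefix] + local
    if attr("literal") is not None:
        return "data:," + quote(attr("literal"), safe="")
    raise ValueError("no id, qname, or literal attribute on " + elem.tag)


def triples(t, ns):
    """Yield (subject, predicate, object) URIs for one <t> element."""
    subject = node_uri(t, ns)
    for po in t:
        predicate = node_uri(po, ns)
        for child in po:
            yield (subject, predicate, node_uri(child, ns))
            if child.tag == "{%s}t" % BSWL:
                # A nested <t> is the object here and the subject of new triples.
                yield from triples(child, ns)


def parse(text):
    """Parse a BSWL document and return its triples as a list."""
    ns, root = {}, None
    source = io.BytesIO(text.encode("utf-8"))
    for event, data in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            ns[data[0]] = data[1]      # (prefix, uri); default prefix is ""
        elif root is None:
            root = data                # first "start" event is the root
    return [triple for t in root for triple in triples(t, ns)]


DOC = """<bswl:bswl xmlns:bswl="http://purl.org/net/bswl/2001#"
           xmlns="http://example.org/#">
  <bswl:t bswl:qname="MyTerm">
    <bswl:po bswl:qname="myProp">
      <bswl:o bswl:qname="a"/>
      <bswl:o bswl:qname="b"/>
    </bswl:po>
  </bswl:t>
</bswl:bswl>"""

for triple in parse(DOC):
    print(triple)
```

Run on the multiple-object example, this should yield one triple per <o> element, matching the two statements in the Notation3 version.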
A BSWL element that carries a qname attribute can be abbreviated to an element whose name is simply that QName. So, for example:-
<bswl:t bswl:qname="Sean">
  <bswl:po bswl:qname="likes">
    <bswl:o bswl:qname="Chocolate"/>
  </bswl:po>
</bswl:t>
becomes:-
<Sean> <likes> <Chocolate/> </likes> </Sean>
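It is easy to work out what each particular element does simply by counting the nesting:-
first level: a subject;
second level: a predicate;
third level: an object (and the subject of anything nested inside it);
fourth level: a predicate;
and so on.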
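The nesting rule can be sketched in Python as below. This is only an illustration, assuming that every element in the abbreviated form is in a namespace and that its URI is the namespace URI followed by the local name; elements such as bswl:o with a bswl:literal attribute are not handled. The function names are made up.

```python
import xml.etree.ElementTree as ET


def tag_uri(elem):
    """Turn ElementTree's "{namespace}local" tag into namespace + local."""
    if not elem.tag.startswith("{"):
        raise ValueError("element is not in a namespace: " + elem.tag)
    namespace, _, local = elem.tag[1:].partition("}")
    return namespace + local


def walk(node):
    """Yield triples below `node`, which sits at an odd nesting level."""
    subject = tag_uri(node)
    for pred in node:        # even level: predicates
        for obj in pred:     # odd level: objects, and subjects of nested triples
            yield (subject, tag_uri(pred), tag_uri(obj))
            yield from walk(obj)


doc = '<Sean xmlns="http://example.org/#"><likes><Chocolate/></likes></Sean>'
print(list(walk(ET.fromstring(doc))))
```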
Anonymous nodes are no longer allowed. They must be given URIs. For example:-
:Sean :likes [ :called "Chocolate" ] .
Must now be (using abbreviated syntax):-
<Sean>
  <likes>
    <anon:a>
      <called>
        <bswl:o bswl:literal="Chocolate"/>
      </called>
    </anon:a>
  </likes>
</Sean>
Which is really:-
:Sean :likes anon:a .
anon:a :called "Chocolate" .
For example, people could create an anonymous URI in the "tag:" URI space (once registered) that can be used.
To talk about XML QNames, two further attributes are introduced: element and attribute. These are both basically the same as the qname attribute, except that they refer to the ExpEType and ExpAName QNames of Namespaces in XML respectively.
For example, the element "element" in XSD can be talked about like so:-
<bswl:t bswl:element="xsd:element">
  <name>
    <bswl:o bswl:literal="Element"/>
  </name>
</bswl:t>
The attribute "my:blargh" that appears on "my:element" can be represented as:-
<bswl:o bswl:attribute="my:blargh" bswl:element="my:element"/>
To talk about literals as subjects, use bswl:literal on the <t> element.
It is an error to mix certain attributes: only the following variants are allowed:-
<bswl:o bswl:id="x"/>
<bswl:o bswl:qname="x"/>
<bswl:o bswl:literal="x"/>
<bswl:o bswl:element="x"/>
<bswl:o bswl:attribute="x"/>
<bswl:o bswl:element="x" bswl:attribute="x"/>
<bswl:o bswl:attribute="x" bswl:element="x"/> (same as above)
The following is a BNF for both the abbreviated and non-abbreviated BSWL syntaxes:-
t = ( '<' QName '>' po+ '</' QName '>'
    | '<bswl:t ' attribs '>' po+ '</bswl:t>' )
po = ( '<' QName '>' ( t | o )+ '</' QName '>'
     | '<bswl:po ' attribs '>' ( t | o )+ '</bswl:po>' )
o = ( '<' QName '/>'
    | '<bswl:o ' attribs '/>' )
attribs = ( id_a | qname_a | literal_a | element_a | attribute_a
          | element_a ' ' attribute_a | attribute_a ' ' element_a )
id_a = 'bswl:id="' anyURI '"'
qname_a = 'bswl:qname="' QName '"'
literal_a = 'bswl:literal="' string '"'
element_a = 'bswl:element="' QName '"'
attribute_a = 'bswl:attribute="' QName '"'
For just the non-abbreviated form of BSWL, the BNF is even shorter:-
t = '<bswl:t ' attribs '>' po+ '</bswl:t>'
po = '<bswl:po ' attribs '>' ( t | o )+ '</bswl:po>'
o = '<bswl:o ' attribs '/>'
attribs = ( id_a | qname_a | literal_a | element_a | attribute_a
          | element_a ' ' attribute_a | attribute_a ' ' element_a )
id_a = 'bswl:id="' anyURI '"'
qname_a = 'bswl:qname="' QName '"'
literal_a = 'bswl:literal="' string '"'
element_a = 'bswl:element="' QName '"'
attribute_a = 'bswl:attribute="' QName '"'
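The attribs production above amounts to a short table of permitted attribute combinations. A processor might check it along these lines; this is a sketch only, and the names ALLOWED and attributes_valid are made up. Attribute names are given without the bswl: prefix, and since XML attributes are unordered, the two element/attribute orderings collapse into one set.

```python
# Each permitted combination of BSWL attributes, as an unordered set.
ALLOWED = {
    frozenset({"id"}),
    frozenset({"qname"}),
    frozenset({"literal"}),
    frozenset({"element"}),
    frozenset({"attribute"}),
    frozenset({"element", "attribute"}),
}


def attributes_valid(names):
    """Return True if the given BSWL attribute names may appear together."""
    return frozenset(names) in ALLOWED


print(attributes_valid(["attribute", "element"]))
print(attributes_valid(["id", "qname"]))
```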
If XML Schema would allow us to choose attribute groups, we could specify an XML Schema for this as well. Perhaps I can do it in RELAX NG. (See the non-normative hacking in Appendix B).
The root of the document :-
bswl = '<bswl:bswl xmlns:bswl="http://purl.org/net/bswl/2001#" ' ns* '>' t+ '</bswl:bswl>'
ns = 'xmlns:' id '="' anyURI '" '
Now for a little bit about the model (as soon as I have an immediate use for BSWL that involves reification).
Statements are as in RDF. If each statement is given a unique ID, then these may be referred to in a statement set (just intersect the statements).
A statement identified in this way can then be referred to as a subject, predicate, or object.
:Sean earl:asserts { :MyPage earl:passes :CP1 } .
becomes:-
<Sean>
  <earl:asserts>
    <MyAssertion>
      <bswlm:s><MyPage/></bswlm:s>
      <bswlm:p><earl:passes/></bswlm:p>
      <bswlm:o><CP1/></bswlm:o>
    </MyAssertion>
  </earl:asserts>
</Sean>
Which is short for:-
<bswl:t bswl:qname="Sean">
  <bswl:po bswl:qname="earl:asserts">
    <bswl:t bswl:qname="MyAssertion">
      <bswl:po bswl:qname="bswlm:s">
        <bswl:o bswl:qname="MyPage"/>
      </bswl:po>
      <bswl:po bswl:qname="bswlm:p">
        <bswl:o bswl:qname="earl:passes"/>
      </bswl:po>
      <bswl:po bswl:qname="bswlm:o">
        <bswl:o bswl:qname="CP1"/>
      </bswl:po>
    </bswl:t>
  </bswl:po>
</bswl:t>
really:-
:Sean earl:asserts :MyAssertion .
:MyAssertion bswlm:s :MyPage;
    bswlm:p earl:passes;
    bswlm:o :CP1 .
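The expansion of a named statement into its three model triples could be sketched in Python as follows. The namespace URI for the bswlm prefix is not given in this document, so the value used here is a placeholder, and the function name reify and the example URIs are likewise made up.

```python
# Placeholder only: the real bswlm namespace URI is not stated in this proposal.
BSWLM = "http://example.org/bswlm#"


def reify(statement, s, p, o, model_ns=BSWLM):
    """Describe the triple (s, p, o) as three triples about `statement`."""
    return [
        (statement, model_ns + "s", s),
        (statement, model_ns + "p", p),
        (statement, model_ns + "o", o),
    ]


# Hypothetical URIs standing in for :MyAssertion, :MyPage, earl:passes, :CP1.
for triple in reify("http://example.org/#MyAssertion",
                    "http://example.org/#MyPage",
                    "http://example.org/earl#passes",
                    "http://example.org/#CP1"):
    print(triple)
```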
All appendices and sub sections thereof are non-normative unless otherwise stated.
Examples of serializing the examples in RDF M&S
In XML RDF:-
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:s="http://description.org/schema/">
  <rdf:Description about="http://www.w3.org/Home/Lassila">
    <s:Creator>Ora Lassila</s:Creator>
  </rdf:Description>
</rdf:RDF>
In BSWL:-
<bswl:bswl xmlns:bswl="http://purl.org/net/bswl/2001#">
  <bswl:t bswl:id="http://www.w3.org/Home/Lassila">
    <bswl:po bswl:id="http://description.org/schema/Creator">
      <bswl:o bswl:literal="Ora Lassila"/>
    </bswl:po>
  </bswl:t>
</bswl:bswl>
In XML RDF:-
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:s="http://description.org/schema/"
         xmlns:v="http://example.org/#">
  <rdf:Description about="http://www.w3.org/Home/Lassila">
    <s:Creator rdf:resource="http://www.w3.org/staffId/85740"/>
  </rdf:Description>
  <rdf:Description about="http://www.w3.org/staffId/85740">
    <v:Name>Ora Lassila</v:Name>
    <v:Email>lassila@w3.org</v:Email>
  </rdf:Description>
</rdf:RDF>
In BSWL:-
<bswl:bswl xmlns:bswl="http://purl.org/net/bswl/2001#"
           xmlns:v="http://example.org/#">
  <bswl:t bswl:id="http://www.w3.org/Home/Lassila">
    <bswl:po bswl:id="http://description.org/schema/Creator">
      <bswl:t bswl:id="http://www.w3.org/staffId/85740">
        <v:Name><bswl:o bswl:literal="Ora Lassila"/></v:Name>
        <v:Email><bswl:o bswl:literal="lassila@w3.org"/></v:Email>
      </bswl:t>
    </bswl:po>
  </bswl:t>
</bswl:bswl>
Half finished RELAX NG thing for the unabbreviated syntax:-
<grammar xmlns="http://relaxng.org/ns/structure/0.9"
         ns="http://purl.org/net/bswl/2001#"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start><ref name="bswl"/></start>
  <define name="bswl">
    <element name="bswl">
      <oneOrMore>
        <!-- the "t" pattern is not defined yet -->
        <ref name="t"/>
      </oneOrMore>
    </element>
  </define>
</grammar>
Given up on this because it'll only work for the non-abbreviated syntax anyway.
I've hacked up an EARL example using the abbreviated syntax.
<Sean> <likes> <bswl:o bswl:literal="Chocolate"/> </likes> </Sean>
Is the same as:-
<Sean xmlns:data="data:,"> <likes> <data:Chocolate/> </likes> </Sean>
Note that this proposal allows one to refer to XML QNames as part of the syntax, whereas the ones above do not. This is a very lightweight specification, and I hope there will be little ambiguity. There are no anonymous nodes, so that reduces another problem (although it introduces the problem of how to convert XML RDF M&S to BSWL).
Using the abbreviated syntax, BSWL files tend to be shorter than the equivalent XML RDF files written using RDF M&S.
BSWL allows the use of URI references, and literals that become URI references in certain attributes.
Note that some characters are disallowed in URI references, even if they are allowed in XML. These disallowed characters include all of the non-ASCII characters, plus the excluded characters listed in Section 2.4 of RFC 2396, except for the number sign (#) and percent sign (%) and the square bracket characters re-allowed in RFC 2732.
Disallowed characters must be escaped in BSWL as follows: each byte of a disallowed character is converted to %HH, where HH is the hexadecimal notation of the byte value.
Because it is impractical for any application to check that an attribute value is a URI reference, this specification follows the lead of RFC 2396 in this matter and imposes no such conformance testing requirement on BSWL applications.
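A minimal Python sketch of the escaping rule described in this section follows. It assumes that a disallowed character is first encoded as UTF-8 before its bytes are escaped (the text above does not name an encoding, so this is an assumption), and it takes the excluded ASCII set to be the control characters, space, and the "delims" and "unwise" characters of RFC 2396 minus "#", "%", "[" and "]". The function name is made up.

```python
# Excluded ASCII characters from RFC 2396 section 2.4.3, minus the
# re-allowed "#", "%", "[" and "]".
DISALLOWED_ASCII = (
    set('<>"{}|\\^` ')
    | {chr(code) for code in range(0x20)}
    | {"\x7f"}
)


def escape_uri_reference(value):
    """Percent-escape the characters that may not appear in a URI reference."""
    out = []
    for ch in value:
        if ord(ch) > 0x7F or ch in DISALLOWED_ASCII:
            # Assumption: the character is encoded as UTF-8, then each
            # byte becomes %HH.
            out.append("".join("%{:02X}".format(b) for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


print(escape_uri_reference("http://example.org/a b#frag"))
```

Note that an existing "%" is passed through untouched, so a value that is already escaped is not escaped a second time.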
Sean B. Palmer