RDF Inference Language (RIL)

[Draft version referenced on 'www-rdf-interest@w3.org' mailing list; see http://lists.fourthought.com/pipermail/ril/2001-May/000037.html. A version in DocBook format is also available.]

Mike Olson (Fourthought, Inc.)
Revision (Initial revision) [MO]


1. Abstract
2. Introduction
2.1. RIL versions
3. Script
4. Queries
4.1. Predicate query
4.2. Logical combination of RIL predicates
4.3. Quantifiers
4.4. Utility operations
5. Actions
6. Expressing rules
7. Appendix A: Use-cases
7.1. Match triples on patterns
7.2. Simple relationship traversal
7.3. Complex relationship traversal
7.4. Path traversal
7.5. Complex path traversal
7.6. Forward chaining inference
8. References

Abstract

This specification defines the syntax and semantics of RDF Inference Language (RIL), a means of expressing expert systems rules and queries that operate on RDF models. RIL is an open format for complex queries on a directed graph established for knowledge representation.

Introduction

RIL allows the definition of rules and the composition of queries that operate on an RDF model. This is done using an XML vocabulary for instructions to the RIL processor.

Where used in this document, the keywords "SHOULD", "MUST", and "MUST NOT" are to be interpreted as described in RFC 2119 [RFC2119]. However, for readability, these words do not appear in all uppercase letters in this specification.

In this version of the specification the ril namespace prefix shall be used to represent the namespace reference http://xmlns.rdfinference.org/ril/0/2.

In addition, several namespaces will be used as examples in this specification. To improve readability, these namespaces will be represented using XML namespace QNames with common prefixes as shown in the following table.

Please note that all these prefixed abbreviations are mere convenience, and that any such qualified names must be interpreted as if the prefix represents the full namespace URI.

RIL versions

The version of the RIL specification used in a particular instruction is indicated using its namespace.

The version of RIL represented by this specification is 0.2, and conforming instances should use the http://xmlns.rdfinference.org/ril/0/2 namespace. Processing applications can, however, refer generically to RIL instances as http://xmlns.rdfinference.org/ril, which is interpreted as the highest available version of RIL, or as http://xmlns.rdfinference.org/ril/0, which is interpreted as the highest version of RIL that is lower than version 1.0. This is in accordance with the namespace URI versioning scheme suggested by Eric van der Vlist [EV200103].
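The resolution of a generic namespace reference to a concrete version can be sketched as follows. This is only an illustration in Python, not part of RIL: the function names are hypothetical, the 0/1 namespace URI is assumed to exist purely so that there is something to choose between, and the major-only form ending in /0 is likewise an assumption of the sketch.

```python
RIL_BASE = "http://xmlns.rdfinference.org/ril"

def version_of(uri):
    """Return the version encoded in a full RIL namespace URI as a tuple of ints."""
    suffix = uri[len(RIL_BASE):].strip("/")
    return tuple(int(part) for part in suffix.split("/") if part)

def resolve_namespace(requested, available):
    """Return the highest-versioned URI in `available` covered by `requested`."""
    prefix = requested.rstrip("/")
    candidates = [
        uri for uri in available
        if uri == prefix or uri.startswith(prefix + "/")
    ]
    if not candidates:
        raise LookupError("no available RIL version matches " + requested)
    return max(candidates, key=version_of)

# Hypothetical set of versions known to a processor (0/1 is assumed).
AVAILABLE = [RIL_BASE + "/0/1", RIL_BASE + "/0/2"]

print(resolve_namespace(RIL_BASE, AVAILABLE))         # generic reference
print(resolve_namespace(RIL_BASE + "/0", AVAILABLE))  # assumed major-only form
```

A fully versioned reference resolves to itself, so conforming instances are unaffected by this lookup.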

In this specification the term "RIL namespace" refers to the namespace that identifies the RIL vocabulary: http://xmlns.rdfinference.org/ril/0/2.

Script

A set of RIL instructions to be executed in concert is written as a RIL script. A script comprises a list of top-level instructions, each of which may define a query or action. An instruction may optionally produce a resulting value. The RIL script itself is an instruction whose result is a list of all the results specified by enclosed ril:set-result instructions.


   <ril:script
     id = [unique identifier]
   >
     <!-- Content: (top-level-element+) -->
   </ril:script>

    

Because expert-systems-like rules can be expressed as RIL scripts starting with a set of queries and concluding with a set of actions, ril:rule is allowed as a synonym for ril:script.

If a top-level element is in the RIL namespace, it must be one of the following instructions:

Any element in any other namespace is allowed as a top-level element. Such elements are extension instructions, and must be handled according to the rules for extension processing.

Issue: should top-level elements themselves be allowed to be embedded in XML without being in a ril:expression??

Queries

A query allows one to request data from the RIL model.

Predicate query

The RIL model can be viewed as a collection of logical predicates. RIL allows a simple means of evaluating predicates for query.

Using the Prolog functor syntax, a simple predicate could be represented as follows:

creator(newton)

This is a predicate of one argument. In the RIL model, this predicate is true if there exists an RDF statement with the predicate name as predicate and the predicate argument as object. This check is done regardless of the value of the subject.

So in the above example creator(newton) is true if and only if there exists an RDF statement with predicate "creator" and object "newton". A query based on this can be represented in RIL as in listing 1.


<ril:query>
  <dc:creator>
    <ril:value>newton</ril:value>
  </dc:creator>
</ril:query>

      

This query would evaluate to true given the following universe:


{b:principia, dc:creator, h:newton}
{b:aljabr, dc:creator, h:kwarizmi}
{b:principia, dc:date, ""}

      

The first statement is equivalent to the test predicate. The query can also be posed to retrieve a variable mapping rather than a plain boolean response as in listing 2.


<ril:query>
  <dc:creator>
    <ril:variable name="X"/>
  </dc:creator>
</ril:query>

      

This query operating in the above universe would result in the variable X being assigned to the following list:


[h:newton, h:kwarizmi]
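The one-argument case can be sketched in Python as follows. This is an illustration of the semantics described above, not a RIL processor: statements are modelled as (subject, predicate, object) tuples, and the function names holds and objects_of are hypothetical.

```python
# The example universe, as (subject, predicate, object) tuples.
UNIVERSE = [
    ("b:principia", "dc:creator", "h:newton"),
    ("b:aljabr", "dc:creator", "h:kwarizmi"),
    ("b:principia", "dc:date", ""),
]

def holds(universe, predicate, obj):
    """True if some statement has this predicate and object; the subject is ignored."""
    return any(p == predicate and o == obj for (_s, p, o) in universe)

def objects_of(universe, predicate):
    """Values a variable in the single argument position would be mapped to."""
    return [o for (_s, p, o) in universe if p == predicate]

print(holds(UNIVERSE, "dc:creator", "h:newton"))
print(objects_of(UNIVERSE, "dc:creator"))
```

The first call corresponds to the boolean form of the query and the second to the form that uses ril:variable.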

      

Using the Prolog functor syntax, a predicate with two arguments could be represented as follows:

creator(newton, principia)

In the RIL model, this predicate is true if and only if there exists an RDF statement with the predicate name as predicate, the first argument as object and the second argument as subject.

So in the above example creator(newton, principia) is true if there exists an RDF statement with subject "principia", predicate "creator" and object "newton". A query based on this can be represented in RIL as in listing 3.


<ril:query>
  <dc:creator>
    <ril:value>b:principia</ril:value>
    <ril:value>h:newton</ril:value>
  </dc:creator>
</ril:query>

      

This query would also evaluate to true given the example universe. Listing 4 is an example of how this query can be cast so as to map values to variables.


<ril:query>
  <dc:creator>
    <ril:value>h:newton</ril:value>
    <ril:variable name="X"/>
  </dc:creator>
</ril:query>
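A minimal Python sketch of the two-argument case follows. It adopts the convention stated in the prose above (first argument is the object, second is the subject); the Var class and the query2 function are hypothetical names used only for illustration.

```python
UNIVERSE = [
    ("b:principia", "dc:creator", "h:newton"),
    ("b:aljabr", "dc:creator", "h:kwarizmi"),
    ("b:principia", "dc:date", ""),
]

class Var:
    """A named query variable, standing in for ril:variable."""
    def __init__(self, name):
        self.name = name

def query2(universe, predicate, first, second):
    """Evaluate predicate(first, second): first is the object, second the subject.

    Returns a bool when both arguments are values; otherwise returns a dict
    mapping each variable name to the list of values it matched.
    """
    bindings = {}
    matched = False
    for s, p, o in universe:
        if p != predicate:
            continue
        if not isinstance(first, Var) and first != o:
            continue
        if not isinstance(second, Var) and second != s:
            continue
        matched = True
        if isinstance(first, Var):
            bindings.setdefault(first.name, []).append(o)
        if isinstance(second, Var):
            bindings.setdefault(second.name, []).append(s)
    if isinstance(first, Var) or isinstance(second, Var):
        return bindings
    return matched

print(query2(UNIVERSE, "dc:creator", "h:newton", "b:principia"))
print(query2(UNIVERSE, "dc:creator", "h:newton", Var("X")))
```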

      

Using the Prolog functor syntax, a predicate with three arguments could be represented as follows:

creator(newton, principia, royal-acada-sciences)

Let us say this models the natural language "The royal academy of sciences asserts that Newton was the creator of the book Principia."

In the RIL model, this predicate is true if and only if there exists a reified RDF statement with the predicate name as predicate, the first argument as object and the second argument as subject; and there is also a statement with the first reified statement itself as subject and the third argument as object.

So in the above example creator(newton, principia, royal-acada-sciences) is true if there exists an RDF statement with subject "principia", predicate "creator" and object "newton", and this statement is reified and is the subject of another statement with object "royal-acada-sciences".
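The three-argument reading can be sketched in Python as below. The sketch assumes the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object); the node name _:st1, the attribution predicate ex:asserted-by and the function names are hypothetical, since the text above does not say which predicate links the reified statement to the third argument.

```python
UNIVERSE = [
    ("b:principia", "dc:creator", "h:newton"),
    # Reification of the statement above.
    ("_:st1", "rdf:type", "rdf:Statement"),
    ("_:st1", "rdf:subject", "b:principia"),
    ("_:st1", "rdf:predicate", "dc:creator"),
    ("_:st1", "rdf:object", "h:newton"),
    # Hypothetical attribution; any predicate is accepted by the sketch.
    ("_:st1", "ex:asserted-by", "royal-acada-sciences"),
]

REIFICATION_PREDICATES = {"rdf:type", "rdf:subject", "rdf:predicate", "rdf:object"}

def reifications(universe, subject, predicate, obj):
    """Resources that reify the statement (subject, predicate, obj)."""
    triples = set(universe)
    return [
        s for (s, p, o) in universe
        if p == "rdf:type" and o == "rdf:Statement"
        and (s, "rdf:subject", subject) in triples
        and (s, "rdf:predicate", predicate) in triples
        and (s, "rdf:object", obj) in triples
    ]

def holds3(universe, predicate, first, second, third):
    """predicate(first, second, third): first is the object, second the subject,
    and third the object of some statement about the reified statement."""
    for node in reifications(universe, second, predicate, first):
        for s, p, o in universe:
            if s == node and o == third and p not in REIFICATION_PREDICATES:
                return True
    return False

print(holds3(UNIVERSE, "dc:creator", "h:newton", "b:principia", "royal-acada-sciences"))
```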

Issue: is this flexible enough? Furthermore, it seems to lack congruence with the method of mapping predicates of 1 and 2 arguments. Finally, it's somewhat unclear how to extend this idea congruently to predicates of 4 or more arguments.

Logical combination of RIL predicates

A conjunction of RIL predicates can be expressed using the ril:and instruction. For instance, listing 5 could be interpreted as the prose "Check whether uogbuji's last login was 2001-03-04 and there is a file of name 'happy-millenium.txt'".


<ril:query>
  <ril:and>
    <s:last-login>
      <ril:value>u:uogbuji</ril:value>
      <ril:value>2001-03-04</ril:value>
    </s:last-login>
    <s:file-name>
      <ril:value>happy-millenium.txt</ril:value>
    </s:file-name>
  </ril:and>
</ril:query>

      

Therefore it would return true for the following universe:


{u:uogbuji, s:last-login, "2001-03-04"}
{u:molson, s:last-login, "2001-01-01"}
{http://home/molson/happy-millenium.txt, s:file-name, "happy-millenium.txt"}
{http://home/molson/happy-millenium.txt, s:owner, u:molson}
{http://home/uogbuji/memo.doc, s:owner, u:uogbuji}
{http://home/uogbuji/memo.doc, s:name, "memo.doc"}

      

A conjunction over predicates with variable mappings results in each variable's being set to a list created by selecting each list item that appears in all of the predicates.

Issue: what about repeated items, such as identical strings?

For instance, take the prose: "Check whether anyone who owns a file named 'happy-millenium.txt' had as last log-in date 1 January 2001". This query can be expressed as in listing 6.


<ril:query>
  <ril:and>
    <s:last-login>
      <ril:variable name="X"/>
      <ril:value>2001-01-01</ril:value>
    </s:last-login>
    <s:owner>
      <ril:variable name="Y"/>
      <ril:variable name="X"/>
    </s:owner>
    <s:file-name>
      <ril:variable name="Y"/>
      <ril:value>happy-millenium.txt</ril:value>
    </s:file-name>
  </ril:and>
</ril:query>

        

This query made on the above universe would result in the following variable mapping:


X = [ u:molson ]
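One way to picture the evaluation of such a conjunction is as a join over shared variables. The Python sketch below is an assumption about how a processor might do this, not a prescribed algorithm. To sidestep argument order, each predicate is written as an explicit (subject, predicate, object) pattern, and terms starting with "?" are treated as variables (a convention of this sketch only).

```python
UNIVERSE = [
    ("u:uogbuji", "s:last-login", "2001-03-04"),
    ("u:molson", "s:last-login", "2001-01-01"),
    ("http://home/molson/happy-millenium.txt", "s:file-name", "happy-millenium.txt"),
    ("http://home/molson/happy-millenium.txt", "s:owner", "u:molson"),
    ("http://home/uogbuji/memo.doc", "s:owner", "u:uogbuji"),
    ("http://home/uogbuji/memo.doc", "s:name", "memo.doc"),
]

def is_var(term):
    """Terms starting with '?' are variables in this sketch."""
    return term.startswith("?")

def match_pattern(universe, pattern, binding):
    """Yield each extension of `binding` under which `pattern` matches a statement."""
    for triple in universe:
        candidate = dict(binding)
        for term, value in zip(pattern, triple):
            if is_var(term):
                # Bind the variable, or reject if it is already bound differently.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield candidate

def conjunction(universe, patterns):
    """Map each variable to the values that satisfy all patterns together."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match_pattern(universe, pattern, b)]
    result = {}
    for b in bindings:
        for name, value in b.items():
            values = result.setdefault(name.lstrip("?"), [])
            if value not in values:
                values.append(value)
    return result

mapping = conjunction(UNIVERSE, [
    ("?X", "s:last-login", "2001-01-01"),
    ("?Y", "s:owner", "?X"),
    ("?Y", "s:file-name", "happy-millenium.txt"),
])
print(mapping["X"])
```

The sketch also binds Y to the matching file; dropping repeated values from each list is a choice made here, not something the text above settles.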

      

This technique can also be used for predicates based on resource type, as in listing 7.


<ril:query>
  <ril:and>
    <rdf:type>
      <ril:variable name="X"/>
      <ril:value>urn:sysadmin.net:schema#File</ril:value>
    </rdf:type>
    <s:name>
      <ril:variable name="X"/>
      <ril:value>happy-millenium.txt</ril:value>
    </s:name>
  </ril:and>
</ril:query>

        

The query in listing 7 made on the following universe:


{u:uogbuji, rdf:type, urn:sysadmin.net:schema#User}
{u:uogbuji, s:name, "Uche Ogbuji"}
{u:uogbuji, s:last-login, "2001-01-01"}
{u:molson, rdf:type, urn:sysadmin.net:schema#User}
{u:molson, s:last-login, "2001-03-04"}
{u:molson, s:name, "Mike Olson"}
{http://home/molson/happy-millenium.txt, rdf:type, urn:sysadmin.net:schema#File}
{http://home/molson/happy-millenium.txt, s:name, "happy-millenium.txt"}
{http://home/molson/happy-millenium.txt, s:owner, u:molson}
{http://home/uogbuji/memo.doc, rdf:type, urn:sysadmin.net:schema#File}
{http://home/uogbuji/memo.doc, s:name, "memo.doc"}
{http://home/uogbuji/memo.doc, s:owner, u:uogbuji}

      

would result in the following variable mapping:


X = [ http://home/molson/happy-millenium.txt ]

      

RIL also provides the ril:rdf-type instruction for convenience in this case. Listing 8 is equivalent to listing 7.


<ril:query>
  <ril:and>
    <ril:rdf-type id="urn:sysadmin.net:schema#File">
      <ril:variable name="X"/>
    </ril:rdf-type>
    <s:name>
      <ril:variable name="X"/>
      <ril:value>happy-millenium.txt</ril:value>
    </s:name>
  </ril:and>
</ril:query>

        

Issue: is this abbreviation necessary?

A disjunction of RIL predicates can be expressed using the ril:or instruction. For instance, listing 9 could be interpreted as the prose "Find every resource whose last login was 2001-01-01 or whose name is 'happy-millenium.txt'".


<ril:query>
  <ril:or>
    <s:last-login>
      <ril:value>2001-01-01</ril:value>
      <ril:variable name="X"/>
    </s:last-login>
    <s:name>
      <ril:value>happy-millenium.txt</ril:value>
      <ril:variable name="X"/>
    </s:name>
  </ril:or>
</ril:query>

        

A disjunction over predicates with variable mappings results in each variable being set to a list created by aggregating all the values assigned to that variable by any of the predicates.

The query in listing 9 made on the above universe would result in the following variable mapping:


X = [u:uogbuji, http://home/molson/happy-millenium.txt ]
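The aggregation performed by a disjunction can be sketched in Python as below. The helper names are hypothetical, and removing repeated values while keeping first-seen order is an assumption of the sketch rather than stated behavior.

```python
UNIVERSE = [
    ("u:uogbuji", "rdf:type", "urn:sysadmin.net:schema#User"),
    ("u:uogbuji", "s:name", "Uche Ogbuji"),
    ("u:uogbuji", "s:last-login", "2001-01-01"),
    ("u:molson", "rdf:type", "urn:sysadmin.net:schema#User"),
    ("u:molson", "s:last-login", "2001-03-04"),
    ("u:molson", "s:name", "Mike Olson"),
    ("http://home/molson/happy-millenium.txt", "rdf:type", "urn:sysadmin.net:schema#File"),
    ("http://home/molson/happy-millenium.txt", "s:name", "happy-millenium.txt"),
    ("http://home/molson/happy-millenium.txt", "s:owner", "u:molson"),
    ("http://home/uogbuji/memo.doc", "rdf:type", "urn:sysadmin.net:schema#File"),
    ("http://home/uogbuji/memo.doc", "s:name", "memo.doc"),
    ("http://home/uogbuji/memo.doc", "s:owner", "u:uogbuji"),
]

def subjects_with(universe, predicate, obj):
    """Subjects of statements with the given predicate and object."""
    return [s for (s, p, o) in universe if p == predicate and o == obj]

def disjunction(*value_lists):
    """Aggregate the values bound by each branch, keeping first-seen order."""
    merged = []
    for values in value_lists:
        for value in values:
            if value not in merged:
                merged.append(value)
    return merged

x = disjunction(
    subjects_with(UNIVERSE, "s:last-login", "2001-01-01"),
    subjects_with(UNIVERSE, "s:name", "happy-millenium.txt"),
)
print(x)
```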

      

A negation of a RIL predicate can be expressed using the ril:not instruction. For instance, listing 10 could be interpreted as the prose "Determine whether alkwarizmi is not the creator of Principia".


<ril:query>
  <ril:not>
    <dc:creator>
      <ril:value>principia</ril:value>
      <ril:value>alkwarizmi</ril:value>
    </dc:creator>
  </ril:not>
</ril:query>

      

A negation over predicates with variable mappings results in a list, for each variable, of the possible values in the universe that did not match any child predicate of the negation, or that were not selected by any child conjunction or disjunction.
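A rough Python sketch of this reading of negation follows. It assumes that "the possible values in the universe" means every distinct subject and object, which is only one interpretation; the function names are hypothetical.

```python
UNIVERSE = [
    ("b:principia", "dc:creator", "h:newton"),
    ("b:aljabr", "dc:creator", "h:kwarizmi"),
    ("b:principia", "dc:date", ""),
]

def all_values(universe):
    """Every distinct subject and object in the universe, in first-seen order."""
    seen = []
    for s, _p, o in universe:
        for value in (s, o):
            if value not in seen:
                seen.append(value)
    return seen

def negation(universe, matched):
    """Values in the universe that the negated predicate did not select."""
    return [value for value in all_values(universe) if value not in matched]

# Values for X that do NOT satisfy "X has dc:creator h:newton".
created_by_newton = [
    s for (s, p, o) in UNIVERSE if p == "dc:creator" and o == "h:newton"
]
print(negation(UNIVERSE, created_by_newton))
```

Because the result is computed against the whole universe, its size grows with the model rather than with the query, which illustrates the performance concern.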

Issue: As it is, this could be a real performance pig. Do we allow variables in descendants of ril:not? Is this the right behavior in case we do? At any rate, this section needs better explanation and examples.

Issue: Should we support the remaining propositional logic operators: implication and equivalence at this low level? Both can be expressed using rules (see below), and allowing them at the atomic level could be a performance problem, but perhaps they're necessary.

Quantifiers

RIL supports first-order logic quantifiers: exists and for-all. These can be simulated using other RIL features such as the coercion of lists to boolean according to their emptiness, but the following are explicit quantifier instructions.

Issue: Actually, explicit quantifiers are probably no use if we simply assume that all variables are universally quantified over query predicates.

ril:exists returns true if there exist values that, when bound to its parameters, satisfy its child predicates.


   <ril:exists>
     <!-- Content: (ril:param-name+, [proposition]) -->
   </ril:exists>
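The intent of the quantifiers can be sketched in Python as below. The explicit domain argument, the representation of a proposition as a callable over parameter bindings, and the for_all counterpart are all assumptions made for illustration.

```python
from itertools import product

TRIPLES = {
    ("b:principia", "dc:creator", "h:newton"),
    ("b:aljabr", "dc:creator", "h:kwarizmi"),
}
DOMAIN = ["b:principia", "b:aljabr", "h:newton", "h:kwarizmi"]

def exists(param_names, proposition, domain):
    """True if some assignment of domain values to the parameters satisfies
    `proposition`, a callable that takes a dict of parameter bindings."""
    return any(
        proposition(dict(zip(param_names, values)))
        for values in product(domain, repeat=len(param_names))
    )

def for_all(param_names, proposition, domain):
    """True if every assignment of domain values satisfies `proposition`."""
    return all(
        proposition(dict(zip(param_names, values)))
        for values in product(domain, repeat=len(param_names))
    )

def created_by_newton(bindings):
    return (bindings["X"], "dc:creator", "h:newton") in TRIPLES

print(exists(["X"], created_by_newton, DOMAIN))
print(for_all(["X"], created_by_newton, DOMAIN))
```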

      

Utility operations

RIL provides a set of utility operations that can operate on partial query results for refinement.

ril:match returns a list of RDF statements that match a pattern of subject, predicate and object.


   <ril:match>
     <!-- Content: (ril:subject? | ril:predicate? | ril:object?) -->
   </ril:match>

    

The returned statements will all have the given subject, predicate and object. If any of the ril:subject, ril:predicate and ril:object child elements is omitted, it is treated as a wildcard.
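The wildcard behavior can be sketched in Python as follows; None stands in for an omitted child element, and the function name and keyword arguments are hypothetical.

```python
UNIVERSE = [
    ("b:principia", "dc:creator", "h:newton"),
    ("b:aljabr", "dc:creator", "h:kwarizmi"),
    ("b:principia", "dc:date", ""),
]

def match(universe, subject=None, predicate=None, obj=None):
    """Statements agreeing with every given component; None is a wildcard."""
    return [
        (s, p, o) for (s, p, o) in universe
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]

print(match(UNIVERSE, predicate="dc:creator"))
print(match(UNIVERSE, subject="b:principia"))
```

Calling match with no components given returns every statement in the universe.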

Issue: should we support regular expression matching of subject, predicate or object?

ril:path returns an ordered list of statements that form a complete path from a given subject to a given object.


   <ril:path>
     <!-- Content: (ril:start, ril:end) -->
   </ril:path>
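A path search of this kind can be sketched in Python with a breadth-first traversal. The text above does not say which path is returned when several exist; returning one with the fewest statements is an assumption of the sketch, and the ex: resources are hypothetical sample data.

```python
from collections import deque

UNIVERSE = [
    ("ex:joe", "ex:parent", "ex:ann"),
    ("ex:ann", "ex:parent", "ex:sue"),
    ("ex:joe", "ex:friend", "ex:bob"),
]

def path(universe, start, end):
    """Return an ordered list of statements linking `start` to `end`,
    or an empty list if no such chain of statements exists."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, trail = queue.popleft()
        if node == end and trail:
            return trail
        for s, p, o in universe:
            # Follow each statement from its subject to its object.
            if s == node and o not in visited:
                visited.add(o)
                queue.append((o, trail + [(s, p, o)]))
    return []

print(path(UNIVERSE, "ex:joe", "ex:sue"))
```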

    

Issue: there was originally a ril:reverse-path, but this isn't needed with a reverse-list primitive operation, right?

These instructions operate on the list of statements generated as a result of their child instructions, generating a list of strings as result. They return a list constructed by iterating over each of the statements and selecting the target attribute as a separate string.

ril:sort sorts a list according to the given criteria.

ril:reverse reverses the order of a list.

ril:slice creates a sub-list.

ril:position returns the position of an item in a list.

ril:intersection returns the set intersection obtained by treating two lists as sets (i.e. ignoring order and duplicates).

ril:union returns the set union obtained by treating two lists as sets (i.e. ignoring order and duplicates).

ril:filter creates a sub-list obtained by evaluating an expression against each item.
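The intended effect of these list operations can be sketched in Python as below. The signatures are hypothetical, since no parameters are defined above; in particular, zero-based positions and the use of -1 for a missing item are assumptions.

```python
def ril_sort(items, key=None):
    """Return a new list sorted by the given criterion."""
    return sorted(items, key=key)

def ril_reverse(items):
    """Return a new list in reverse order."""
    return list(reversed(items))

def ril_slice(items, start, end=None):
    """Return the sub-list from `start` up to, but not including, `end`."""
    return items[start:end]

def ril_position(items, item):
    """Zero-based index of the first occurrence, or -1 if absent."""
    return items.index(item) if item in items else -1

def ril_intersection(a, b):
    """Set intersection, ignoring order and duplicates."""
    return set(a) & set(b)

def ril_union(a, b):
    """Set union, ignoring order and duplicates."""
    return set(a) | set(b)

def ril_filter(items, expression):
    """Keep the items for which `expression` evaluates to true."""
    return [item for item in items if expression(item)]

names = ["memo.doc", "happy-millenium.txt", "memo.doc"]
print(ril_sort(names))
print(ril_union(names, ["notes.txt"]))
print(ril_filter(names, lambda name: name.endswith(".txt")))
```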

Actions

RIL provides a set of instructions to perform actions that may operate on query results.

ril:assert adds statements to either the workspace or an external RDF model.


   <ril:assert
     (on=[space-separated list of model IDs])?
   >
     <!-- Content: ([expression]+) -->
   </ril:assert>

    

The results of the child RIL expressions are interpreted as lists of statements, and are concatenated. The on attribute is optional. If it is omitted, the resultant statements are added to the RIL workspace. If not, attempts are made to add the statements to each model identified by the given list of ids in turn. Note that there is no guarantee that this addition will be successful: for instance, a RIL or RDF processor might employ access control, and the RIL user might not have permissions to modify the given models. If addition fails, error handling is processor-specific.
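The behavior of ril:assert can be sketched in Python as follows. The Model class, the PermissionDenied exception and the choice to report failures by returning the affected model IDs are all hypothetical, since error handling is left to the processor.

```python
class PermissionDenied(Exception):
    """Raised by a model that refuses modification (hypothetical)."""

class Model:
    """Stand-in for an RDF model or the RIL workspace."""
    def __init__(self, writable=True):
        self.statements = []
        self.writable = writable

    def add(self, statements):
        if not self.writable:
            raise PermissionDenied("model is read-only")
        self.statements.extend(statements)

def ril_assert(expression_results, workspace, models=None, on=None):
    """Concatenate the child results and add them to the workspace, or to
    each model named in `on`. Returns the IDs of models where adding failed."""
    statements = [st for result in expression_results for st in result]
    if not on:
        workspace.add(statements)
        return []
    models = models or {}
    failed = []
    for model_id in on.split():
        try:
            models[model_id].add(statements)
        except (KeyError, PermissionDenied):
            failed.append(model_id)
    return failed

t1 = ("b:principia", "dc:creator", "h:newton")
t2 = ("b:aljabr", "dc:creator", "h:kwarizmi")
workspace = Model()
models = {"a": Model(), "b": Model(writable=False)}

print(ril_assert([[t1], [t2]], workspace))
print(ril_assert([[t1]], workspace, models, on="a b c"))
```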

ril:message presents a message to the RIL user. The form of this message is processor-defined.


   <ril:message>
     <!-- Content: (mixed content with optional ril:var-ref elements) -->
   </ril:message>

    

Expressing rules

A rule allows the user to prescribe certain actions according to the state of the target RDF model and RIL workspace.

A rule consists of a premise and a conclusion. The premise executes a series of queries, and a series of variables are assigned values according to the processing of the given queries. The conclusion executes a series of actions which usually operate on the variables set up by the premise.

RIL does not provide any direct instructions for defining rules, but they can be defined by a script with a query followed by actions. For example:


<ril:rule id='sweat-check'>
  <ril:query>
    <ril:or>
      <ril:and>
        <ft:running>
          <ril:variable name='X'/>
        </ft:running>
        <ft:out-of-shape>
          <ril:variable name='X'/>
        </ft:out-of-shape>
      </ril:and>
      <ril:and>
        <ft:wet>
          <ril:variable name='X'/>
        </ft:wet>
        <ril:not>
          <ft:swimming>
            <ril:variable name='X'/>
          </ft:swimming>
        </ril:not>
      </ril:and>
    </ril:or>
  </ril:query>
  <ril:assert>
    <ft:sweating>
      <ril:variable name='X'/>
    </ft:sweating>
  </ril:assert>
</ril:rule>
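The effect of this rule can be sketched in Python as below. Facts are modelled as (predicate, argument) pairs, because the subject of an asserted one-argument predicate is not specified above; the names in the sample data are hypothetical, and the negation is evaluated relative to the enclosing conjunction, which is an assumption.

```python
def arguments(facts, predicate):
    """Arguments for which the one-argument predicate holds."""
    return {arg for (pred, arg) in facts if pred == predicate}

def sweat_check(facts):
    """Apply the sweat-check rule once; return the newly asserted facts."""
    running = arguments(facts, "ft:running")
    out_of_shape = arguments(facts, "ft:out-of-shape")
    wet = arguments(facts, "ft:wet")
    swimming = arguments(facts, "ft:swimming")
    # Premise: (running AND out-of-shape) OR (wet AND NOT swimming).
    matched = (running & out_of_shape) | (wet - swimming)
    # Conclusion: assert ft:sweating for every matched value of X.
    new_facts = {("ft:sweating", x) for x in matched} - facts
    facts.update(new_facts)
    return new_facts

facts = {
    ("ft:running", "ann"), ("ft:out-of-shape", "ann"),
    ("ft:wet", "bob"),
    ("ft:wet", "cy"), ("ft:swimming", "cy"),
    ("ft:running", "dee"),
}
print(sorted(sweat_check(facts)))
```

Running the rule a second time asserts nothing new, which is the stopping condition a forward-chaining loop would look for.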

        

Appendix A: Use-cases

The following are use-cases that provide examples of the sort of processing RIL is designed to handle. These use-cases can be used as examples and test cases for RIL implementors.

Match triples on patterns

Get all resources of type document

Get all documents with dc:title of "RDF 101"

Simple relationship traversal

Get all people with a manager named "Bob"

Complex relationship traversal

All books written by an author from a town in Wisconsin.

All users or developers with a manager named "Bob"

Path traversal

The name of the second oldest ancestor of Joe

All companies that provide a weather service that is trusted by the UN business registry.

Complex path traversal

TSP

Forward chaining inference

If there exists a person with any three of the following: an e-mail address, a pager, an instant messaging account, a PDA, a cell phone, a laptop, a neural implant, then he is a geek.

If there is no pending litigation and the stock is rising and the stock has risen more than 30% in the past week, then buy.

If a person is running and out of shape or wet and not swimming, then he or she is sweating.

If an employee's attendance is in the bottom 10% of the company, then fire him.

References

[EV200103]
Eric van der Vlist proposed a scheme for namespace version management in this XML-DEV posting.
[RFC2119]
S. Bradner, Key words for use in RFCs to Indicate Requirement Levels, RFC 2119, March 1997.
Rdf complete
The set of statements from a subject, predicate and object query on the Rdf Repository. Any or all of the three components can be wildcards, in which case the result set will return all statements that match