In this part of the assignment, you will create some SPARQL queries over your FOAF files and related data.
A good introduction to SPARQL (again by Leigh Dodds) is this tutorial. You can look at the W3C Recommendation for SPARQL. You might also want to consult this handy cheatsheet for SPARQL syntax.
Finally, here is more tutorial material on SPARQL:
There are a number of SPARQL implementations around. So far, the best one I have found is ARQ, which is based on the Jena toolkit. ARQ is very easy to install: see ARQ download and read the accompanying documentation, as well as the ARQ Tutorial, which is also pretty good and links to a few other resources. In the rest of this document, I will assume that you are going to use ARQ.
Unfortunately, the Python-based rdflib implementation of SPARQL is incomplete and I don’t recommend using it for this exercise.
If you don’t want to download any files (e.g., ARQ), you could instead try the web-based SPARQLing query interface (screenshots of SPARQLing query and SPARQLing results). Be warned that the XSLT transform file suggested on the SPARQLing web page is no longer accessible; use http://www.w3.org/TR/rdf-sparql-XMLres/result-to-html.xsl instead.
One of the easiest ways to run SPARQL queries is using ARQ, which is a wrapper round Jena. ARQ is available on DICE at:
/usr/share/java/jena/bin/arq
For more information about ARQ, including download instructions and a nice tutorial, see http://jena.sourceforge.net/ARQ.
Here are the installation instructions if you download your own copy.
First, you need to set the environment variable JENAROOT. If you are in the root of the unzipped distribution, you can do this:
export JENAROOT=$PWD
On DICE, you’d do:
export JENAROOT=/usr/share/java/jena
Second, if it is your own installation, ensure all scripts are executable:
chmod u+x $JENAROOT/bin/*
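The setup steps above can be collected into a short shell sketch. It assumes the DICE location as a default when JENAROOT is not already set; adding $JENAROOT/bin to PATH is a convenience of this sketch, not something the ARQ instructions require.

```shell
# Use an existing JENAROOT if there is one, otherwise assume the DICE location.
JENAROOT="${JENAROOT:-/usr/share/java/jena}"
export JENAROOT

if [ -d "$JENAROOT/bin" ]; then
    # Only matters for your own installation; ignored if you lack permission.
    chmod u+x "$JENAROOT"/bin/* 2>/dev/null || true
    # Convenience (an assumption of this sketch): call the scripts by name.
    PATH="$JENAROOT/bin:$PATH"
    export PATH
    echo "ARQ scripts available from $JENAROOT/bin"
else
    echo "No bin directory under $JENAROOT; check where ARQ was unpacked" >&2
fi
```

For your own copy, run export JENAROOT=$PWD in the root of the unzipped distribution before running the sketch.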
Here’s an example of a simple SPARQL query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name1 ?name2
FROM <http://homepages.inf.ed.ac.uk/ewan/foaf.rdf>
WHERE {
    ?person1 foaf:knows ?person2 .
    ?person1 foaf:name ?name1 .
    ?person2 foaf:name ?name2 .
}
Most of this should be familiar to you, but the FROM clause may be new. This says that the query should be run against the RDF data to be found at http://homepages.inf.ed.ac.uk/ewan/foaf.rdf. Note that the URI has to be addressable via HTTP when the query is executed.
Let’s assume that we have installed ARQ in a directory ARQ-2.6.2 and saved the query above in a file called example-01.rq. To call the ARQ SPARQL query engine, we use the following command on the command line:
% ARQ-2.6.2/bin/sparql --query example-01.rq
Given my FOAF file, ARQ-2.6.2/bin/sparql will print the following output to the terminal:
---------------------------------
| name1        | name2          |
=================================
| "Ewan Klein" | "Harry Halpin" |
---------------------------------
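As a sketch of the whole round trip, the following shell fragment saves the first query to example-01.rq and then runs it. The install path ARQ-2.6.2 is the one assumed in the text; the SPARQL variable and the fallback message are additions of this sketch, so that it still does something sensible when ARQ lives elsewhere.

```shell
# Save the first example query to a file.
cat > example-01.rq <<'EOF'
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name1 ?name2
FROM <http://homepages.inf.ed.ac.uk/ewan/foaf.rdf>
WHERE {
    ?person1 foaf:knows ?person2 .
    ?person1 foaf:name ?name1 .
    ?person2 foaf:name ?name2 .
}
EOF

# Assumed install location; override by setting SPARQL beforehand.
SPARQL="${SPARQL:-ARQ-2.6.2/bin/sparql}"
if [ -x "$SPARQL" ]; then
    # The FROM clause is fetched over HTTP, so this step needs network access.
    "$SPARQL" --query example-01.rq
else
    echo "sparql not found at $SPARQL; query saved to example-01.rq"
fi
```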
We are allowed to have more than one FROM clause in a query, and the resulting graphs are merged. This is shown in the next example, where we query both my FOAF file and Harry Halpin’s.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name1 ?name2
FROM <http://homepages.inf.ed.ac.uk/ewan/foaf.rdf>
FROM <http://www.ibiblio.org/hhalpin/foaf.rdf>
WHERE {
    ?person1 foaf:name ?name1 ;
             foaf:knows [ foaf:name ?name2 ] .
}
As you can observe here, SPARQL triple patterns allow the same abbreviatory syntax as we have already seen for N3, though strictly speaking SPARQL uses a subset of N3 called Turtle.
Running the query gives the following result set:
------------------------------------
| name1          | name2           |
====================================
| "Harry Halpin" | "Ivan Herman"   |
| "Harry Halpin" | "Dan Connolly"  |
| "Harry Halpin" | "Ian Davis"     |
| "Harry Halpin" | "Paolo Bouquet" |
| "Ewan Klein"   | "Harry Halpin"  |
------------------------------------
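To see what the abbreviated syntax in the two-file query stands for, it can help to write the same pattern out in full. The sketch below saves an unabbreviated version to a file; the file name example-02-long.rq and the blank node label _:friend are made up for illustration. In a query pattern, a labelled blank node behaves like a variable that cannot be selected, which is what the [ ... ] form introduces implicitly.

```shell
# Long-hand equivalent of:
#   ?person1 foaf:name ?name1 ; foaf:knows [ foaf:name ?name2 ] .
cat > example-02-long.rq <<'EOF'
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name1 ?name2
FROM <http://homepages.inf.ed.ac.uk/ewan/foaf.rdf>
FROM <http://www.ibiblio.org/hhalpin/foaf.rdf>
WHERE {
    ?person1 foaf:name ?name1 .
    ?person1 foaf:knows _:friend .
    _:friend foaf:name ?name2 .
}
EOF

# Assumed install location, as in the text; override by setting SPARQL.
SPARQL="${SPARQL:-ARQ-2.6.2/bin/sparql}"
if [ -x "$SPARQL" ]; then
    "$SPARQL" --query example-02-long.rq
else
    echo "sparql not found at $SPARQL; query saved to example-02-long.rq"
fi
```

Both versions should produce the same result set, since they describe the same three triple patterns.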
The next query is run against my facts-plus-ontology file for bread. In this case, I want to recover instances that either have rdf:type masws:Bread or else have rdf:type C, where C is a subclass of masws:Bread. In order to match these two alternatives, we use the UNION keyword, as shown here:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX masws: <http://www.inf.ed.ac.uk/teaching/courses/masws/ontology#>
SELECT ?bread ?class ?tags
FROM <http://homepages.inf.ed.ac.uk/ewan/masws/rdf/combined.rdf>
WHERE
{
    ?bread dc:subject ?tags .
    {
        ?bread a ?class .
        ?bread a masws:Bread .
    }
    UNION
    {
        ?bread a ?class .
        ?class rdfs:subClassOf masws:Bread .
    }
}
Note that the first occurrence of ?bread a ?class. does not act to restrict the selected values, but just allows us to display a ?class value for every combination of ?bread with ?tags. This is shown in the results:
---------------------------------------------------------------------------------------------
| bread                 | class           | tags                                            |
=============================================================================================
| masws:multigrainbread | masws:Bread     | "bread multigrain spelt barley kamut rye wheat" |
| masws:oatmealbread    | masws:Bread     | "bread wholegrain wheat"                        |
| masws:ciabatta        | masws:Ciabatta  | "bread ciabatta sourdough"                      |
| masws:sourdoughbread  | masws:Sourdough | "bread sourdough rye"                           |
---------------------------------------------------------------------------------------------
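If you want to experiment with UNION without depending on a remote file, one option is to run a query against a small local file. The sketch below is only an illustration: the data, the ex: namespace and the file names are all invented, and it assumes your version of ARQ accepts the --data option for naming a local data file (in which case the query needs no FROM clause).

```shell
# Tiny invented dataset: one direct instance of ex:Bread, one instance of a subclass.
cat > bread-sample.ttl <<'EOF'
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/bakery#> .

ex:Sourdough rdfs:subClassOf ex:Bread .
ex:plainloaf a ex:Bread .
ex:ryeloaf   a ex:Sourdough .
EOF

# Same UNION shape as the bread query in the text, without the dc:subject pattern.
cat > union-sample.rq <<'EOF'
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex:   <http://example.org/bakery#>
SELECT ?bread ?class
WHERE {
    { ?bread a ?class . ?bread a ex:Bread . }
    UNION
    { ?bread a ?class . ?class rdfs:subClassOf ex:Bread . }
}
EOF

# Assumed install location, as in the text; override by setting SPARQL.
SPARQL="${SPARQL:-ARQ-2.6.2/bin/sparql}"
if [ -x "$SPARQL" ]; then
    "$SPARQL" --data=bread-sample.ttl --query=union-sample.rq
else
    echo "sparql not found at $SPARQL; files written for later use"
fi
```

Without any inference switched on, ex:plainloaf is matched only by the first branch and ex:ryeloaf only by the second, which is why both branches are needed.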
So far, we have just focussed on SELECT, which returns a table of results. However, we can also use SPARQL’s CONSTRUCT keyword to build a new RDF graph for us, based on a graph template which may contain variables. This is illustrated in the next example.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dlcs: <http://del.icio.us/>
PREFIX masws: <http://www.inf.ed.ac.uk/teaching/courses/masws/ontology#>
PREFIX : <http://homepages.inf.ed.ac.uk/ewan/foaf.rdf#>
CONSTRUCT { ?person foaf:interest ?href .
            ?person foaf:name ?name . }
FROM <http://homepages.inf.ed.ac.uk/ewan/foaf.rdf>
FROM <http://homepages.inf.ed.ac.uk/ewan/masws/rdf/combined.rdf>
WHERE {
    ?person foaf:interest ?topic .
    ?topic foaf:maker ?person .
    ?person foaf:name ?name .
    ?subj dlcs:href ?href .
    {
        ?subj a masws:Bread .
        ?subj dlcs:href ?href .
    }
    UNION
    {
        ?subj a ?class .
        ?class rdfs:subClassOf masws:Bread .
    }
}
Like the previous example, we look for resources which are instances of masws:Bread or of a subclass of masws:Bread. However, there are a number of other conditions. In particular, the two patterns ?person foaf:interest ?topic and ?topic foaf:maker ?person will both be satisfied only if there is a person ?person whose foaf:interest is some resource which has a representation of which that same ?person is the foaf:maker.
Here is the graph, in Turtle syntax, that is returned by the query:
@prefix masws: <http://www.inf.ed.ac.uk/teaching/courses/masws/ontology#> .
@prefix dlcs: <http://del.icio.us/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://homepages.inf.ed.ac.uk/ewan/foaf.rdf#> .
:ehk
    foaf:interest <http://www.armchair.com/recipe/ryebread.html> ;
    foaf:interest <http://www.all-creatures.org/recipes/bread-mg-sbkrw.html> ;
    foaf:interest <http://www.recipezaar.com/107986> ;
    foaf:interest <http://www.sourdoughhome.com/ciabatta.html> ;
    foaf:name "Ewan Klein" .
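A much smaller CONSTRUCT query may make the idea of a graph template easier to see. The following sketch uses invented data (the ex: namespace, the two people and all file names are hypothetical) and again assumes that ARQ accepts --data for a local file. The template keeps just the foaf:name triple of anyone who foaf:knows somebody, and the resulting graph is redirected into a file.

```shell
# Invented FOAF-style data: Alice knows Bob.
cat > people-sample.ttl <<'EOF'
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <http://example.org/people#> .

ex:alice foaf:name "Alice" ;
         foaf:knows ex:bob .
ex:bob   foaf:name "Bob" .
EOF

# The CONSTRUCT template is instantiated once per solution of the WHERE pattern.
cat > construct-sample.rq <<'EOF'
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
CONSTRUCT { ?person foaf:name ?name . }
WHERE {
    ?person foaf:knows ?other ;
            foaf:name ?name .
}
EOF

# Assumed install location, as in the text; override by setting SPARQL.
SPARQL="${SPARQL:-ARQ-2.6.2/bin/sparql}"
if [ -x "$SPARQL" ]; then
    # The .ttl extension assumes Turtle output, as in the example above.
    "$SPARQL" --data=people-sample.ttl --query=construct-sample.rq > names-sample.ttl
else
    echo "sparql not found at $SPARQL; files written for later use"
fi
```

Only ex:alice satisfies the WHERE pattern here, so the constructed graph should contain her name triple and nothing about ex:bob.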
For this part of the assignment, I want you to do the following:
Hand-in: A hard-copy version of your SPARQL query, plus the trace of the results returned from running the query.
On completion, you should be able to: