Introduction to RDF Query with SPARQL
Part 1 - Introduction
Learning Goals
After this part of the tutorial you should
- know the requirements and use-cases of SPARQL
- understand the background to RDF query
- understand the differences between querying XML and RDF
- understand the Turtle/N3 RDF data format
- understand the tutorial example RDF data
- be able to use the online SPARQL tools
Overview
- What is SPARQL?
- Background of RDF Query Languages
- Requirements and Use Cases
- Querying XML and RDF
- Example RDF Data in Turtle/N3
- Walkthrough of online demonstration
What Is SPARQL?
- SPARQL Protocol and RDF Query Language
- A Query Language ...:
Find names and websites of contributors to PlanetRDF:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?website
FROM <http://planetrdf.com/bloggers.rdf>
WHERE { ?person foaf:weblog ?website ;
                foaf:name ?name .
        ?website a foaf:Document
      }
- ... and a Protocol.
http://.../qps?
query-lang=http://www.w3.org/TR/rdf-sparql-query/
&graph-id=http://planetrdf.com/bloggers.rdf
&query=PREFIX foaf: <http://xmlns.com/foaf/0.1/...
- Run this query (ex1.rq)
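As a rough illustration of the protocol side, the sketch below packs the query into a request URL using the three parameter names shown on this slide (query-lang, graph-id, query). The endpoint http://example.org/qps is a hypothetical stand-in, since the slide elides the real host, and the function name build_request_url is invented for this example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint: the slide elides the real host ("http://.../qps?").
ENDPOINT = "http://example.org/qps"

QUERY = """PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?website
FROM <http://planetrdf.com/bloggers.rdf>
WHERE { ?person foaf:weblog ?website ;
                foaf:name ?name .
        ?website a foaf:Document
      }"""


def build_request_url(endpoint, query, graph_id):
    """Return a GET URL carrying the query as percent-encoded parameters."""
    params = {
        "query-lang": "http://www.w3.org/TR/rdf-sparql-query/",
        "graph-id": graph_id,
        "query": query,
    }
    return endpoint + "?" + urlencode(params)


url = build_request_url(ENDPOINT, QUERY, "http://planetrdf.com/bloggers.rdf")
print(url)

# Decoding the query string gives back the original query text unchanged.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0] == QUERY)
```

The point of the sketch is only that the query travels as an ordinary, percent-encoded URL parameter; the parameter names may differ in later versions of the protocol.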
SPARQL
- SPARQL is an RDF query language over a protocol
Background of RDF Query Languages
- RDF: W3C RECommendation February 1999
- Revised RECommendations February 2004
- Querying at W3C has been discussed since the QL'98 workshop in December 1998
- Querying XML (XQuery) has been developed from September 1999
- Implementors have led RDF query language development
Why do you need a query language (for RDF)?
- Support all of the RDF model
- Knows about RDF graphs and RDF triples
- Can handle RDF's semi-structured data
- Supports operations on RDF graphs
- To enable higher-level application development
- To enable cross-language, cross-platform development
RDF Query Language Design
The main RDF query language styles:
- SQL-like: RDQL/Squish, SeRQL, RDFDB QL, RQL, ...
- XPath-like: Versa, RDFPath
- Rules-like: N3QL, Triple, DQL, OWL-QL, ...
- Language-like: Algae2, Fabl, Abeline
- Using XML: XSLT, XPath, XQuery
Most popular by far are the SQL-like languages.
RDQL/Squish is the most popular SQL-like language with multiple
independent implementations.
Querying XML compared to Querying RDF
Concept | XML | RDF |
Model | Document or Tree or Infoset (PSVI) | Set of Triples = RDF Graph |
Atomic Units | Elements, Attributes, Text | Triples, URIs, Blank Nodes, Text |
Identifiers | Element/Attribute names, QNames, IDs, XPointers / XPaths | URIs |
Described by | DTDs, W3C XML Schema, Relax NG, ... | RDF Schema |
Comparing XML and RDF query
- Different models
- XML query cannot handle all RDF terms
XSLT / XQuery and RDF
- Not based on RDF terms - graphs, URIs and literals
- XQuery is based on sequences, cannot represent a set of triples
- Syntax focused - so more useful at SPARQL output (later)
- XQuery is still being developed (1999-present)
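To make the "sequences versus a set of triples" point concrete, here is a minimal sketch that models an RDF graph as a Python set of (subject, predicate, object) tuples. The data is illustrative, and comparing blank node labels as plain strings is a simplification: real graph comparison has to allow blank nodes to be relabelled.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Each triple is a (subject, predicate, object) tuple; the graph is a set of them.
graph_a = {
    ("_:a", FOAF + "name", '"Alice"'),
    ("_:a", FOAF + "knows", "_:b"),
    ("_:b", FOAF + "name", '"Bob"'),
}

# The same statements as a sequence, in another order, with one repeated.
statements = [
    ("_:b", FOAF + "name", '"Bob"'),
    ("_:a", FOAF + "knows", "_:b"),
    ("_:a", FOAF + "name", '"Alice"'),
    ("_:a", FOAF + "knows", "_:b"),
]
graph_b = set(statements)

print(graph_a == graph_b)              # True: order and repetition do not matter
print(len(statements), len(graph_b))   # 4 3
```

A sequence keeps both the order and the duplicate, which is information an RDF graph does not have.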
Why a standard RDF query language now?
given that RDQL (and others) are widely implemented
- Querying RDF has been discussed since 1998
- Implementations of RDQL are similar but have some divergence and extensions
- Recent research with contexts and Named Graphs is not in all current QLs
- XML query solutions exist but do not entirely suit RDF.
So in March 2004 the W3C formed the
RDF Data Access Working Group
Use Cases for RDF Query
Some of the use cases DAWG recorded were:
- Finding values for partially known graph structures
- Getting information about an identifiable object with unknown properties
- A human friendly syntax for queries for application developers
- Running automated regular queries against RDF graphs
- Querying aggregated RDF graphs
- Running queries constrained with datatype expressions
- Querying a remote RDF server and getting streaming results back
- Allowing alternate solutions to match in queries
- Using local extension functions in a query
- Using an RDF query service with Web Services
Requirements for RDF query
These led to requirements
- from existing languages:
- conjunction (AND) of triple patterns with variable bindings
and constraints
- from use cases:
- graphs, datatypes, extension functions, aggregation,
alternates, descriptions
- from the WG charter:
- a protocol.
no rules language, no cursors, no proofs and no updates
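The core requirement, a conjunction (AND) of triple patterns with variable bindings and constraints, can be sketched in a few lines. This is a naive illustration and not how SPARQL engines are implemented; the data, the helper names (is_var, match_pattern, solve) and the use of plain strings for terms are all assumptions made for the example.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Illustrative data only; these are not the real PlanetRDF triples.
GRAPH = {
    ("_:p1", FOAF + "name", "Alice"),
    ("_:p1", FOAF + "weblog", "http://alice.example.org/blog"),
    ("http://alice.example.org/blog", RDF_TYPE, FOAF + "Document"),
    ("_:p2", FOAF + "name", "Bob"),
}


def is_var(term):
    """Variables are written with a leading '?', as in the query syntax."""
    return term.startswith("?")


def match_pattern(pattern, triple, bindings):
    """Extend bindings so that pattern matches triple, or return None."""
    result = dict(bindings)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check it against its existing value.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result


def solve(patterns, graph, constraint=lambda b: True):
    """Return every binding set matching all patterns (AND) and the constraint."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in graph
            for extended in [match_pattern(pattern, triple, bindings)]
            if extended is not None
        ]
    return [b for b in solutions if constraint(b)]


patterns = [
    ("?person", FOAF + "weblog", "?website"),
    ("?person", FOAF + "name", "?name"),
    ("?website", RDF_TYPE, FOAF + "Document"),
]
for row in solve(patterns, GRAPH):
    print(row["?name"], row["?website"])
```

Because ?person appears in two patterns, both must bind it to the same node; the second person in the sample data has no weblog, so it yields no solution.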
SPARQL - Query Language
- An RDF data access query language
- Data access means reading information, not writing (updates)
- Outline query model is graph patterns
- In style of earlier RDQLish work (i.e. not rules, not path based)
SPARQL - Protocol
- Services running SPARQL queries over a set of graphs
- A transport protocol for invoking the service
- Based on ideas from earlier protocol work such as Joseki
- Describing the service with Web Service technologies
Turtle RDF syntax - URIs and Blank Nodes
- URIs
- Enclosed in <>: <URI>
Relative URI references are turned into URIs (RFC 3986)
- or @prefix prefix: <http://....> followed by prefix:name
in the style of XML QNames as a shorthand for the full URI
- Blank Nodes
- _:name
- or [] for a Blank Node used once
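The two ways of writing a URI can be sketched as a small expansion function: a prefixed name is concatenated onto its namespace, and a relative reference in <> is resolved against the document's base URI. The base URI http://example.org/data/aliceFoaf.ttl and the function name expand are hypothetical, chosen only for this example.

```python
from urllib.parse import urljoin

# Hypothetical base URI for the document being parsed.
BASE = "http://example.org/data/aliceFoaf.ttl"

PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(term, base=BASE, prefixes=PREFIXES):
    """Turn <relative-or-absolute> or prefix:name into a full URI string."""
    if term.startswith("<") and term.endswith(">"):
        # Relative references are resolved against the base URI.
        return urljoin(base, term[1:-1])
    prefix, _, local = term.partition(":")
    return prefixes[prefix] + local


print(expand("foaf:name"))       # http://xmlns.com/foaf/0.1/name
print(expand("<bobFoaf.rdf>"))   # http://example.org/data/bobFoaf.rdf
print(expand("<>"))              # http://example.org/data/aliceFoaf.ttl
```

The empty reference <> resolves to the base URI itself, which is how a document can make statements about itself.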
Turtle RDF syntax - RDF Literals
- Literals
"Literal"
"Literal"@language
"""Long literal with
newlines"""
- Datatyped Literals
"lexical form"^^datatype URI
e.g. "10"^^xsd:integer
10 | Decimal integer (xsd:integer) |
true | Boolean (xsd:boolean) |
2.5 | Double (xsd:double) |
Turtle RDF Syntax - Triples and Abbreviations
- Triples separated by .
:a :b :c . :d :e :f .
- Common triple predicate and subject:
:a :b :c, :d .
which is the same as :a :b :c . :a :b :d .
- Common triple subject:
:a :b :c; :d :e .
which is the same as: :a :b :c . :a :d :e .
- Blank node as a subject
:a :b [ :c :d ]
which is the same as: :a :b _:x . _:x :c :d .
for blank node _:x
- RDF Collections
:a :b ( :c :d :e :f )
which is short for many triples
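To show what "short for many triples" means for a collection, the sketch below expands :a :b ( :c :d :e :f ) into the rdf:first / rdf:rest / rdf:nil list structure. The blank node labels (_:list1, _:list2, ...) and the function name expand_collection are invented for the example; a real parser generates its own labels.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def expand_collection(subject, predicate, items):
    """Expand  subject predicate ( items... )  into explicit triples."""
    if not items:
        # The empty collection () is just rdf:nil.
        return [(subject, predicate, RDF + "nil")]
    # One blank node per list cell; the labels are hypothetical.
    nodes = ["_:list%d" % i for i in range(1, len(items) + 1)]
    triples = [(subject, predicate, nodes[0])]
    for i, (node, item) in enumerate(zip(nodes, items)):
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF + "nil"
        triples.append((node, RDF + "first", item))
        triples.append((node, RDF + "rest", rest))
    return triples


for triple in expand_collection(":a", ":b", [":c", ":d", ":e", ":f"]):
    print(triple)
```

Four items produce nine triples: one linking :a to the head of the list, then a first and a rest triple for each cell.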
Example Dataset of people and who they know
Example Dataset - detail
- Alice, Bob, Celine, Dan, Eve and Fred
- 6 people described in 6 documents containing RDF graphs
- foaf:knows relationship between people
- Identifying properties: foaf:mbox
- Descriptive properties: foaf:name
- Linking properties: rdfs:seeAlso
Example document from Dataset
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix bio: <http://purl.org/vocab/bio/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<> rdf:type foaf:PersonalProfileDocument ;
   dc:creator <mailto:alice@work.example.com> .
_:a a foaf:Person ;
    foaf:name "Alice" ;
    foaf:mbox <mailto:alice@work.example.com> ;
    foaf:mbox <mailto:alice@home.example.org> ;
    foaf:knows _:b ;
    bio:event [ a bio:Birth; bio:date "1974-02-28"^^xsd:date ] .
_:b a foaf:Person ;
    foaf:name "Bob" ;
    foaf:mbox <mailto:bob@work.example.com> ;
    bio:event [ a bio:Birth; bio:date "1973-01-02"^^xsd:date ] ;
    rdfs:seeAlso <bobFoaf.rdf> ;
    rdf:type foaf:Person .
<bobFoaf.rdf> rdf:type foaf:PersonalProfileDocument .
_:a foaf:knows _:c .
_:c foaf:mbox <mailto:eve@work.example.com> .
aliceFoaf.ttl
Note:
- Using multiple vocabularies declared with @prefix
- This is semi-structured data
- Blank nodes are connecting the people nodes
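The semi-structured nature of this data shows up as soon as it is read programmatically: the node _:c has a mailbox but no name. The sketch below hand-copies a subset of the triples from aliceFoaf.ttl into plain Python tuples (a simplification, with no parsing involved) and lists who _:a knows; the helper names objects and describe_known are invented for the example.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# A hand-copied subset of the triples in aliceFoaf.ttl.
GRAPH = {
    ("_:a", FOAF + "name", "Alice"),
    ("_:a", FOAF + "knows", "_:b"),
    ("_:a", FOAF + "knows", "_:c"),
    ("_:b", FOAF + "name", "Bob"),
    ("_:b", FOAF + "mbox", "mailto:bob@work.example.com"),
    ("_:c", FOAF + "mbox", "mailto:eve@work.example.com"),
}


def objects(graph, subject, predicate):
    """Return the sorted objects of every triple with this subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def describe_known(graph, person):
    """List (mbox, name-or-None) for each node that person foaf:knows."""
    rows = []
    for friend in objects(graph, person, FOAF + "knows"):
        names = objects(graph, friend, FOAF + "name")
        for mbox in objects(graph, friend, FOAF + "mbox"):
            # A missing name is reported as None rather than dropping the row.
            rows.append((mbox, names[0] if names else None))
    return rows


for mbox, name in describe_known(GRAPH, "_:a"):
    print(mbox, name)
```

Code that required every person to have a name would silently lose the second row, which is why a query language for RDF needs a way to treat parts of a pattern as optional.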
SPARQL querying online
Using:
SPARQL querying on the console with ARQ
Download and install:
- Windows Setup:
- Set environment variable ARQROOT to the absolute location of the ARQ-0.9.5 folder. Add the ARQ-0.9.5/bin folder to the console search PATH.
- Unix, Linux, OSX, Cygwin Setup:
- Set environment variable ARQROOT to the absolute path of the ARQ-0.9.5/lib directory. Add the ARQ-0.9.5/bin directory to the shell search PATH.
Make sure all the scripts are executable with: chmod u+x bin/*
SPARQL querying on the console with ARQ
$ sparql --query http://www.w3.org/2004/Talks/17Dec-sparql/intro/ex1.rq
-------------------------------------------------------------------------
| name | website |
=========================================================================
| "Norm Walsh" | <http://norman.walsh.name/> |
| "Dave Beckett" | <http://journal.dajobe.org/journal/> |
| "Ikki Ohmukai" | <http://www.semblog.org/> |
| "W3C Semantic Web News" | <http://www.w3.org/2001/sw/> |
| "Danny Ayers" | <http://dannyayers.com/> |
...
-------------------------------------------------------------------------
or with XML results:
$ sparql --results rs/xml --query http://www.w3.org/2004/Talks/17Dec-sparql/intro/ex1.rq
<?xml version="1.0"?>
<sparql xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.w3.org/2001/sw/DataAccess/rf1/result2" >
<head>
<variable name="name"/>
<variable name="website"/>
</head>
<results>
<result>
...