
Data Extraction & Exploration With SPARQL & The Talis Platform



shared innovation

Agenda
Tutorial Schema
Graph Patterns
Simple SELECT queries
OPTIONAL patterns
UNION queries
Sorting & Limiting
Filtering & Restrictions
DISTINCT
SPARQL Query Forms
Useful Links


Tutorial Schema

Based on NASA spaceflight data
Available at: http://api.talis.com/stores/space


Triple and Graph Patterns


How do we describe the structure of the RDF graph which we're interested in?


#An RDF triple in Turtle syntax


<http://purl.org/net/schemas/space/spacecraft/1957-001B> foaf:name "Sputnik 1".


#A SPARQL triple pattern, with a single variable


<http://purl.org/net/schemas/space/spacecraft/1957-001B> foaf:name ?name.


#All parts of a triple pattern can be variables


?spacecraft foaf:name ?name.


#Matching labels of resources


?subject rdfs:label ?label.


#Combine triple patterns to create a graph pattern

?subject rdfs:label ?label.
?subject rdf:type space:Discipline.


#SPARQL is based on Turtle, which allows abbreviations
#e.g. predicate-object lists:

?subject rdfs:label ?label;
         rdf:type space:Discipline.


#Graph patterns allow us to traverse a graph


?spacecraft foaf:name "Sputnik 1".
?launch space:spacecraft ?spacecraft.
?launch space:launched ?launchdate.
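
To make the idea of matching concrete, the sketch below evaluates the three patterns above against a tiny in-memory list of triples. It only illustrates how variables are bound and shared between patterns; it is not how a SPARQL engine is implemented. The helpers `match` and `query` are hypothetical names, and the data is the Sputnik sample from this tutorial with the URIs shortened for readability.

```python
# Triples as (subject, predicate, object) tuples. Identifiers are kept as
# short strings for brevity; a real store would use full URIs.
TRIPLES = [
    ("spacecraft/1957-001B", "foaf:name", "Sputnik 1"),
    ("launch/1957-001", "space:spacecraft", "spacecraft/1957-001B"),
    ("launch/1957-001", "space:launched", "1957-10-04"),
]


def match(pattern, triple, bindings):
    """Try to match one pattern against one triple.

    Returns the extended bindings, or None if the triple does not fit.
    Terms starting with '?' are variables.
    """
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep any value it was already bound to.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, triples):
    """Evaluate a list of patterns as a join, one pattern at a time."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match(pattern, triple, bindings)) is not None
        ]
    return solutions


patterns = [
    ("?spacecraft", "foaf:name", "Sputnik 1"),
    ("?launch", "space:spacecraft", "?spacecraft"),
    ("?launch", "space:launched", "?launchdate"),
]
for solution in query(patterns, TRIPLES):
    print(solution["?launchdate"])
```

Because `?spacecraft` and `?launch` each appear in two patterns, a solution survives only if both patterns agree on the value, which is what lets the query walk from the name to the launch date.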



Structure of a Query
What does a basic SPARQL query look like?


#Ex. 1
#Associate URIs with prefixes
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

#Example of a SELECT query, retrieving 2 variables
#Selected variables should appear in the graph pattern
SELECT ?subject ?label
WHERE {
  #This is our graph pattern
  ?subject rdfs:label ?label;
           rdf:type space:Discipline.
}


#Ex. 2
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

#Example of a SELECT query, retrieving all variables
SELECT *
WHERE {
  ?subject rdfs:label ?label;
           rdf:type space:Discipline.
}
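
A query such as Ex. 2 is normally sent to a SPARQL endpoint over HTTP, with the query text in a `query` parameter. The sketch below only builds the request URL and does not contact any server. The endpoint path `/services/sparql` is an assumption; only the store URL appears in this tutorial, and `build_request_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumption: the store exposes a SPARQL endpoint at this path. Only the
# store URL itself comes from the tutorial.
ENDPOINT = "http://api.talis.com/stores/space/services/sparql"

QUERY = """\
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT *
WHERE {
  ?subject rdfs:label ?label;
           rdf:type space:Discipline.
}
"""


def build_request_url(endpoint, query):
    """Return a GET URL carrying the query in the 'query' parameter."""
    # urlencode percent-encodes characters such as '#', '<' and newlines,
    # which would otherwise break the URL.
    return endpoint + "?" + urlencode({"query": query})


url = build_request_url(ENDPOINT, QUERY)
print(url)
```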


OPTIONAL bindings
How do we allow for missing or unknown information?


#Ex. 3
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?image
WHERE {
  #This pattern must be bound
  ?spacecraft foaf:name ?name.

  #Anything in this block doesn't have to be bound
  OPTIONAL {
    ?spacecraft foaf:depiction ?image.
  }
}
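
When an OPTIONAL block does not match, the variable is simply left out of that solution. The sketch below shows how a client might handle this when reading results in the SPARQL JSON results format. The response body is hypothetical (the second spacecraft and the image URL are made up for illustration), and `rows` is a hypothetical helper name.

```python
import json

# Hypothetical response body in the SPARQL JSON results format.
# The values are made up for illustration.
RESPONSE = """
{
  "head": {"vars": ["name", "image"]},
  "results": {
    "bindings": [
      {
        "name": {"type": "literal", "value": "Sputnik 1"},
        "image": {"type": "uri", "value": "http://example.org/sputnik.jpg"}
      },
      {
        "name": {"type": "literal", "value": "Explorer 1"}
      }
    ]
  }
}
"""


def rows(response_text):
    """Turn each solution into a dict, using None for unbound variables."""
    data = json.loads(response_text)
    variables = data["head"]["vars"]
    return [
        {var: binding.get(var, {}).get("value") for var in variables}
        for binding in data["results"]["bindings"]
    ]


for row in rows(RESPONSE):
    print(row["name"], row["image"] or "(no image)")
```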


UNION queries
How do we allow for alternatives or variations in the graph?


#Ex. 4
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?subject ?displayLabel
WHERE {
  {
    ?subject foaf:name ?displayLabel.
  }
  UNION
  {
    ?subject rdfs:label ?displayLabel.
  }
}


Sorting & Restrictions


How do we apply a sort order to the results?
How can we restrict the number of results returned?


#Ex. 5
#Select the URI and the mass of all the spacecraft
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?spacecraft ?mass
WHERE {
  ?spacecraft space:mass ?mass.
}


#Ex. 6
#Select the URI and the mass of all the spacecraft,
#heaviest first
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?spacecraft ?mass
WHERE {
  ?spacecraft space:mass ?mass.
}
#Use an ORDER BY clause to apply a sort. Can be ASC or DESC
ORDER BY DESC(?mass)


#Ex. 7
#Select the URI and the mass of the 10 heaviest spacecraft
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?spacecraft ?mass
WHERE {
  ?spacecraft space:mass ?mass.
}
#Order by mass descending
ORDER BY DESC(?mass)
#Limit to first ten results
LIMIT 10


#Ex. 8
#Select the URI and the mass of the 11th to 20th
#heaviest spacecraft
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?spacecraft ?mass
WHERE {
  ?spacecraft space:mass ?mass.
}
ORDER BY DESC(?mass)
#Limit to ten results
LIMIT 10
#Apply an offset to get next page
OFFSET 10
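
The LIMIT and OFFSET values in Ex. 7 and Ex. 8 follow a simple rule: the offset is the number of results on all earlier pages. A minimal sketch of a helper that builds the query for a given page is shown below; `paged_query` and `QUERY_TEMPLATE` are hypothetical names, and the unused prefixes are omitted for brevity.

```python
# Query text with placeholders for the paging values. Literal braces in
# the SPARQL are doubled so that str.format leaves them alone.
QUERY_TEMPLATE = """\
PREFIX space: <http://purl.org/net/schemas/space/>

SELECT ?spacecraft ?mass
WHERE {{
  ?spacecraft space:mass ?mass.
}}
ORDER BY DESC(?mass)
LIMIT {limit}
OFFSET {offset}
"""


def paged_query(page, page_size=10):
    """Build the query for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # Skip every result that belongs to an earlier page.
    offset = (page - 1) * page_size
    return QUERY_TEMPLATE.format(limit=page_size, offset=offset)


# Page 2 with the default page size reproduces the paging used in Ex. 8.
print(paged_query(2))
```

Paging is only stable when the query has an ORDER BY clause, since otherwise the order of results is not guaranteed to be the same between requests.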


Filtering

How do we restrict results based on aspects of the data rather than the graph, e.g. string matching?


#Sample data for Sputnik launch


<http://purl.org/net/schemas/space/launch/1957-001>
    rdf:type space:Launch;
    #Assign a datatype to the literal, to indicate it is
    #a date
    space:launched "1957-10-04"^^xsd:date;
    space:spacecraft <http://purl.org/net/schemas/space/spacecraft/1957-001B> .


#Ex. 9
#Select name of spacecraft launched between
#1st Jan 1969 and 1st Jan 1970
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?name
WHERE {
  ?launch space:launched ?date;
          space:spacecraft ?spacecraft.
  ?spacecraft foaf:name ?name.
  FILTER (?date > "1969-01-01"^^xsd:date &&
          ?date < "1970-01-01"^^xsd:date)
}


#Ex. 10
#Select spacecraft with a mass of less than 90kg
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?spacecraft ?name
WHERE {
  ?spacecraft foaf:name ?name;
              space:mass ?mass.
  #Note that we have to cast the data to the right type
  #as it is not declared in the data
  FILTER( xsd:double(?mass) < 90.0 )
}
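
Why the cast matters can be seen with a loose Python analogy: a value stored as text does not behave like a number until it is converted. This is only an analogy and not SPARQL's exact comparison semantics; the identifiers and masses below are made up for illustration.

```python
from datetime import date

# Hypothetical masses as they might appear in untyped data: plain strings.
masses = {"craft-a": "100.0", "craft-b": "83.6", "craft-c": "9.5"}

# Comparing the raw strings is character by character, so "100.0" counts
# as less than "90.0" because "1" sorts before "9".
as_strings = sorted(name for name, m in masses.items() if m < "90.0")

# Converting first, in the spirit of xsd:double(?mass) in Ex. 10,
# compares the values as numbers.
as_numbers = sorted(name for name, m in masses.items() if float(m) < 90.0)

print(as_strings)
print(as_numbers)

# Typed dates, as in Ex. 9, compare by calendar order.
launched = date.fromisoformat("1969-07-16")
in_1969 = date(1969, 1, 1) < launched < date(1970, 1, 1)
print(in_1969)
```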


#Ex. 11
#Select spacecraft with a name like "ollo"
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?name
WHERE {
  ?spacecraft foaf:name ?name.
  FILTER( regex(?name, "ollo", "i") )
}
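
The regex filter matches the pattern anywhere in the value, and the "i" flag makes the match case-insensitive. For a simple pattern like this one, Python's `re.search` with `re.IGNORECASE` behaves the same way, which makes it handy for trying a pattern out locally. The list of names below is a hypothetical sample.

```python
import re

# Hypothetical sample of names, for illustration only.
names = ["Apollo 11", "Sputnik 1", "APOLLO 13", "Explorer 1"]

# re.search looks for the pattern anywhere in the string, and
# re.IGNORECASE plays the role of the "i" flag.
pattern = re.compile("ollo", re.IGNORECASE)
matching = [name for name in names if pattern.search(name)]

print(matching)
```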


Built-In Filters
Logical: !, &&, ||
Math: +, -, *, /
Comparison: =, !=, >, <, ...
SPARQL tests: isURI, isBlank, isLiteral, bound
SPARQL accessors: str, lang, datatype
Other: sameTerm, langMatches, regex


DISTINCT

How do we remove duplicate results?


#Ex. 12
#Select list of agencies associated with spacecraft
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?agency
WHERE {
  ?spacecraft space:agency ?agency.
}


SPARQL Query Forms


Does SPARQL do more than just SELECT data?


ASK
Test whether the graph contains some data of interest


#Ex. 13
#Was there a launch on 16th July 1969?
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

ASK
WHERE {
  ?launch space:launched "1969-07-16"^^xsd:date.
}
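
An ASK query returns a single yes/no answer rather than a table of bindings. In the SPARQL JSON results format this appears as a `boolean` member. The sketch below reads such a response; the response body is a hand-written example and `ask_result` is a hypothetical helper name.

```python
import json

# Hand-written example of an ASK response in the SPARQL JSON results format.
RESPONSE = '{"head": {}, "boolean": true}'


def ask_result(response_text):
    """Return the answer of an ASK query as a Python bool."""
    data = json.loads(response_text)
    if "boolean" not in data:
        # SELECT results carry "results" instead of "boolean".
        raise ValueError("not an ASK result")
    return bool(data["boolean"])


print(ask_result(RESPONSE))
```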


DESCRIBE
Generate an RDF description of one or more resources


#Ex. 14
#Describe launch(es) that occurred on 16th July 1969
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

DESCRIBE ?launch
WHERE {
  ?launch space:launched "1969-07-16"^^xsd:date.
}


#Ex. 15
#Describe spacecraft launched on 16th July 1969
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

DESCRIBE ?spacecraft
WHERE {
  ?launch space:launched "1969-07-16"^^xsd:date.
  ?spacecraft space:launch ?launch.
}


CONSTRUCT

Create a custom RDF graph based on query criteria
Can be used to transform RDF data


#Ex. 16
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

CONSTRUCT {
  ?spacecraft foaf:name ?name;
              space:agency ?agency;
              space:mass ?mass.
}
WHERE {
  ?launch space:launched "1969-07-16"^^xsd:date.
  ?spacecraft space:launch ?launch;
              foaf:name ?name;
              space:agency ?agency;
              space:mass ?mass.
}
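
Conceptually, CONSTRUCT fills in its template once for every solution of the WHERE clause and returns the resulting triples as a graph. The sketch below imitates that step in memory. It is a simplified illustration, not an engine implementation; `construct` is a hypothetical helper and every value in `solutions` is made up.

```python
# Template from Ex. 16, with the predicate-object list written out in full.
TEMPLATE = [
    ("?spacecraft", "foaf:name", "?name"),
    ("?spacecraft", "space:agency", "?agency"),
    ("?spacecraft", "space:mass", "?mass"),
]

# Hypothetical solutions of the WHERE clause; all values are made up.
solutions = [
    {"?spacecraft": "ex:craft1", "?name": "Craft One",
     "?agency": "ex:agencyA", "?mass": "1000.0"},
    {"?spacecraft": "ex:craft2", "?name": "Craft Two",
     "?agency": "ex:agencyB", "?mass": "250.0"},
]


def construct(template, solutions):
    """Fill the template once per solution and collect the triples."""
    graph = set()  # a graph is a set, so repeated triples collapse
    for solution in solutions:
        for triple in template:
            filled = tuple(solution.get(term, term) for term in triple)
            # Skip triples that still contain an unbound variable.
            if not any(term.startswith("?") for term in filled):
                graph.add(filled)
    return graph


for triple in sorted(construct(TEMPLATE, solutions)):
    print(triple)
```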


SELECT

SQL-style result set retrieval


#Ex. 17
PREFIX space: <http://purl.org/net/schemas/space/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?agency ?mass
WHERE {
  ?launch space:launched "1969-07-16"^^xsd:date.
  ?spacecraft space:launch ?launch;
              foaf:name ?name;
              space:agency ?agency;
              space:mass ?mass.
}


Useful Links
SPARQL FAQ
http://www.thefigtrees.net/lee/sw/sparql-faq

SPARQL Recipes
http://n2.talis.com/wiki/SPARQL_Recipes

SPARQL By Example Tutorial
http://www.cambridgesemantics.com/2008/09/sparql-by-example

Twinkle, GUI SPARQL editor
http://www.ldodds.com/projects/twinkle
http://code.google.com/p/twinkle-sparql-tools
