1. Querying
Cultural Heritage Data
Dr. Barry Norton,
Development Manager, ResearchSpace*
* Funded by the Andrew W. Mellon Foundation
* Hosted by the Curatorial Directorate, British Museum
4. Statements and Patterns
• For one edge in a graph:
[Figure: bm-obj:EOC3130 linked by crm:P52_has_current_owner to bm-id:the-british-museum]
• We can declare/retrieve one (N)Triple:
• Or write this in Turtle:
@prefix crm: <http://erlangen-crm.org/current/> .
@prefix bm-obj: <http://collection.britishmuseum.org/id/object/> .
@prefix bm-id: <http://collection.britishmuseum.org/id/> .
bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum .
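The N-Triples form of the same statement spells out every IRI in full instead of using prefixes. A minimal sketch of that expansion, assuming only the three prefixes declared above (the helper names `expand` and `to_ntriple` are made up for illustration):

```python
# Prefix table copied from the Turtle @prefix declarations above.
PREFIXES = {
    "crm": "http://erlangen-crm.org/current/",
    "bm-obj": "http://collection.britishmuseum.org/id/object/",
    "bm-id": "http://collection.britishmuseum.org/id/",
}


def expand(curie: str) -> str:
    """Turn a prefixed name such as 'crm:P52_has_current_owner' into a full IRI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def to_ntriple(subject: str, predicate: str, obj: str) -> str:
    """Serialise one triple of prefixed names as a single N-Triples line."""
    terms = (subject, predicate, obj)
    return " ".join(f"<{expand(term)}>" for term in terms) + " ."


line = to_ntriple(
    "bm-obj:EOC3130",
    "crm:P52_has_current_owner",
    "bm-id:the-british-museum",
)
print(line)
```

The sketch handles IRIs only; literals and blank nodes would need their own serialisation rules.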
5. Statements and Patterns
• For one edge in a graph:
[Figure: bm-obj:EOC3130 linked by crm:P52_has_current_owner to bm-id:the-british-museum]
• We can write this in Turtle:
bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum .
• And check for it in SPARQL:
PREFIX crm: <http://erlangen-crm.org/current/>
PREFIX bm-obj: <http://collection.britishmuseum.org/id/object/>
PREFIX bm-id: <http://collection.britishmuseum.org/id/>
ASK { bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum }
true
6. Statements and Patterns
• For a set of edges:
[Figure: bm-obj:EOC3130 linked by crm:P51_has_former_or_current_owner to bm-id:the-british-museum and to two unknown nodes (?)]
• We can do the work on the client:
• Or have the server do it by turning the
triple into a triple pattern:
bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner
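To make the difference between a triple (ASK) and a triple pattern (a variable such as ?owner) concrete, here is a rough client-side sketch of what the server does when it matches a pattern. The in-memory graph and the `ex:hypothetical-former-owner` node are assumptions for illustration, not BM data, and a real triplestore uses indexes rather than a linear scan:

```python
# A tiny in-memory graph of (subject, predicate, object) prefixed names.
# The "ex:" node is a hypothetical stand-in, not a real BM identifier.
GRAPH = {
    ("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum"),
    ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "bm-id:the-british-museum"),
    ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "ex:hypothetical-former-owner"),
}


def match(graph, pattern):
    """Return one {variable: value} binding per triple that fits the pattern."""
    results = []
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


def ask(graph, triple):
    """Mimic ASK: is there at least one match for a pattern without variables?"""
    return bool(match(graph, triple))


is_owner = ask(
    GRAPH,
    ("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum"),
)
owners = sorted(
    b["?owner"]
    for b in match(
        GRAPH,
        ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "?owner"),
    )
)
print(is_owner, owners)
```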
7. Exercise
Questions:
• Why is the answer different?
• Who are the two (other) one-time owners?
8. Solutions & Exercises
• Why is the answer different?
– Reasoning, part of the work done by the server
(being a triplestore), means that if two things
are related by crm:P52_has_current_owner
then they’re also related by
crm:P51_has_former_or_current_owner
• This is part of the work that the server
(triplestore) can do for you
• Exercise: query for the (strictly) former
owners…
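The reasoning step can be sketched on the client as follows. This is an illustration of the idea, not the triplestore's actual inference engine: it applies a single sub-property rule once, and the two `ex:` owners are hypothetical placeholders for the real former owners. On the server, the same subtraction could be expressed with SPARQL 1.1 FILTER NOT EXISTS or MINUS.

```python
# Sub-property rule from the CRM: every current owner is also a
# former-or-current owner.
SUBPROPERTY_OF = {
    "crm:P52_has_current_owner": "crm:P51_has_former_or_current_owner",
}

# Asserted triples; the two "ex:" owners are hypothetical placeholders.
ASSERTED = {
    ("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum"),
    ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "ex:former-owner-a"),
    ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "ex:former-owner-b"),
}


def infer(triples):
    """Add the super-property triple for every asserted sub-property triple.

    One pass is enough here because the rule table has a single level.
    """
    inferred = set(triples)
    for s, p, o in triples:
        if p in SUBPROPERTY_OF:
            inferred.add((s, SUBPROPERTY_OF[p], o))
    return inferred


def owners(triples, obj, prop):
    """Collect every value of `prop` for the given object."""
    return {o for s, p, o in triples if s == obj and p == prop}


graph = infer(ASSERTED)
all_owners = owners(graph, "bm-obj:EOC3130", "crm:P51_has_former_or_current_owner")
current = owners(graph, "bm-obj:EOC3130", "crm:P52_has_current_owner")
former = all_owners - current  # strictly former owners
print(sorted(former))
```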
11. Solutions & Exercises
Who are the two (other) one-time owners?
• Since people and institutions (and places) are
treated as concepts, the names of the former
owners are attached using skos:prefLabel
• Exercise: if you didn’t already, include the
names in your query results
12. Solutions & Exercises
If you didn’t already, include the names in
your query results:
Question:
Why are we back at two answers?
13. Answer
• Answer:
– Just as we can add triples together to make a
graph in RDF, so we can add triple patterns
together in SPARQL to make a graph pattern
– By default all triple patterns must be matched,
but we can use the OPTIONAL {} pattern to
allow variation
• Exercise:
– Query for the owners and their names, if they
exist*
* N.B. this bug in the BM data will be fixed soon
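The difference between a required pattern and an OPTIONAL one behaves like an inner join versus a left join. A small sketch, assuming for illustration that one owner has no skos:prefLabel (the `ex:` identifiers and label strings are hypothetical):

```python
# Owners found by the first triple pattern; "ex:" entries are hypothetical.
OWNERS = ["bm-id:the-british-museum", "ex:former-owner-a", "ex:former-owner-b"]

# skos:prefLabel values; one owner is assumed to have no label.
LABELS = {
    "ex:former-owner-a": "Former Owner A",
    "ex:former-owner-b": "Former Owner B",
}


def required_join(owners, labels):
    """Both patterns must match: owners without a label drop out."""
    return [(owner, labels[owner]) for owner in owners if owner in labels]


def optional_join(owners, labels):
    """Label pattern is OPTIONAL: every owner is kept, label may be unbound."""
    return [(owner, labels.get(owner)) for owner in owners]


strict = required_join(OWNERS, LABELS)
relaxed = optional_join(OWNERS, LABELS)
print(len(strict), len(relaxed))
```

With the required join, the unlabelled owner disappears from the results; with the optional join it stays, with its name left unbound (`None` here).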
15. Exercise
• Take a look here:
• Exercise: copy and run this query
16. CSV Exercise
• Type:
• Observe that one can now paste the query
including line breaks*
• Type:
* N.B. for now you should first replace the "s with 's and
change the one occurrence of ecrm: with crm: - we’ll fix this
* N.B. currently the query needs to be simplified as the BBC
data is not loaded – this will be available soon
17. Data Analysis
• One can import this CSV file into many
tools:
– A spreadsheet can be a good way to carry out
basic visualisations
– A scripting environment like (I)Python/SciPy or
R can allow more analysis before
visualisation, but:
• both languages also have libraries to encapsulate
interaction via SPARQL (RDFLib/SPARQLWrapper and
SPARQL/RCurl respectively)
• one should decide whether more analysis should
first be carried out using SPARQL…
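As a sketch of the scripting route, the CSV results can be read with Python's standard csv module before any further analysis. The column names and rows below are invented for illustration; a real export would carry the variable names of the SELECT query:

```python
import csv
import io

# Hypothetical CSV export of query results; columns and values are made up.
CSV_TEXT = """object,material
ex:object-1,basalt
ex:object-2,basalt
ex:object-3,jade
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(CSV_TEXT)
basalt_objects = [row["object"] for row in rows if row["material"] == "basalt"]
print(basalt_objects)
```

For a file on disk, `open(path, newline="")` would replace the `io.StringIO` wrapper.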
18. Exercise
• If you haven’t so far, run the main query and
click on one of the (HotW) 100 Objects (such as
number 70, Hoa Hakananai'a Easter Island Statue)
• Choose a material and observe the query
for other objects in this material
• Adapt this query to count how many BM
objects are made from basalt
19. Solution & Exercise
• Exercise: Now count the ‘top ten’ materials
and the number of objects for each
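If the counting is done on the client rather than in SPARQL, a 'top N' tally might look like the sketch below. The (object, material) pairs are invented sample data; on the server, the equivalent would be a grouped count with ordering and a limit.

```python
from collections import Counter

# Hypothetical (object, material) pairs, e.g. parsed from query results.
PAIRS = [
    ("ex:object-1", "basalt"),
    ("ex:object-2", "basalt"),
    ("ex:object-3", "basalt"),
    ("ex:object-4", "jade"),
    ("ex:object-5", "jade"),
    ("ex:object-6", "bronze"),
    ("ex:object-1", "basalt"),  # duplicate row, counted only once
]


def top_materials(pairs, n=10):
    """Count distinct objects per material and return the n most common."""
    counts = Counter(material for _, material in set(pairs))
    return counts.most_common(n)


top_two = top_materials(PAIRS, 2)
print(top_two)
```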
21. A Last Word
• SPARQLing a ‘native RDF’ database
(often called a ‘triplestore’) is not the only
option before defaulting to programming
• A ‘native graph’ database indexes the
graph in a different way, supporting
traversal-oriented queries
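A traversal-oriented query asks what can be reached from a starting node within a number of hops, rather than which triples match a pattern. A minimal breadth-first sketch over an adjacency map, with hypothetical `ex:` nodes and no claim about how any particular graph database implements this:

```python
from collections import deque

# Outgoing edges per node; the "ex:" nodes are hypothetical.
EDGES = {
    "bm-obj:EOC3130": ["bm-id:the-british-museum", "ex:former-owner-a"],
    "ex:former-owner-a": ["ex:place-x"],
}


def reachable(start, edges, max_hops):
    """Breadth-first traversal returning {node: hops from start}."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in edges.get(node, []):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return seen


one_hop = reachable("bm-obj:EOC3130", EDGES, 1)
two_hops = reachable("bm-obj:EOC3130", EDGES, 2)
print(one_hop, two_hops)
```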