Programming with rich data frequently implies that one
needs to search for, understand, integrate and program with
new data - with each of these steps constituting a major
obstacle to successful data use.
In this talk we will explain and demonstrate how our approach,
LITEQ - Language Integrated Types, Extensions and Queries for
RDF Graphs, which is realized as part of the F# / Visual Studio
environment, supports the software developer. Using the extended
IDE the developer may now
a. explore new, previously unseen data sources,
which are either natively in RDF or mapped into RDF;
b. use the exploration of schemata and data in order to
construct types and objects in the F# environment;
c. automatically map between data and programming language objects in
order to make them persistent in the data source;
d. have extended typing functionality added to the F#
environment and resulting from the exploration of the data source
and its mapping into F#.
Core to this approach is the novel node path query language, NPQL,
that allows for interactive, intuitive exploration of data schemata and
data proper as well as for the mapping and definition
of types, object collections and individual objects.
Beyond the existing type provider mechanism for F#
our approach also allows for property-based navigation
and runtime querying for data objects.
Information-Rich Programming in F# with Semantic Data
1. Web Science & Technologies
University of Koblenz ▪ Landau, Germany
Information-Rich Programming
in F#
with Semantic Data
2. Linked Open Data Cloud
Where’s the Data in
the Big Data Wave?
Gerhard Weikum
SIGMOD Blog, 6.3.2013
http://wp.sigmod.org/
… the Web of Linked Data consisting of
more than 30 Billion RDF triples from
hundreds of data sources …
WeST
Steffen Staab
staab@uni-koblenz.de
2
3. Some „Bubbles“ of the LOD Cloud
5. Example RDF Graph
Native Graph
OR
R2RML: RDB to RDF Mapping Language
(W3C rec)
6. Agenda
LiteQ – Language integrated types,
extensions and queries for RDF graphs
Exploring
Programming, Typing
Evaluation of LITEQ (NPQL) against SPARQL
Understandability
Ease of use
SchemEX
Construction of schema-based index
Schema induction
7. Programming against unknown data source
Exploring a
data source
Using a data
source
8. Example application
• Goal: Application that helps to collect dog license fees
• Send Email reminders to dog owners
• Data is given as RDF graph
9. Programmer’s Task 1: Schema Exploration
Schema exploration & Identification of important RDF types
• Find RDF types representing dogs and persons
10. Naive Approach Task 1: Schema Exploration
Schema exploration & Identification of important RDF types
• Find RDF types representing dogs and persons
Tooling for Naïve Approach: SPARQL Query Formulation
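The naive exploration step boils down to asking the data source which classes are actually used. A minimal Python sketch of that idea follows; the in-memory triple list, the dog instance `ex:Hasso` and the helper `used_classes` are assumptions made for illustration and do not come from the talk.

```python
# Toy RDF graph as (subject, predicate, object) triples.
# ex:Dog, ex:Person, ex:Creature, ex:Bob and ex:hasOwner appear on the slides;
# ex:Hasso is a hypothetical dog instance added for this example.
TRIPLES = [
    ("ex:Dog", "rdfs:subClassOf", "ex:Creature"),
    ("ex:Person", "rdfs:subClassOf", "ex:Creature"),
    ("ex:Bob", "rdf:type", "ex:Person"),
    ("ex:Hasso", "rdf:type", "ex:Dog"),
    ("ex:Hasso", "ex:hasOwner", "ex:Bob"),
]

def used_classes(triples):
    """Python analogue of: SELECT DISTINCT ?type WHERE { ?s rdf:type ?type }"""
    return sorted({o for (_, p, o) in triples if p == "rdf:type"})

print(used_classes(TRIPLES))  # ['ex:Dog', 'ex:Person']
```

The developer still has to read the result, guess which classes matter and then write further queries by hand, which is the friction the following slides address.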
11. Programmer’s Task 2: Code Type Creation
Code type creation in host language
• Convert the identified dog and person RDF types to
code types in the host language
// exPerson is declared before exDog because exDog refers to it
type exCreature(uri) = class
    member this.hasName : string = …
    member this.hasAge : int = …
end
type exPerson(uri) = class
    inherit exCreature(uri)
end
type exDog(uri) = class
    inherit exCreature(uri)
    member this.hasOwner : exPerson = …
    member this.TaxNo : int = …
end
12. Programmer’s Task 3: Data querying
Data querying
• Write a query that returns all dog owners
13. Naive Approach Task 3: Data querying
Data querying
• Write a query that returns all dog owners
Tooling for Naive Approach: SPARQL Query formulation
14. Naive Approach Task 4: Object manipulation
Create the objects, manipulate them & make them persistent
• Develop functionality around query to send reminder
let queryString = "SELECT ?owner WHERE {
    ?dog rdf:type ex:Dog .
    ?dog ex:hasOwner ?owner
}"
// The query is an opaque string: the compiler cannot check it against the schema
dbConnection.evaluate(queryString) |> Seq.iter (fun uri ->
    let p = new exPerson(uri)
    sendReminderEmail(p)
)
16. Node Path Query Language
17. Graph Traversal with NPQL: Subtype Navigation >
NPQL
rdf:Resource > ex:Creature
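As a rough illustration of what a `>` step does, the following Python sketch walks a toy class hierarchy. It is not the LITEQ implementation: the schema triples, the assumption that `>` only moves to direct subtypes, and the helper names `subtypes` and `navigate_subtype` are all made up for this example.

```python
# Toy schema. The class names come from the slides; placing ex:Creature
# directly under rdf:Resource is an assumption.
SCHEMA = [
    ("ex:Creature", "rdfs:subClassOf", "rdf:Resource"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Creature"),
    ("ex:Person", "rdfs:subClassOf", "ex:Creature"),
]

def subtypes(schema, cls):
    """Direct subclasses of cls: the candidates a '>' step can move to."""
    return sorted(s for (s, p, o) in schema
                  if p == "rdfs:subClassOf" and o == cls)

def navigate_subtype(schema, current, target):
    """Perform one '>' step, rejecting targets that are not direct subtypes."""
    if target not in subtypes(schema, current):
        raise ValueError(f"{target} is not a direct subtype of {current}")
    return target

position = navigate_subtype(SCHEMA, "rdf:Resource", "ex:Creature")
print(position, subtypes(SCHEMA, position))
```

The list returned by `subtypes` is also what an editor could offer as completions for the next `>` step.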
18. Graph Traversal with NPQL: Property Navigation .
NPQL
ex:Dog . ex:hasOwner
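A `.` step can be pictured as following a property from its domain class to its range class. The Python sketch below assumes the schema declares `rdfs:domain` and `rdfs:range` for each property and ignores inherited properties; the triples and the helper `navigate_property` are illustrative assumptions, not part of LITEQ.

```python
# Toy schema with domain/range declarations (assumed for illustration).
SCHEMA = [
    ("ex:hasOwner", "rdfs:domain", "ex:Dog"),
    ("ex:hasOwner", "rdfs:range", "ex:Person"),
    ("ex:hasName", "rdfs:domain", "ex:Creature"),
    ("ex:hasName", "rdfs:range", "xsd:string"),
]

def navigate_property(schema, cls, prop):
    """One '.' step: from class cls along prop to the class in prop's range."""
    domains = {o for (s, p, o) in schema if s == prop and p == "rdfs:domain"}
    if cls not in domains:
        raise ValueError(f"{prop} is not declared for {cls}")
    ranges = [o for (s, p, o) in schema if s == prop and p == "rdfs:range"]
    if not ranges:
        raise ValueError(f"{prop} has no declared range")
    return ranges[0]

print(navigate_property(SCHEMA, "ex:Dog", "ex:hasOwner"))  # ex:Person
```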
19. Extensional Semantics: Task 3 – Querying for Owners
NPQL
rdf:Resource > ex:Creature > ex:Dog . ex:hasOwner -> Extension
• Select ex:Dog
• Walk through ex:hasOwner to ex:Person
• Use extension to retrieve all persons who own dogs: ex:Bob
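The extensional reading of the path can be sketched in Python as "start at all instances of the class, follow the property, collect what is reached". The data below is a toy graph: `ex:Bob` is from the slide, while `ex:Alice` and `ex:Hasso` are hypothetical, as are the helper names.

```python
DATA = [
    ("ex:Bob", "rdf:type", "ex:Person"),
    ("ex:Alice", "rdf:type", "ex:Person"),   # hypothetical person without a dog
    ("ex:Hasso", "rdf:type", "ex:Dog"),       # hypothetical dog
    ("ex:Hasso", "ex:hasOwner", "ex:Bob"),
]

def instances(data, cls):
    """All resources explicitly typed as cls."""
    return {s for (s, p, o) in data if p == "rdf:type" and o == cls}

def extension(data, cls, prop):
    """Extension of 'cls . prop': objects reached from instances of cls via prop."""
    starts = instances(data, cls)
    return {o for (s, p, o) in data if p == prop and s in starts}

print(extension(DATA, "ex:Dog", "ex:hasOwner"))  # {'ex:Bob'}
```

Only `ex:Bob` is returned because `ex:Alice` is a person but not reachable from any dog via `ex:hasOwner`.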
20. Intensional Semantics: Task 2 - Creating Person Code Type
NPQL
rdf:Resource > ex:Creature > ex:Dog . ex:hasOwner -> Intension
• Select ex:Person node
• “Intension” to get code type based on rdf type
type exCreature(uri) = class
    member this.hasName : string = …
    member this.hasAge : int = …
end
type exPerson(uri) = class
    inherit exCreature(uri)
end
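To make the intensional reading more concrete, here is a small Python sketch that derives code types from schema triples, creating the parent type before the child so that the class hierarchy is preserved. It only mimics the idea: LITEQ generates F# types, and the names `RdfObject`, `make_code_type`, `rdf_type` and `declared_properties` are assumptions introduced for this example.

```python
class RdfObject:
    """Common root: every generated code type wraps one resource URI."""
    def __init__(self, uri):
        self.uri = uri

# Toy schema matching the exCreature / exPerson types shown on the slide.
SCHEMA = [
    ("ex:Person", "rdfs:subClassOf", "ex:Creature"),
    ("ex:hasName", "rdfs:domain", "ex:Creature"),
    ("ex:hasAge", "rdfs:domain", "ex:Creature"),
]

def local_name(term):
    """'ex:Person' -> 'Person'"""
    return term.split(":", 1)[1]

def make_code_type(schema, cls, registry):
    """Create (or reuse) a Python class for RDF class cls, parents first."""
    if cls in registry:
        return registry[cls]
    parents = [o for (s, p, o) in schema
               if s == cls and p == "rdfs:subClassOf"]
    bases = tuple(make_code_type(schema, b, registry) for b in parents) or (RdfObject,)
    props = tuple(local_name(s) for (s, p, o) in schema
                  if p == "rdfs:domain" and o == cls)
    registry[cls] = type("ex" + local_name(cls), bases,
                         {"rdf_type": cls, "declared_properties": props})
    return registry[cls]

exPerson = make_code_type(SCHEMA, "ex:Person", {})
bob = exPerson("ex:Bob")
print(exPerson.__mro__[1].__name__, exPerson.__mro__[1].declared_properties, bob.uri)
```

Unlike erased types, classes built this way keep the hierarchy at runtime, so `isinstance(bob, ...)` checks against the generated parent type work.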
21. Autocompletion Semantics: Task 1 - Exploration
NPQL
rdf:Resource > ex:Creature >
Suggestions during query writing
• Instances based on extensional semantics
• Types & Props based on intensional semantics
ex:Person, ex:Dog
22. Extensional Semantics: LA Conjunctive Queries
NPQL
ex:Dog <- ex:hasOwner
Left-associative conjunctive query with projection
23. Host Language Extension: Task 4 – Create Objects
Create the objects, manipulation & persistence
• Develop the functionality around the query
that will send the reminder using LITEQ in F#
Preliminary Implementation in F#
http://west.uni-koblenz.de/Research/systems/liteq
24. Web Science & Technologies
University of Koblenz ▪ Landau, Germany
Live demo of LITEQ in Visual Studio/F#
25. Related Work
Task | LINQ | XML Type Provider | Freebase Type Provider | LITEQ (current version) | LITEQ (concept)
1 Schema exploration | - | (✔) per doc | (✔) only trees | ✔ | ✔
2 Code type creation | - | (✔) erased types? | (✔) erased types | (✔) erased types | ✔ full hierarchy
3 Data querying | ✔ | - | ((✔)) very limited expressiveness | (✔) limited expressiveness | ✔ no full SPARQL
4 Object manipulation & persistence | (✔) | ✔ | - | ✔ no new object creation | ✔
26. Future work wrt LITEQ
• Current implementation is a prototype
• Current implementation uses erased types
At runtime, no type hierarchy is present
• Switch to generated types in the future
Higher expressiveness in the host language by exploiting the type hierarchy
• Optimizations of LITEQ implementation necessary
• Lazy evaluation
• Distinguish between design time and runtime
• Not all types created at design time are needed at runtime
• Formalize query language and investigate expressiveness
27. Challenge: Joint Type Inference
Data modeling world
Description Logics
Program modeling world
ML type inference
RDF
UML class
diagrams
28. Agenda
LiteQ – Language integrated types, extensions
and queries for RDF graphs
Exploring
Programming, Typing
Evaluation of LITEQ (NPQL) vs. SPARQL
Understandability
Ease of use
SchemEX
Where do I find relevant data?
Efficient construction of a
schema-level index
29. Preliminary Evaluation of LITEQ/NPQL
Focused on NPQL
• Reason:
Test subjects lacked knowledge of F# and functional
programming for evaluating LITEQ in full
• Comparing NPQL against SPARQL
Main Hypothesis of Evaluation
• NPQL with autocompletion allows for effective query
writing in a more efficient manner than SPARQL
Thus: some of the advantages of LITEQ cannot show up in
the evaluation!
30. Evaluation Subjects
Evaluation with 11 participants
• 1 subject was eliminated a posteriori from the analysis
because he could not deal with SPARQL at all!
• 10 subjects remaining for analysis
Participants
• Undergraduate students
• PhD students
• PostDocs
31. Evaluation - Setup
1. Pre-questionnaire
2. Training in RDF, SPARQL & NPQL
3. Experimental tasks to be solved by subjects
4. Post-questionnaire
33. Phase 2: Training in RDF, SPARQL, NPQL
Training in RDF & SPARQL
• Presentation of RDF & SPARQL (20 minutes)
• Practical exercise writing SPARQL queries in the Web interface (5 minutes)
Training in NPQL
• Practical exercise writing NPQL queries in Visual Studio (5 minutes)
34. Phase 3: Solving experimental tasks by subjects
9 different experimental tasks to solve
• Half of tasks in NPQL using Visual Studio
• Other half using SPARQL and a web interface
Task types
• Navigation and exploration of a data source (Task 1)
• Retrieving and answering questions about the data (Task 3)
• 2 tasks were not solvable in NPQL
• Investigating how users deal with limits of the language
Evaluation measure:
• Duration to complete each task
37. Phase 4: Post-Questionnaire
“Do you want to explore a data source in your IDE?”
• 4: “yes”
• 3: “no, prefer separation of steps”
• 3: “no preference”
“NPQL is easier to use than SPARQL”
• 7: “agree” or above
Other
• Better support when writing queries in SPARQL
• Better response times for interactive working with NPQL
My conclusion
Though LITEQ is still in a pre-alpha status, advantages became visible in the preliminary user evaluation.
38. Agenda
LiteQ – Language integrated types, extensions
and queries for RDF graphs
Exploring
Programming, Typing
Evaluation of LITEQ (NPQL) against SPARQL
Understandability
Ease of use
SchemEX
Construction of schema-based index
Schema induction
39. Searching the LOD cloud
SELECT ?x
WHERE {
?x rdf:type foaf:Document .
?x rdf:type swrc:InProceedings .
?x dc:creator ?y .
?y rdf:type fb:Computer_Scientist
}
[Figure: query pattern graph with node x typed foaf:Document and swrc:InProceedings, connected via dc:creator to a node of type fb:Computer_Scientist]
40. Searching the LOD cloud
SELECT ?x
WHERE {
?x rdf:type foaf:Document .
?x rdf:type swrc:InProceedings .
?x dc:creator ?y .
?y rdf:type fb:Computer_Scientist
}
Index
• ACM
• DBLP
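The role of the index on this slide can be sketched as a lookup from the types mentioned in a query to the data sources worth asking. The Python example below is a simplified illustration, not SchemEX itself: it covers only the `rdf:type` part of the query, and the quads, the source `OtherSource` and the helpers `build_index` and `lookup` are hypothetical.

```python
from collections import defaultdict

# (subject, predicate, object, source) quads. Which source holds which
# entity is made up for illustration.
QUADS = [
    ("ex:paper1", "rdf:type", "foaf:Document", "DBLP"),
    ("ex:paper1", "rdf:type", "swrc:InProceedings", "DBLP"),
    ("ex:paper2", "rdf:type", "foaf:Document", "ACM"),
    ("ex:paper2", "rdf:type", "swrc:InProceedings", "ACM"),
    ("ex:page1", "rdf:type", "foaf:Document", "OtherSource"),
]

def build_index(quads):
    """Map each observed set of types to the sources containing such an entity."""
    types_of = defaultdict(set)
    for s, p, o, src in quads:
        if p == "rdf:type":
            types_of[(s, src)].add(o)
    index = defaultdict(set)
    for (_, src), types in types_of.items():
        index[frozenset(types)].add(src)
    return index

def lookup(index, wanted_types):
    """Sources holding at least one entity that has all wanted types."""
    wanted = set(wanted_types)
    hits = set()
    for type_set, sources in index.items():
        if wanted <= type_set:
            hits |= sources
    return sorted(hits)

index = build_index(QUADS)
print(lookup(index, ["foaf:Document", "swrc:InProceedings"]))  # ['ACM', 'DBLP']
```

The index stores only schema-level information (type sets and source names), so it stays far smaller than the data it describes.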
41. Schema-level index
Schema information on LOD
Explicit: assigning class types (Entity rdf:type Class)
Implicit: modelling attributes (Entity Property Entity 2)
45. Bi-Simulation
Entities are equivalent if they refer to equivalent entities
via the same attributes
Restriction: 1-Bi-Simulation
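One common reading of the depth-1 restriction is that all entities count as equivalent at depth 0, so two entities are 1-bisimilar exactly when they have the same set of outgoing attributes. The Python sketch below implements that simplified reading; it is an assumption about the definition and not the SchemEX implementation, and the triples and the function name are illustrative.

```python
from collections import defaultdict

# Hypothetical entities for illustration.
TRIPLES = [
    ("ex:paper1", "rdf:type", "swrc:InProceedings"),
    ("ex:paper1", "dc:creator", "ex:alice"),
    ("ex:paper2", "rdf:type", "swrc:InProceedings"),
    ("ex:paper2", "dc:creator", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
]

def one_bisimulation_classes(triples):
    """Group subjects by their set of outgoing predicates (depth-1 equivalence)."""
    out_props = defaultdict(set)
    for s, p, _ in triples:
        out_props[s].add(p)
    classes = defaultdict(set)
    for s, props in out_props.items():
        classes[frozenset(props)].add(s)
    return dict(classes)

for props, members in one_bisimulation_classes(TRIPLES).items():
    print(sorted(props), sorted(members))
```

Both papers land in the same class although they point to different creators, because at depth 1 the targets are not distinguished any further.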
[Figure: index structure with labels BSi, P1, P2, ..., Pn and DS1, DS2, ..., DSm]
51. Quality of Approximated Index
Stream-based computation vs. brute force
Data set of 11 million triples
52. SchemEX @ BTC 2011
SchemEX
Allows complex queries (Star, Chain)
Scalable computation
High quality
Index over BTC 2011 data
2.17 billion triples
Index: 55 million triples
Commodity hardware
VM: 1 Core, 4 GB RAM
Throughput: 39,500 triples / second
Computation of full index: 15 h
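The reported figures are mutually consistent, which a few lines of arithmetic confirm. The short Python check below uses only the numbers on the slide; the variable names are chosen for this example.

```python
TOTAL_TRIPLES = 2.17e9    # BTC 2011 data, from the slide
INDEX_TRIPLES = 55e6      # index size, from the slide
THROUGHPUT = 39_500       # triples per second, from the slide

# Time for one pass over the data at the reported throughput.
hours = TOTAL_TRIPLES / THROUGHPUT / 3600
# Size of the index relative to the indexed data.
ratio = INDEX_TRIPLES / TOTAL_TRIPLES

print(f"{hours:.1f} h, index is {ratio:.1%} of the data")
```

A single pass therefore takes a little over 15 hours, in line with the stated computation time, and the index is roughly one fortieth of the size of the data.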
53. Future work wrt SchemEX
Further exploration of
• schema induction
• query federation
Federation vs Link Traversal based query execution
• Granularity of query execution
• Too fine grained: URI dereferencing
• Too expressive: SPARQL
• Sweet spot -> NPQL??
54. Agenda
LiteQ – Language integrated types, extensions
and queries for RDF graphs
Exploring
Programming, Typing
Evaluation of LITEQ (NPQL) against SPARQL
Understandability
Ease of use
SchemEX
Construction of schema-based index
Schema induction
55. Future
1. Searching for distributed data
2. Understanding distributed data
3. Intelligent queries on distributed data
4. Programming with distributed data
• Type reuse
• Type induction
56. Web Science & Technologies
University of Koblenz ▪ Landau, Germany
Thank you for your attention!