Pragmatic Semantics for the
Web of Data
AImWD -- Montpellier 2013
Stefan Schlobach
(based on work of and using slides from Christophe
Gueret, Kathrin Denthler and Wouter Beek)
VU Amsterdam
Postulates
• The Web of Data requires semantics
• The Web of Data is not a database
• The Web of Data is a complex system
• Semantics for a database are not (always)
suitable for complex systems
• We need new semantic paradigms
– Voila: Pragmatic Semantics
CLASSICAL SEMANTICS FOR THE
WEB OF DATA
Part 1
Linked Data
Graph/facts based knowledge representation
Connect resources to properties / other
resources
Web-based: resources have a URI
Try http://dbpedia.org/resource/Amsterdam
Model theory for Semantic Web
Languages: RDF, RDFS, OWL
• Ontology and Data: set of formulas S
• Model: formal structure satisfying all formulas
in S
• Entailment: formula f is entailed by S iff f is true
in all models of S
• If contradiction, no models…
• No models, everything is entailed.
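The definitions above can be illustrated with a minimal sketch for propositional formulas. It is only an illustration, not the RDF, RDFS or OWL model theory itself: formulas are modelled as Python predicates over truth assignments, and the atoms `author` and `person` are made-up names. It shows entailment as truth in all models, and that a contradictory set has no models and therefore entails everything.

```python
from itertools import product

def models(formulas, atoms):
    """Yield every truth assignment over `atoms` that satisfies all formulas."""
    for values in product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(f(world) for f in formulas):
            yield world

def entails(formulas, goal, atoms):
    """S entails goal iff goal is true in every model of S."""
    return all(goal(world) for world in models(formulas, atoms))

# Hypothetical atoms and formulas, chosen only for illustration.
atoms = ["author", "person"]
S = [
    lambda w: w["author"],                        # "x is an author"
    lambda w: (not w["author"]) or w["person"],   # "every author is a person"
]
person = lambda w: w["person"]
not_person = lambda w: not w["person"]

consistent_result = entails(S, person, atoms)

# Adding a contradiction leaves S without any model ...
S_bad = S + [not_person]
model_count = len(list(models(S_bad, atoms)))
# ... so both a formula and its negation are (vacuously) entailed.
explosion = entails(S_bad, person, atoms) and entails(S_bad, not_person, atoms)
```

Enumerating all assignments is exponential in the number of atoms, so this only works for toy inputs.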
THE WEB OF DATA AS A
COMPLEX SYSTEM
Part 2
Since 2006, people have been creating linked data
But publication and interpretation
are distributed processes.
The Web of Data is a Complex System.
Not a database.
It is a Marketplace of ideas.
Key observations
The Web of Data is more than the sum of
its triples – it's a Complex System
Different actors
Different scales
Dynamic
Evolution of the Web of Data
[Figure: the Web of Data in October 2007 compared with now]
The WoD is a complex system!
• Countless extremely heterogeneous datasets
o general-purpose datasets, such as DBpedia
o domain-oriented datasets, such as Bio2RDF
o government data, music data, geological data, social
network data, etc.
• Hundreds of billions of RDF triples
o Billions of links within the datasets
o More than Million links between the datasets
• Rich semantics embedded in the data
o data points are typed
o links are typed
o links are what make the statements useful
Information has impact on different scales
A new way of seeing the WoD
Consider the WoD as a network
Relevant (Network) Properties of WoD
• Average path length
• Degree distribution
• Strongly connected components
• Degree centrality
• Betweenness centrality
• Closeness centrality
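A minimal sketch of three of the properties listed above (average path length, degree centrality and closeness centrality), using breadth-first search on an undirected graph. The toy star topology below is an assumption made for illustration; it does not reflect the real links between these datasets.

```python
from collections import deque

def bfs_distances(graph, source):
    """Hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def average_path_length(graph):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for source in graph:
        for target, d in bfs_distances(graph, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0

def degree_centrality(graph):
    """Number of neighbours divided by the maximum possible number."""
    n = max(len(graph) - 1, 1)
    return {node: len(neigh) / n for node, neigh in graph.items()}

def closeness_centrality(graph):
    """Reachable nodes divided by the summed distance to them."""
    result = {}
    for node in graph:
        dist = bfs_distances(graph, node)
        total = sum(dist.values())
        result[node] = (len(dist) - 1) / total if total else 0.0
    return result

# Hypothetical star-shaped network: one hub linked to three other datasets.
toy = {
    "DBpedia": {"Geonames", "DBLP", "UniProt"},
    "Geonames": {"DBpedia"},
    "DBLP": {"DBpedia"},
    "UniProt": {"DBpedia"},
}
print(average_path_length(toy))
print(degree_centrality(toy)["DBpedia"])
print(closeness_centrality(toy)["Geonames"])
```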
Scales of observation of the WoD
1. Graph scale
Graph-scale WoD network
• Each dataset is a node
• Edges are weighted, directed connections
between the datasets
o if there is at least one triple having a subject
within dataset 1 and an object within dataset
2, then there is an edge between these two
datasets.
o the number of such triples is the weight of
the edge.
• 110 nodes with 350 edges
• Average path length is 2.16
• 50 components
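The edge definition above can be sketched as follows. This is a minimal illustration, not the code used for the reported figures: the mapping from a URI to its dataset by prefix (`PREFIXES`, `dataset_of`) and the `example` URIs are assumptions, and links inside a single dataset are skipped on the assumption that "dataset 1" and "dataset 2" are meant to be different.

```python
from collections import Counter

# Hypothetical prefix-to-dataset mapping, for illustration only.
PREFIXES = {"http://a.example/": "A", "http://b.example/": "B"}

def dataset_of(term):
    """Return the dataset a term belongs to, or None (e.g. for literals)."""
    for prefix, name in PREFIXES.items():
        if term.startswith(prefix):
            return name
    return None

def dataset_graph(triples):
    """Directed, weighted edges: weight = number of triples crossing datasets."""
    weights = Counter()
    for subject, _predicate, obj in triples:
        src, dst = dataset_of(subject), dataset_of(obj)
        if src is not None and dst is not None and src != dst:
            weights[(src, dst)] += 1
    return dict(weights)

triples = [
    ("http://a.example/x", "http://v.example/sameAs", "http://b.example/x"),
    ("http://a.example/y", "http://v.example/seeAlso", "http://b.example/y"),
    ("http://b.example/x", "http://v.example/sameAs", "http://a.example/x"),
    ("http://a.example/x", "http://v.example/label", "x"),                    # literal object
    ("http://a.example/x", "http://v.example/related", "http://a.example/y"), # internal link
]
edges = dataset_graph(triples)
print(edges)
```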
A degree of 7 is the critical
point after which the
network is no longer
scale-free.
Top central nodes
Node Value
DBpedia 0.332
DBLP Berlin 0.108
DBLP (RKB) 0.100
DBLP Hannover 0.097
FOAF profiles 0.075
Betweenness centrality
Node Value
DBpedia 0.762
Geonames 0.614
Drug Bank 0.576
Linked MDB 0.544
Flickr wrappr 0.526
Closeness centrality
Node Value
DBpedia 0.505
UniProt 0.266
DBLP (RKB) 0.266
ACM (RKB) 0.229
GeneID 0.211
Degree centrality
Every centrality has a specific meaning...
Scales of observation of the WoD
2. Triple scale
Triple-scale WoD network
• We took 10 million triples from the dataset crawled
from the WoD, provided by the Billion Triple Challenge
2009
• This "BTC" network is defined as G=(V, (E, L)), where
o V is a set of nodes, and each node is a URI or a
literal
o E is a set of edges
o L is a set of labels, each label characterising a
relation between nodes
• We applied a few strategies to aggregate data for
comparison.
Network                  Nodes  Edges  Average path length  Components
BTC                      605K   860K   2.15                 602K
BTC aggregated           14K    31K    2.80                 7K
BTC aggregated + filter  37     91     1.88                 17
Triple-scale network and its aggregations
• BTC aggregated: triples are aggregated by the
domain names
• BTC aggregated + filter: only domain names
shared with the graph-scale network
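The two aggregation strategies above can be sketched as follows. This is a minimal illustration under stated assumptions: the domain name is taken to be the host part of an http(s) URI, literals are dropped, links within one domain are kept, and the `example` URIs and the `keep` parameter are made up for the sketch.

```python
from collections import Counter
from urllib.parse import urlparse

def domain(term):
    """Host name of an http(s) URI, or None for literals and other terms."""
    parsed = urlparse(term)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return parsed.netloc
    return None

def aggregate_by_domain(triples, keep=None):
    """Collapse triples into weighted domain-to-domain edges.

    `keep` optionally restricts the result to a given set of domain names,
    mimicking the "aggregated + filter" variant.
    """
    edges = Counter()
    for s, _p, o in triples:
        ds, do = domain(s), domain(o)
        if ds is None or do is None:
            continue
        if keep is not None and (ds not in keep or do not in keep):
            continue
        edges[(ds, do)] += 1
    return edges

triples = [
    ("http://a.example/r/1", "http://v.example/p", "http://b.example/r/1"),
    ("http://a.example/r/2", "http://v.example/p", "http://b.example/r/9"),
    ("http://b.example/r/1", "http://v.example/p", "http://c.example/r/3"),
    ("http://a.example/r/1", "http://v.example/label", "one"),  # literal object
]
aggregated = aggregate_by_domain(triples)
filtered = aggregate_by_domain(triples, keep={"a.example", "b.example"})
print(dict(aggregated))
print(dict(filtered))
```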
Degree distribution
[Figure: degree distribution plots for BTC and BTC aggregated, showing a power-law distribution]
Monitoring and Improving the WoD
• Linked data is meant to be browsed, jumping from one
resource to another
• The presence of Hubs is critical for the paths
• Create alternate paths to be used in case of failure
Guéret, Groth, van Harmelen, Schlobach, "Finding the Achilles Heel of the Web of Data:
using network analysis for link-recommendation"
[Figure: example graph with the nodes Christophe, VU Amsterdam, Amsterdam and The Netherlands, connected by workIn and isLocatedIn links]
The links have explicit semantics, which yields implicit
links that are deduced through reasoning
Challenges:
• Multi-relational links
• FOAF (social networks + personal information)
• SIOC (relations characterising blogs)
• SWRC (describing research work)
• …
Different filters produce different networks
The centrality status of nodes changes w.r.t. the networks
• Dynamics
• Data will be continuously added and linked.
FORMAL INTERACTIONS WITH THE
WEB OF DATA
Part 3
Interacting with Linked Data
Common semantic paradigm
Common goals:
Completeness: all the answers
Soundness: only exact answers
When solutions do not (quite) fit the problem ...
Copyright: sfllaw (Flickr, image 222795669)
Motivation
In the context of Web data?
Issues with scale
Issues with lack of consistency
Issues with contextualised views over the World
Revise the goals
As many answers as possible (or needed)
Answers as accurate as possible (or needed)
From logic to optimisation
Optimise towards the revised goals
Need methods that cope with
uncertainty, context, noise, scale, ...
Nature inspired methods for
interacting with complex systems
• Advantageous properties
– Adaptation
– Simplicity
– Interactivity: Anytime, user in the loop
– Scalability and robustness
– Good for dealing with dynamic information
• Studied for different interaction types
Answering queries over the data
Copyright: jepoirrier (Flickr, image 829293711)
The problem
Match a graph pattern to the data
Most common approach
Join partial results for each edge of the query
Solving approaches
Logic-based
Find all the answers matching all of the query pattern
Optimisation
Find answers matching as much of the query as possible
Important implications of the optimisation
Only some of the answers will be found
Some of the answers found will be partially true
eRDF: An evolutionary algorithm under the hood
[Figure: eRDF architecture. Labels: Web of Data, Data Layer, Cache, SE1, SE2, SE3, candidate solutions, Offspring, Query Results; steps numbered 1 to 4]
Input
Set of property/value pairs
Initial Population
Randomly chosen to fit the query graph
Determining fitness by querying the Web of Data
Single assertions are sent to SPARQL endpoints
Selection
Fitness determines the best candidate, which is chosen as parent of the next generation
Create offspring
Loop:
Properties of eRDF
• Scalable
• Lean
• Robust
• Anytime
• Approximate
• Arbitrary SPARQL endpoints
• Join-free, so scaling to more endpoints is comparatively painless
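The steps described for eRDF (initial population, fitness by checking single assertions, selection of the best candidate, offspring creation, loop) can be sketched as below. This is a toy under stated assumptions, not the eRDF implementation: the data lives in an in-memory set instead of SPARQL endpoints, `ask` stands in for one request per assertion, and all resource names, the query and the parameters are hypothetical.

```python
import random

# Hypothetical toy data standing in for the Web of Data.
DATA = {
    ("alice", "type", "Person"), ("alice", "worksAt", "vu"),
    ("bob", "type", "Person"), ("bob", "worksAt", "uva"),
    ("carol", "type", "Person"),
    ("vu", "locatedIn", "amsterdam"), ("uva", "locatedIn", "amsterdam"),
}
RESOURCES = sorted({term for s, _, o in DATA for term in (s, o)})

# Graph pattern: a person working at an organisation located in amsterdam.
QUERY = [("?p", "type", "Person"), ("?p", "worksAt", "?org"),
         ("?org", "locatedIn", "amsterdam")]
VARIABLES = ["?p", "?org"]

def ask(triple):
    """Stand-in for sending a single assertion to an endpoint."""
    return triple in DATA

def instantiate(pattern, binding):
    """Replace variables in a triple pattern by their bound resources."""
    return tuple(binding.get(term, term) for term in pattern)

def fitness(binding):
    """Number of query edges that hold; no joins are needed."""
    return sum(ask(instantiate(pattern, binding)) for pattern in QUERY)

def mutate(binding, rng):
    """Offspring: copy the parent and rebind one randomly chosen variable."""
    child = dict(binding)
    child[rng.choice(VARIABLES)] = rng.choice(RESOURCES)
    return child

def evolve(generations=200, population_size=8, seed=0):
    rng = random.Random(seed)
    population = [{v: rng.choice(RESOURCES) for v in VARIABLES}
                  for _ in range(population_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        if fitness(best) == len(QUERY):   # exact answer found
            break
        offspring = [mutate(best, rng) for _ in range(population_size)]
        best = max(offspring + [best], key=fitness)
    return best, fitness(best)

best, score = evolve()
print(best, score)
```

Because the best candidate is available after every generation, the loop can be stopped at any time and still return a partial match, which is the anytime and approximate behaviour listed above.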
Some results
Tested on queries with
varied complexity
Works best with more
complex queries
Finds exact answers
when they exist
Finding implicit facts in the data
Copyright: givingnot@rocketmail.com (Flickr, image 6990161491)
The problem
Deduce new facts from others
Most common approach
Centralise all the facts, batch process deductions
Solving approaches
Logic-based
Find all the facts that can be derived from the data
Optimisation
Find as many facts as possible while preserving
consistency
Important implications of the optimisation
Only some of the facts will be found
Unstable content
An optimisation approach: Swarms
Swarm of micro-reasoners
Browse the graph, applying rules when possible
Deduced facts disappear after some time
Every author of a
paper is a person
Every person is
also an agent
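A minimal sketch of the swarm idea, using the two example rules above. It is an illustration under stated assumptions rather than the authors' system: each ant carries one rule and performs a random walk over the original links, the toy facts and the `cites` link (added so the graph is connected) are made up, and the decay of deduced facts is left out for brevity.

```python
import random

# Hypothetical toy facts.
FACTS = {
    ("paper1", "author", "alice"),
    ("paper1", "author", "bob"),
    ("paper2", "author", "carol"),
    ("paper2", "cites", "paper1"),
}

def rule_author_is_person(node, facts):
    """Every author of a paper is a person."""
    if any(p == "author" and o == node for _, p, o in facts):
        return (node, "type", "Person")
    return None

def rule_person_is_agent(node, facts):
    """Every person is also an agent."""
    if (node, "type", "Person") in facts:
        return (node, "type", "Agent")
    return None

def neighbours(facts):
    """Undirected adjacency lists built from the given triples."""
    adj = {}
    for s, _, o in facts:
        adj.setdefault(s, set()).add(o)
        adj.setdefault(o, set()).add(s)
    return {node: sorted(linked) for node, linked in adj.items()}

def run_swarm(facts, rules, ants_per_rule=3, steps=200, seed=0):
    rng = random.Random(seed)
    known = set(facts)
    adj = neighbours(facts)               # ants walk the original links only
    nodes = sorted(adj)
    ants = [[rule, rng.choice(nodes)] for rule in rules
            for _ in range(ants_per_rule)]
    derived = set()
    for _ in range(steps):
        for ant in ants:
            rule, position = ant
            new_fact = rule(position, known)   # apply the rule where it fits
            if new_fact is not None and new_fact not in known:
                known.add(new_fact)
                derived.add(new_fact)
            ant[1] = rng.choice(adj[position])  # move to a random neighbour
    return derived

derived = run_swarm(FACTS, [rule_author_is_person, rule_person_is_agent])
print(sorted(derived))
```

The second rule can only fire at a node after an ant carrying the first rule has been there, which is the precedence issue between rules that the results mention.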
Some results
If they stay, most of
the implicit facts
are derived
Ants need to follow
each other to deal
with precedence of
rules
Several ants per
rule are needed
Related findings and approaches
• Storage optimisation using swarms
(SwarmLinda from FU Berlin)
• Join optimisation with swarms
(RCQ-ACS Erasmus Rotterdam)
• Emergent Semantics
(eXascale Infolab Fribourg)
• Previous speaker (argumentation based
semantics)
The day Semantics died…. ?
PRAGMATIC SEMANTICS FOR THE
WEB OF DATA
Part 4
There is meaning in the structure
Requirements
• Standard languages
• Standard semantics still valid (for simple data)
• Integrate structural properties
– Popularity of nodes/triples
– “Distance” between triples
– Frequency of triples
Semantics not strict, but pragmatic
Intuitively: a statement made twenty times is more
true than a statement made once
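The frequency intuition can be sketched as a simple ordering over statements by how often they are asserted. This is only an illustration; the statements and counts below are made up.

```python
from collections import Counter

def frequency_ordering(statements):
    """Rank distinct statements by how often they were made (most first)."""
    counts = Counter(statements)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical observations: one statement made twenty times, one made once.
often = ("Amsterdam", "isLocatedIn", "The Netherlands")
once = ("Amsterdam", "isLocatedIn", "Belgium")
ranking = frequency_ordering([often] * 20 + [once])
print(ranking)
```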
Approach
• Entailment defined through optimality over
different (possibly competing) notions of truth
• Make as much information in the data explicit as possible,
and turn it into first-class semantic citizens
(truth orderings)
• Pragmatic entailment is defined through multi-
objective optimisation.
• Interoperability is then achieved by enriching an
ontology with meta-information about semantic
orderings, as well as agreement on the weighting
of orderings.
Subset based truth orderings
– the size of the minimal entailing subontology
– ratio of sub-models in which a formula is satisfied
versus the total number of sub-models
– ratio between sub-ontologies of O in which a
formula holds versus the number of all
sub-ontologies
Truth based on part of the given information
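Two of the subset-based orderings above (size of the minimal entailing sub-ontology, and the ratio of sub-ontologies in which a formula holds) can be sketched for a deliberately tiny logic. The assumptions are loud ones: an ontology is a set of subclass axioms `(A, B)`, entailment is plain reachability, and the class names are hypothetical. Enumerating all sub-ontologies is exponential, so this is only feasible for toy inputs.

```python
from itertools import combinations

def entails(axioms, goal):
    """Subclass entailment by reachability; goal = (sub, sup)."""
    sub, sup = goal
    reached, frontier = {sub}, [sub]
    while frontier:
        current = frontier.pop()
        for a, b in axioms:
            if a == current and b not in reached:
                reached.add(b)
                frontier.append(b)
    return sup in reached

def sub_ontologies(ontology):
    """All subsets of the ontology, generated by increasing size."""
    items = sorted(ontology)
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            yield subset

def support_ratio(ontology, goal):
    """Fraction of sub-ontologies that entail the goal."""
    subsets = list(sub_ontologies(ontology))
    return sum(entails(s, goal) for s in subsets) / len(subsets)

def minimal_entailing_size(ontology, goal):
    """Size of the smallest sub-ontology entailing the goal, or None."""
    for subset in sub_ontologies(ontology):
        if entails(subset, goal):
            return len(subset)
    return None

# Hypothetical ontology with one redundant axiom.
O = {("Author", "Person"), ("Person", "Agent"), ("Author", "Agent")}
print(support_ratio(O, ("Author", "Agent")))
print(minimal_entailing_size(O, ("Author", "Agent")))
```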
Graph-based truth orderings
• A shortest path ordering (diameter of the
induced sub-graphs). Such a notion is a proxy for
confidence of derivation.
• A random-walk distance or edge weights induce
orderings that are clustering-aware: sub-ontologies
entailing a formula have more cohesion than others.
• PageRank orderings can be used as proxies for
popularity
Truth based on the structure of the given information
Pragmatic Entailment
• A pragmatic closure C for an ontology O and
orderings f1 to fn is then a set of formulas that
is Pareto-optimal w.r.t. the optimisation
problem max {f1(C), …, fn(C)}.
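Pareto-optimality over several orderings can be sketched as follows. The candidate closures `C1` to `C4` and their two scores are invented for illustration, and each score is assumed to be something to maximise; how the orderings f1 to fn are actually computed is outside this sketch.

```python
def dominates(a, b):
    """a dominates b if it is at least as good everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(candidates):
    """Names of candidates whose score vector no other candidate dominates."""
    return sorted(
        name for name, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    )

# Hypothetical candidate closures scored by two orderings (f1, f2).
candidates = {
    "C1": (0.9, 0.2),
    "C2": (0.5, 0.5),
    "C3": (0.4, 0.4),   # dominated by C2
    "C4": (0.2, 0.9),
}
front = pareto_front(candidates)
print(front)
```

Several closures can be Pareto-optimal at once, which is why agreement on the weighting of orderings is needed to pick one.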
PraSem
• Project title : Pragmatic Semantics for the Web of
Data
• Acronym: PraSem
• Runtime: Nov 2012-Oct 2016
• Main researcher: Wouter Beek
• People involved: Stefan Schlobach, Christophe
Gueret, Kathrin Denthler, Pepijn Kroes, Frank van
Harmelen, and hopefully more people soon.
Deal with Open World Assumption
June 3, 2013 IS: Web of Data 66
Deal with incompleteness
Formalise approximations
Take home message
• The Web of Data requires semantics
• The Web of Data is not a database
• The Web of Data is a complex system
• Semantics for a database are not (always)
suitable for complex systems
• We need new semantic paradigms
– Voila: Pragmatic Semantics

Mais conteúdo relacionado

Mais procurados

A Soft Set-based Co-occurrence for Clustering Web User Transactions
A Soft Set-based Co-occurrence for Clustering Web User TransactionsA Soft Set-based Co-occurrence for Clustering Web User Transactions
A Soft Set-based Co-occurrence for Clustering Web User TransactionsTELKOMNIKA JOURNAL
 
Paper id 37201536
Paper id 37201536Paper id 37201536
Paper id 37201536IJRAT
 
Semantic Web, Ontology, and Ontology Learning: Introduction
Semantic Web, Ontology, and Ontology Learning: IntroductionSemantic Web, Ontology, and Ontology Learning: Introduction
Semantic Web, Ontology, and Ontology Learning: IntroductionKent State University
 
ontology based- data_integration.
ontology based- data_integration.ontology based- data_integration.
ontology based- data_integration.AliAlJadaa
 
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...Punit Sharnagat
 
Ontology For Data Integration
Ontology For Data IntegrationOntology For Data Integration
Ontology For Data Integrationjuanesteva
 
Clustering Algorithm with a Novel Similarity Measure
Clustering Algorithm with a Novel Similarity MeasureClustering Algorithm with a Novel Similarity Measure
Clustering Algorithm with a Novel Similarity MeasureIOSR Journals
 
Recent Trends in Incremental Clustering: A Review
Recent Trends in Incremental Clustering: A ReviewRecent Trends in Incremental Clustering: A Review
Recent Trends in Incremental Clustering: A ReviewIOSRjournaljce
 
Semantic Annotation of Documents
Semantic Annotation of DocumentsSemantic Annotation of Documents
Semantic Annotation of Documentssubash chandra
 
Data Mining and the Web_Past_Present and Future
Data Mining and the Web_Past_Present and FutureData Mining and the Web_Past_Present and Future
Data Mining and the Web_Past_Present and Futurefeiwin
 
IEEE Datamining 2016 Title and Abstract
IEEE  Datamining 2016 Title and AbstractIEEE  Datamining 2016 Title and Abstract
IEEE Datamining 2016 Title and Abstracttsysglobalsolutions
 
Weblog Extraction With Fuzzy Classification Methods
Weblog Extraction With Fuzzy Classification MethodsWeblog Extraction With Fuzzy Classification Methods
Weblog Extraction With Fuzzy Classification MethodsEdy Portmann
 
Ontology mapping for the semantic web
Ontology mapping for the semantic webOntology mapping for the semantic web
Ontology mapping for the semantic webWorawith Sangkatip
 
Hyponymy extraction of domain ontology
Hyponymy extraction of domain ontologyHyponymy extraction of domain ontology
Hyponymy extraction of domain ontologyIJwest
 
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect match
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect matchLinked Open (Geo)Data and the Distributed Ontology Language – a perfect match
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect matchChristoph Lange
 
Ethics & (Explainable) AI – Semantic AI & the Role of the Knowledge Scientist
Ethics & (Explainable) AI – Semantic AI & the Role of the Knowledge ScientistEthics & (Explainable) AI – Semantic AI & the Role of the Knowledge Scientist
Ethics & (Explainable) AI – Semantic AI & the Role of the Knowledge ScientistStratos Kontopoulos
 
A little more semantics goes a lot further!  Getting more out of Linked Data ...
A little more semantics goes a lot further!  Getting more out of Linked Data ...A little more semantics goes a lot further!  Getting more out of Linked Data ...
A little more semantics goes a lot further!  Getting more out of Linked Data ...Michel Dumontier
 

Mais procurados (20)

A Soft Set-based Co-occurrence for Clustering Web User Transactions
A Soft Set-based Co-occurrence for Clustering Web User TransactionsA Soft Set-based Co-occurrence for Clustering Web User Transactions
A Soft Set-based Co-occurrence for Clustering Web User Transactions
 
Paper id 37201536
Paper id 37201536Paper id 37201536
Paper id 37201536
 
Semantic Web, Ontology, and Ontology Learning: Introduction
Semantic Web, Ontology, and Ontology Learning: IntroductionSemantic Web, Ontology, and Ontology Learning: Introduction
Semantic Web, Ontology, and Ontology Learning: Introduction
 
ontology based- data_integration.
ontology based- data_integration.ontology based- data_integration.
ontology based- data_integration.
 
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
 
Ontology For Data Integration
Ontology For Data IntegrationOntology For Data Integration
Ontology For Data Integration
 
Clustering Algorithm with a Novel Similarity Measure
Clustering Algorithm with a Novel Similarity MeasureClustering Algorithm with a Novel Similarity Measure
Clustering Algorithm with a Novel Similarity Measure
 
Recent Trends in Incremental Clustering: A Review
Recent Trends in Incremental Clustering: A ReviewRecent Trends in Incremental Clustering: A Review
Recent Trends in Incremental Clustering: A Review
 
Semantic Annotation of Documents
Semantic Annotation of DocumentsSemantic Annotation of Documents
Semantic Annotation of Documents
 
Data Mining and the Web_Past_Present and Future
Data Mining and the Web_Past_Present and FutureData Mining and the Web_Past_Present and Future
Data Mining and the Web_Past_Present and Future
 
IEEE Datamining 2016 Title and Abstract
IEEE  Datamining 2016 Title and AbstractIEEE  Datamining 2016 Title and Abstract
IEEE Datamining 2016 Title and Abstract
 
Weblog Extraction With Fuzzy Classification Methods
Weblog Extraction With Fuzzy Classification MethodsWeblog Extraction With Fuzzy Classification Methods
Weblog Extraction With Fuzzy Classification Methods
 
Dagstuhl 2013 - Montali - On the Relationship between OBDA and Relational Map...
Dagstuhl 2013 - Montali - On the Relationship between OBDA and Relational Map...Dagstuhl 2013 - Montali - On the Relationship between OBDA and Relational Map...
Dagstuhl 2013 - Montali - On the Relationship between OBDA and Relational Map...
 
Ontology mapping for the semantic web
Ontology mapping for the semantic webOntology mapping for the semantic web
Ontology mapping for the semantic web
 
Hyponymy extraction of domain ontology
Hyponymy extraction of domain ontologyHyponymy extraction of domain ontology
Hyponymy extraction of domain ontology
 
IP address anonymization
IP address anonymizationIP address anonymization
IP address anonymization
 
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect match
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect matchLinked Open (Geo)Data and the Distributed Ontology Language – a perfect match
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect match
 
Ethics & (Explainable) AI – Semantic AI & the Role of the Knowledge Scientist
Ethics & (Explainable) AI – Semantic AI & the Role of the Knowledge ScientistEthics & (Explainable) AI – Semantic AI & the Role of the Knowledge Scientist
Ethics & (Explainable) AI – Semantic AI & the Role of the Knowledge Scientist
 
Towards Integrating Ontologies An EDM-Based Approach
Towards Integrating Ontologies An EDM-Based ApproachTowards Integrating Ontologies An EDM-Based Approach
Towards Integrating Ontologies An EDM-Based Approach
 
A little more semantics goes a lot further!  Getting more out of Linked Data ...
A little more semantics goes a lot further!  Getting more out of Linked Data ...A little more semantics goes a lot further!  Getting more out of Linked Data ...
A little more semantics goes a lot further!  Getting more out of Linked Data ...
 

Destaque

Knowledge Management @ SEMANTiCS
Knowledge Management @ SEMANTiCSKnowledge Management @ SEMANTiCS
Knowledge Management @ SEMANTiCSAndreas Matern
 
Extracting Meaning from Wikipedia
Extracting Meaning from WikipediaExtracting Meaning from Wikipedia
Extracting Meaning from WikipediaOfer Egozi
 
How Semantic Technologies Supercharge a Platform for Context-Aware Applications
How Semantic Technologies Supercharge a Platform for Context-Aware ApplicationsHow Semantic Technologies Supercharge a Platform for Context-Aware Applications
How Semantic Technologies Supercharge a Platform for Context-Aware ApplicationsDavid Damen
 
A Feature-Complete Petri Net Semantics for WS-BPEL 2.0
A Feature-Complete Petri Net Semantics for WS-BPEL 2.0A Feature-Complete Petri Net Semantics for WS-BPEL 2.0
A Feature-Complete Petri Net Semantics for WS-BPEL 2.0Universität Rostock
 
Building Executable Biological Pathway Models Automatically from BioPAX
Building Executable Biological Pathway Models Automatically from BioPAX  Building Executable Biological Pathway Models Automatically from BioPAX
Building Executable Biological Pathway Models Automatically from BioPAX Paul Groth
 
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An IntroductionLinking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An IntroductionRonald Ashri
 
Clinical Trials in Emerging Markets
Clinical Trials in Emerging MarketsClinical Trials in Emerging Markets
Clinical Trials in Emerging MarketsArena International
 
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting RatingsSemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings Matthew Rowe
 

Destaque (9)

Knowledge Management @ SEMANTiCS
Knowledge Management @ SEMANTiCSKnowledge Management @ SEMANTiCS
Knowledge Management @ SEMANTiCS
 
Extracting Meaning from Wikipedia
Extracting Meaning from WikipediaExtracting Meaning from Wikipedia
Extracting Meaning from Wikipedia
 
How Semantic Technologies Supercharge a Platform for Context-Aware Applications
How Semantic Technologies Supercharge a Platform for Context-Aware ApplicationsHow Semantic Technologies Supercharge a Platform for Context-Aware Applications
How Semantic Technologies Supercharge a Platform for Context-Aware Applications
 
Semantic Computing in Real-World: Vertical and Horizontal application
Semantic Computing in Real-World: Vertical and Horizontal applicationSemantic Computing in Real-World: Vertical and Horizontal application
Semantic Computing in Real-World: Vertical and Horizontal application
 
A Feature-Complete Petri Net Semantics for WS-BPEL 2.0
A Feature-Complete Petri Net Semantics for WS-BPEL 2.0A Feature-Complete Petri Net Semantics for WS-BPEL 2.0
A Feature-Complete Petri Net Semantics for WS-BPEL 2.0
 
Building Executable Biological Pathway Models Automatically from BioPAX
Building Executable Biological Pathway Models Automatically from BioPAX  Building Executable Biological Pathway Models Automatically from BioPAX
Building Executable Biological Pathway Models Automatically from BioPAX
 
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An IntroductionLinking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
 
Clinical Trials in Emerging Markets
Clinical Trials in Emerging MarketsClinical Trials in Emerging Markets
Clinical Trials in Emerging Markets
 
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting RatingsSemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
SemanticSVD++: Incorporating Semantic Taste Evolution for Predicting Ratings
 

Semelhante a Keynote at AImWD

Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...giuseppe_futia
 
The Future of Semantics on the Web
The Future of Semantics on the WebThe Future of Semantics on the Web
The Future of Semantics on the WebJohn Domingue
 
Reflections on Almost Two Decades of Research into Stream Processing
Reflections on Almost Two Decades of Research into Stream ProcessingReflections on Almost Two Decades of Research into Stream Processing
Reflections on Almost Two Decades of Research into Stream ProcessingKyumars Sheykh Esmaili
 
Detection of Related Semantic Datasets Based on Frequent Subgraph Mining
Detection of Related Semantic Datasets Based on Frequent Subgraph MiningDetection of Related Semantic Datasets Based on Frequent Subgraph Mining
Detection of Related Semantic Datasets Based on Frequent Subgraph MiningMikel Emaldi Manrique
 
Introduction: Linked Data and the Semantic Web
Introduction: Linked Data and the Semantic WebIntroduction: Linked Data and the Semantic Web
Introduction: Linked Data and the Semantic WebNikolaos Konstantinou
 
semantic integration.ppt
semantic integration.pptsemantic integration.ppt
semantic integration.pptNaglaaFathy42
 
2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 Tutorial2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 TutorialAlexander Pico
 
Relationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine LearningRelationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine LearningNeo4j
 
From Pipelines to Refineries: Scaling Big Data Applications
From Pipelines to Refineries: Scaling Big Data ApplicationsFrom Pipelines to Refineries: Scaling Big Data Applications
From Pipelines to Refineries: Scaling Big Data ApplicationsDatabricks
 
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...San Diego Supercomputer Center
 
Graph Analytics: Graph Algorithms Inside Neo4j
Graph Analytics: Graph Algorithms Inside Neo4jGraph Analytics: Graph Algorithms Inside Neo4j
Graph Analytics: Graph Algorithms Inside Neo4jNeo4j
 
network mining and representation learning
network mining and representation learningnetwork mining and representation learning
network mining and representation learningsun peiyuan
 
Azure Databricks for Data Scientists
Azure Databricks for Data ScientistsAzure Databricks for Data Scientists
Azure Databricks for Data ScientistsRichard Garris
 
Comparing Big Data and Simulation Applications and Implications for Software ...
Comparing Big Data and Simulation Applications and Implications for Software ...Comparing Big Data and Simulation Applications and Implications for Software ...
Comparing Big Data and Simulation Applications and Implications for Software ...Geoffrey Fox
 
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014James Powell
 
03 interlinking-dass
03 interlinking-dass03 interlinking-dass
03 interlinking-dassDiego Pessoa
 
NS-CUK Seminar: H.B.Kim, Review on "metapath2vec: Scalable representation le...
NS-CUK Seminar: H.B.Kim,  Review on "metapath2vec: Scalable representation le...NS-CUK Seminar: H.B.Kim,  Review on "metapath2vec: Scalable representation le...
NS-CUK Seminar: H.B.Kim, Review on "metapath2vec: Scalable representation le...ssuser4b1f48
 

Semelhante a Keynote at AImWD (20)

Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'int...
 
ECCS 2010
ECCS 2010ECCS 2010
ECCS 2010
 
The Future of Semantics on the Web
The Future of Semantics on the WebThe Future of Semantics on the Web
The Future of Semantics on the Web
 
Reflections on Almost Two Decades of Research into Stream Processing
Reflections on Almost Two Decades of Research into Stream ProcessingReflections on Almost Two Decades of Research into Stream Processing
Reflections on Almost Two Decades of Research into Stream Processing
 
Detection of Related Semantic Datasets Based on Frequent Subgraph Mining
Detection of Related Semantic Datasets Based on Frequent Subgraph MiningDetection of Related Semantic Datasets Based on Frequent Subgraph Mining
Detection of Related Semantic Datasets Based on Frequent Subgraph Mining
 
Introduction: Linked Data and the Semantic Web
Introduction: Linked Data and the Semantic WebIntroduction: Linked Data and the Semantic Web
Introduction: Linked Data and the Semantic Web
 
semantic integration.ppt
semantic integration.pptsemantic integration.ppt
semantic integration.ppt
 
2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 Tutorial2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 Tutorial
 
Relationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine LearningRelationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine Learning
 
From Pipelines to Refineries: Scaling Big Data Applications
From Pipelines to Refineries: Scaling Big Data ApplicationsFrom Pipelines to Refineries: Scaling Big Data Applications
From Pipelines to Refineries: Scaling Big Data Applications
 
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
 
Graph Analytics: Graph Algorithms Inside Neo4j
Graph Analytics: Graph Algorithms Inside Neo4jGraph Analytics: Graph Algorithms Inside Neo4j
Graph Analytics: Graph Algorithms Inside Neo4j
 
network mining and representation learning
network mining and representation learningnetwork mining and representation learning
network mining and representation learning
 
Azure Databricks for Data Scientists
Azure Databricks for Data ScientistsAzure Databricks for Data Scientists
Azure Databricks for Data Scientists
 
Comparing Big Data and Simulation Applications and Implications for Software ...
Comparing Big Data and Simulation Applications and Implications for Software ...Comparing Big Data and Simulation Applications and Implications for Software ...
Comparing Big Data and Simulation Applications and Implications for Software ...
 
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014
EgoSystem: Presentation to LITA, American Library Association, Nov 8 2014
 
03 interlinking-dass
03 interlinking-dass03 interlinking-dass
03 interlinking-dass
 
NS-CUK Seminar: H.B.Kim, Review on "metapath2vec: Scalable representation le...
NS-CUK Seminar: H.B.Kim,  Review on "metapath2vec: Scalable representation le...NS-CUK Seminar: H.B.Kim,  Review on "metapath2vec: Scalable representation le...
NS-CUK Seminar: H.B.Kim, Review on "metapath2vec: Scalable representation le...
 
Metadata Mapping & Crosswalks
Metadata Mapping & CrosswalksMetadata Mapping & Crosswalks
Metadata Mapping & Crosswalks
 
lec01.ppt
lec01.pptlec01.ppt
lec01.ppt
 


Keynote at AImWD

  • 1. Pragmatic Semantics for the Web of Data AImWD -- Montpellier 2013 Stefan Schlobach (based on work of and using slides from Christophe Gueret, Kathrin Denthler and Wouter Beek) VU Amsterdam
  • 2. Postulates • The Web of Data requires semantics • The Web of Data is not a database • The Web of Data is a complex system • Semantics for a database are not (always) suitable for complex systems • We need new semantic paradigms – Voila: Pragmatic Semantics
  • 3. CLASSICAL SEMANTICS FOR THE WEB OF DATA Part1
  • 4. Linked Data Graph/facts based knowledge representation Connect resources to properties / other resources Web-based: resources have a URI Try http://dbpedia.org/resource/Amsterdam
  • 5. Model theory for Semantic Web Languages: RDF, RDFS, OWL • Ontology and Data: set of formulas S • Model: formal structure satisfying all formulas in S • Entailment: formula f is entailed by S iff f is true in all models of S • If there is a contradiction, there are no models… • With no models, everything is entailed.
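The entailment definition on this slide can be illustrated with a minimal sketch. It uses plain propositional formulas rather than RDF, RDFS or OWL, and the helper names `models` and `entails` are illustrative assumptions, not part of the talk. It enumerates all truth assignments, keeps those that satisfy S, and checks the query in each; with an inconsistent S there are no models, so every formula is (vacuously) entailed.

```python
from itertools import product

def models(formulas, atoms):
    """Yield every truth assignment over `atoms` that satisfies all formulas."""
    for values in product([False, True], repeat=len(atoms)):
        assignment = dict(zip(atoms, values))
        if all(f(assignment) for f in formulas):
            yield assignment

def entails(formulas, query, atoms):
    """S entails f iff f is true in all models of S (vacuously so if S has none)."""
    return all(query(m) for m in models(formulas, atoms))

atoms = ["p", "q"]
# S = {p, p -> q}; formulas are modelled as functions from assignment to bool.
S = [lambda a: a["p"], lambda a: (not a["p"]) or a["q"]]
# Adding "not q" makes the set contradictory.
inconsistent = S + [lambda a: not a["q"]]

print(entails(S, lambda a: a["q"], atoms))              # q follows from S
print(list(models(inconsistent, atoms)))                # no models at all
print(entails(inconsistent, lambda a: False, atoms))    # even "false" is entailed
```

Brute-force enumeration only works for a handful of atoms; it is meant to make the definition concrete, not to suggest how Semantic Web reasoners operate.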
  • 6. THE WEB OF DATA AS A COMPLEX SYSTEM Part2
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12. Since 2006, people have been creating linked data. But publication and interpretation are distributed processes. The Web of Data is a Complex System. Not a database. It is a Marketplace of ideas.
  • 13. Key observations The Web of Data is more than the sum of its triples – it's a Complex System Different actors Different scales Dynamic
  • 15. Evolution of the Web of Data Now
  • 16. The WoD is a complex system! • Countless extremely heterogeneous datasets o general-purpose datasets, such as DBpedia o domain-oriented datasets, such as Bio2RDF o government data, music data, geological data, social network data, etc. • Hundreds of billions of RDF triples o Billions of links within the datasets o More than Million links between the datasets • Embedded rich semantics in the data o data points are typed o links are typed o links are what make the statements useful • Information has impact on different scales
  • 17. A new way of seeing the WoD Consider the WoD as network
  • 18. Relevant (Network) Properties of WoD • Average path length • Degree distribution • Strongly connected components • Degree centrality • Betweenness centrality • Closeness centrality
  • 19. Scales of observation of the WoD 1. Graph scale
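Two of the listed properties can be computed with a short standard-library sketch. The graph below is a hypothetical undirected star, not data from the talk, and dividing the degree by n - 1 is one common normalisation convention assumed here.

```python
from collections import deque

def degree_centrality(graph):
    """Number of neighbours of each node, divided by the maximum possible (n - 1)."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}

def average_path_length(graph):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for source in graph:
        # Breadth-first search gives shortest hop counts from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

# Hypothetical toy network: one hub linked to three otherwise unconnected nodes.
star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}

print(degree_centrality(star))
print(average_path_length(star))
```

In the star, the hub reaches every node in one hop while the leaves need two hops to reach each other, which is why a hub shortens paths and why its failure matters.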
  • 20. Graph-scale WoD network • Each dataset is a node • Edges are weighted, directed connections between the datasets o if there is at least one triple having a subject within dataset 1 and an object within dataset 2, then there is an edge between these two datasets. o the number of such triples is the weight of the edge.
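The construction on this slide can be sketched as follows. It is a minimal illustration under stated assumptions: a dataset is identified by the host part of a URI (the helper `dataset_of` is a hypothetical stand-in for however datasets were actually delimited), objects that are not HTTP URIs are treated as literals and skipped, and the `.example` URIs are invented sample data.

```python
from collections import Counter
from urllib.parse import urlparse

def dataset_of(uri):
    """Hypothetical stand-in: identify a dataset by the host part of the URI."""
    return urlparse(uri).netloc

def graph_scale_network(triples):
    """Return {(source_dataset, target_dataset): weight} for cross-dataset triples."""
    weights = Counter()
    for subject, _predicate, obj in triples:
        if not obj.startswith("http"):
            continue  # literals do not link to another dataset
        src, dst = dataset_of(subject), dataset_of(obj)
        if src != dst:
            weights[(src, dst)] += 1  # the weight is the number of such triples
    return dict(weights)

SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
TRIPLES = [
    ("http://dataset1.example/Amsterdam", SAME_AS, "http://dataset2.example/place/1"),
    ("http://dataset1.example/Amsterdam", "http://dataset1.example/country",
     "http://dataset1.example/Netherlands"),
    ("http://dataset1.example/Netherlands", SAME_AS, "http://dataset2.example/place/2"),
    ("http://dataset1.example/Amsterdam",
     "http://www.w3.org/2000/01/rdf-schema#label", "Amsterdam"),
]

print(graph_scale_network(TRIPLES))
```

Only the two `sameAs` triples cross a dataset boundary, so the sketch yields a single directed edge of weight 2.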
  • 21. • 110 nodes with 350 edges • Average path length is 2.16 • 50 components
  • 22. A degree of 7 is the critical point beyond which the network is no longer scale-free.
  • 23. Top central nodes (node: value) • Betweenness centrality: DBpedia 0.332; DBLP Berlin 0.108; DBLP (RKB) 0.100; DBLP Hannover 0.097; FOAF profiles 0.075 • Closeness centrality: DBpedia 0.762; Geonames 0.614; Drug Bank 0.576; Linked MDB 0.544; Flickr wrappr 0.526 • Degree centrality: DBpedia 0.505; UniProt 0.266; DBLP (RKB) 0.266; ACM (RKB) 0.229; GeneID 0.211 Every centrality has a specific meaning...
  • 24. Scales of observation of the WoD 2. Triple scale
  • 25. Triple-scale WoD network • We took 10 million triples from the dataset crawled from the WoD and provided by the Billion Triple Challenge 2009 • This "BTC" network is defined as G=(V, (E, L)), where o V is a set of nodes, and each node is a URI or a literal o E is a set of edges o L is a set of labels, each label characterising a relation between nodes • We applied a few strategies to aggregate data for comparison.
  • 26. Triple-scale network and its aggregations Network | Nodes | Edges | Average path length | Components; BTC | 605K | 860K | 2.15 | 602K; BTC aggregated | 14K | 31K | 2.80 | 7K; BTC aggregated + filter | 37 | 91 | 1.88 | 17 • BTC aggregated: triples are aggregated by domain name • BTC aggregated + filter: only domain names shared with the graph-scale network
  • 27. Degree distribution [plots: BTC, BTC aggregated] Power-law distribution
  • 28. Monitoring and Improving the WoD • Linked data is meant to be browsed, jumping from one resource to another • The presence of Hubs is critical for the paths • Create alternate paths to be used in case of failure Guéret, Groth, van Harmelen, Schlobach, "Finding the Achilles Heel of the Web of Data: using network analysis for link-recommendation”
  • 29. [Diagram: nodes Christophe, VU Amsterdam, Amsterdam and The Netherlands, connected by edges labelled workIn and isLocatedIn] The links have explicit semantics, which yield implicit links deduced by the reasoning process
  • 30. Challenges: • Multi-relation links • FOAF (social networks + personal information) • SIOC (relations characterising blogs) • SWRC (describing research work) • … Different filters produce different networks Centrality status of nodes changes w.r.t. the networks • Dynamics • Data will be continuously added and linked.
  • 31. FORMAL INTERACTIONS WITH THE WEB OF DATA Part3
  • 32. Interacting with Linked Data Common semantic paradigm Common goals: Completeness: all the answers Soundness: only exact answers
  • 33. When solutions do not (quite) fit the problem ... Copyright: sfllaw (Flickr, image 222795669)
  • 34. Motivation In the context of Web data? Issues with scale Issues with lack of consistency Issues with contextualised views over the World Revise the goals As many answers as possible (or needed) Answers as accurate as possible (or needed)
  • 35. From logic to optimisation Optimise towards the revised goals Need methods that cope with uncertainty, context, noise, scale, ...
  • 36. Nature inspired methods for interacting with complex systems • Advantageous properties – Adaptation – Simplicity – Interactivity: Anytime, user in the loop – Scalability and robustness – Good for dealing with dynamic information • Studied for different interaction types
  • 37. Answering queries over the data Copyright: jepoirrier (Flickr, image 829293711)
  • 38. The problem Match a graph pattern to the data Most common approach Join partial results for each edge of the query
  • 39. Solving approaches Logic-based Find all the answers matching all of the query pattern Optimisation Find answers matching as much of the query as possible Important implications of the optimisation Only some of the answers will be found Some of the answers found will be partially true
  • 40.
  • 41. ERDF: An evolutionary algorithm under the hood [Diagram: Query, candidate solutions, Offspring, Results, Cache, Data Layer with SE1, SE2 and SE3, Web of Data; steps numbered 1 to 4] Input: set of property/value pairs
  • 42. ERDF: An evolutionary algorithm under the hood [same diagram] Initial population: randomly chosen to fit the query graph
  • 43. ERDF: An evolutionary algorithm under the hood [same diagram] Determining fitness by querying the Web of Data: single assertions are sent to SPARQL endpoints
  • 44. ERDF: An evolutionary algorithm under the hood [same diagram] Selection: fitness determines the best candidate, which is chosen as parent of the next generation. Create offspring. Loop.
  • 45. ERDF: An evolutionary algorithm under the hood [same diagram]
  • 46. ERDF: An evolutionary algorithm under the hood [same diagram]
  • 47. Properties of eRDF • Scalable • Lean • Robust • Anytime • Approximate • Arbitrary SPARQL endpoints • Join-free, so scaling to more endpoints is comparatively painless
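The evolutionary loop described for eRDF (random initial candidate, fitness from single assertions, best candidate as parent, offspring by variation) can be sketched as below. This is a toy under stated assumptions, not the eRDF implementation: a local set of facts stands in for the SPARQL endpoints, the facts are hypothetical and loosely modelled on the workIn/isLocatedIn example, and the function names, mutation scheme and parameters are all invented for illustration.

```python
import random

# Hypothetical facts standing in for what SPARQL endpoints would confirm.
DATA = {
    ("Christophe", "workIn", "VU"),
    ("VU", "isLocatedIn", "Amsterdam"),
    ("Amsterdam", "isLocatedIn", "Netherlands"),
}
RESOURCES = sorted({term for s, _, o in DATA for term in (s, o)})
QUERY = [("?who", "workIn", "?org"), ("?org", "isLocatedIn", "Amsterdam")]
VARIABLES = sorted({t for pattern in QUERY for t in pattern if t.startswith("?")})

def fitness(binding):
    """Count query patterns whose instantiation is asserted in the data.
    Each pattern is checked on its own, so no join is ever computed."""
    return sum(
        tuple(binding.get(term, term) for term in pattern) in DATA
        for pattern in QUERY
    )

def evolve(generations=50, offspring=8, seed=0):
    """Keep the fittest candidate as parent; mutate one variable per child."""
    rng = random.Random(seed)
    parent = {v: rng.choice(RESOURCES) for v in VARIABLES}
    for _ in range(generations):
        children = []
        for _ in range(offspring):
            child = dict(parent)
            child[rng.choice(VARIABLES)] = rng.choice(RESOURCES)
            children.append(child)
        parent = max(children + [parent], key=fitness)
        if fitness(parent) == len(QUERY):
            break  # every pattern is satisfied: an exact answer
    return parent

best = evolve()
print(best, fitness(best))
```

Because the parent always competes with its children, fitness never decreases, and the loop can be stopped at any generation with the best partial answer found so far, which is the anytime and approximate behaviour the slide lists.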
  • 48. Some results Tested on queries with varied complexity Works best with more complex queries Finds exact answers when they exist
  • 49. Finding implicit facts in the data Copyright: givingnot@rocketmail.com (Flickr, image 6990161491)
  • 50. The problem Deduce new facts from others Most common approach Centralise all the facts, batch process deductions
  • 51. Solving approaches Logic-based Find all the facts that can be derived from the data Optimisation Find as many facts as possible while preserving consistency Important implications of the optimisation Only some of the facts will be found Unstable content
  • 52.
  • 53. An optimisation approach: Swarms Swarm of micro-reasoners Browse the graph, applying rules when possible Deduced facts disappear after some time Every author of a paper is a person Every person is also an agent
  • 54. Some results If they stay, most of the implicit facts are derived Ants need to follow each other to deal with precedence of rules Several ants per rule are needed
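The swarm idea can be sketched in a few lines. This is a simplified illustration, not the system from the talk: each "ant" carries one rule and inspects one randomly chosen triple per tick instead of walking the graph, and the names `swarm_step`, `authorOf`, `alice` and `paper1`, as well as the time-to-live of 5 ticks, are assumptions.

```python
import random

def author_is_person(triple):
    """Every author of a paper is a person."""
    s, p, _o = triple
    return (s, "rdf:type", "Person") if p == "authorOf" else None

def person_is_agent(triple):
    """Every person is also an agent."""
    s, p, o = triple
    return (s, "rdf:type", "Agent") if (p, o) == ("rdf:type", "Person") else None

def swarm_step(facts, derived, ants, rng, ttl=5):
    """One tick: age the derived facts, then let each ant inspect one random triple."""
    for fact in list(derived):
        derived[fact] -= 1
        if derived[fact] <= 0:
            del derived[fact]  # deduced facts disappear after some time
    visible = sorted(facts | set(derived))
    for rule in ants:
        conclusion = rule(rng.choice(visible))
        if conclusion is not None and conclusion not in facts:
            derived[conclusion] = ttl  # (re)deposit the fact with a fresh lifetime

facts = {("alice", "authorOf", "paper1")}
derived = {}
rng = random.Random(0)
ants = [author_is_person, person_is_agent] * 3  # several ants per rule
for _ in range(20):
    swarm_step(facts, derived, ants, rng)
print(derived)
```

The second rule can only fire on a fact deposited by the first, so an ant carrying it is useful only where another ant has been recently; this matches the observation that ants need to follow each other to handle rule precedence, and the expiry explains why the derived content is unstable.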
  • 55. Related findings and approaches • Storage optimisation using swarms (SwarmLinda from FU Berlin) • Join optimisation with swarms (RCQ-ACS Erasmus Rotterdam) • Emergent Semantics (eXascale Infolab Fribourg) • Previous speaker (argumentation based semantics)
  • 56.
  • 57. The day Semantics died…. ? AImWD -- Montpellier 2013 Stefan Schlobach (based on work of and using slides from Christophe Gueret, Kathrin Denthler and Wouter Beek) VU Amsterdam
  • 58. PRAGMATIC SEMANTICS FOR THE WEB OF DATA Part4
  • 59. There is meaning in the structure
  • 60. Requirements • Standard languages • Standard semantics still valid (for simple data) • Integrate structural properties – Popularity of nodes/triples – “Distance” between triples – Frequency of triples Semantics not strict, but pragmatic Intuitively: a statement made twenty times is more true than a statement made once
  • 61. Approach • Entailment defined through optimality over different (possibly competing) notions of truth • Make as much information in the data explicit as possible, and turn it into first-class semantic citizens (truth orderings) • Pragmatic entailment is defined through multi-objective optimisation. • Interoperability is then achieved by enriching an ontology with meta-information about semantic orderings, as well as agreement on the weighting of orderings.
  • 62. Subset-based truth orderings – the size of the minimal entailing sub-ontology – ratio of sub-models in which a formula is satisfied versus the total number of sub-models – ratio between the number of sub-ontologies of O in which a formula holds and the number of all sub-ontologies Truth based on part of the given information
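Two of these subset-based measures can be worked through on a tiny example. The sketch below is an assumption-laden illustration: the ontology consists only of subclass axioms (pairs of class names), entailment is plain transitive reachability, every subset of axioms counts as a sub-ontology, and the function names are hypothetical.

```python
from itertools import combinations

def entails_subclass(axioms, sub, sup):
    """Check whether `sub` is a subclass of `sup` by following axioms transitively."""
    reachable, frontier = {sub}, [sub]
    while frontier:
        current = frontier.pop()
        for a, b in axioms:
            if a == current and b not in reachable:
                reachable.add(b)
                frontier.append(b)
    return sup in reachable

def sub_ontologies(ontology):
    """All subsets of the axiom set, from the empty set up to the full ontology."""
    axioms = sorted(ontology)
    return [c for r in range(len(axioms) + 1) for c in combinations(axioms, r)]

def support_ratio(ontology, sub, sup):
    """Share of sub-ontologies in which the subclass statement holds."""
    subsets = sub_ontologies(ontology)
    return sum(entails_subclass(s, sub, sup) for s in subsets) / len(subsets)

def minimal_entailing_size(ontology, sub, sup):
    """Size of the smallest sub-ontology entailing the statement, or None."""
    sizes = [len(s) for s in sub_ontologies(ontology) if entails_subclass(s, sub, sup)]
    return min(sizes) if sizes else None

O = {("Author", "Person"), ("Person", "Agent"), ("Author", "Agent")}
print(support_ratio(O, "Author", "Agent"))            # 5 of 8 sub-ontologies
print(support_ratio(O, "Author", "Person"))           # 4 of 8 sub-ontologies
print(minimal_entailing_size(O, "Author", "Agent"))   # one axiom suffices
```

The statement that is both asserted directly and derivable by chaining holds in more sub-ontologies than the one stated only once, so the ratio ranks it as "more true". Enumerating all subsets is exponential and only feasible for toy inputs.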
  • 63. Graph-based truth orderings • A shortest path ordering (diameter of the induced sub-graphs). Such a notion is a proxy for confidence of derivation. • A random-walk distance or edge-weights induce orderings that are clustering-aware, with sub-ontologies entailing a formula having more cohesion than others. • PageRank orderings can be used as proxies for popularity Truth based on the structure of the given information
  • 64. Pragmatic Entailment • A pragmatic closure C for an ontology O and orderings $f_1$ to $f_n$ is then a set of formulas that is Pareto-optimal w.r.t. the optimisation problem $\max\{f_1(C), \dots, f_n(C)\}$.
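Pareto-optimality over several orderings can be made concrete with a small sketch. The candidate closures C1 to C4 and their two scores (read here as a frequency-style and a popularity-style ordering) are invented for illustration; only the dominance test itself is standard.

```python
def dominates(a, b):
    """a dominates b if it is at least as good on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores):
    """Return the names of candidates whose score vectors no other candidate dominates."""
    return sorted(
        name for name, vec in scores.items()
        if not any(dominates(other, vec) for other in scores.values())
    )

# Hypothetical candidate closures with scores (f1, f2), both to be maximised.
scores = {
    "C1": (20, 0.2),
    "C2": (5, 0.9),
    "C3": (4, 0.5),   # dominated by C2
    "C4": (20, 0.1),  # dominated by C1
}

print(pareto_front(scores))
```

C1 and C2 each win on one ordering, so neither dominates the other and both remain; choosing between them is where an agreed weighting of the orderings would come in.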
  • 65. PraSem • Project title : Pragmatic Semantics for the Web of Data • Acronym: PraSem • Runtime: Nov 2012-Oct 2016 • Main researcher: Wouter Beek • People involved: Stefan Schlobach, Christophe Gueret, Kathrin Denthler, Pepijn Kroes, Frank van Harmelen, and hopefully more people soon.
  • 66. Deal with Open World Assumption June 3, 2013 IS: Web of Data 66
  • 67. Deal with incompleteness
  • 68. Formalise approximations
  • 69. Take home message • The Web of Data requires semantics • The Web of Data is not a database • The Web of Data is a complex system • Semantics for a database are not (always) suitable for complex systems • We need new semantic paradigms – Voila: Pragmatic Semantics