Managing Completeness of Web Data
Fariz Darari
PhD Supervisor: Werner Nutt
Supported by the project MAGIC, funded by the province of Bolzano
Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 1 / 38
About Us
Research Group
Sorted by distance to Werner’s office :)
Bozen-Bolzano
Motivation
Completeness statements are already there
However...
Completeness statements are available, but only in natural language.
It is unclear what data completeness and query completeness mean.
There are no techniques to check whether data completeness entails query completeness.
Solution Ideas
Completeness statements are available, but only in natural language.
Solution: RDF-ize completeness statements.
It is unclear what data completeness and query completeness mean.
Solution: Formalize data completeness and query completeness.
There are no techniques to check whether data completeness entails query completeness.
Solution: Develop techniques to check whether data completeness entails query completeness.
Solutions
Background: RDF
$G_{rd} = \{ (resDogs, dir, tarantino), (resDogs, act, tarantino) \}$
Background: SPARQL
SELECT: $Q_{sdir} = (\{ ?m \}, \{ (?m, dir, tarantino) \})$
ASK: $Q_{adir} = (\{ \}, \{ (?m, dir, tarantino) \})$
CONSTRUCT: $Q_{cdir} = (\{ (?m, dir, tarantino) \}, \{ (?m, dir, tarantino) \})$
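The three query forms can be mimicked in a few lines of Python. This is only an illustrative sketch, not code from the talk: triples are plain tuples, variables are strings starting with `?`, results use set semantics, and the helper names (`match_bgp`, `select`, `ask`, `construct`) are invented for this example.

```python
def is_var(term):
    return term.startswith("?")

def match_bgp(pattern, graph):
    """Return every variable binding that maps all triple patterns into graph."""
    solutions = [{}]
    for tp in pattern:
        extended = []
        for binding in solutions:
            for triple in graph:
                candidate = dict(binding)
                for p_term, g_term in zip(tp, triple):
                    if is_var(p_term):
                        # Bind the variable, or check it agrees with an earlier binding.
                        if candidate.setdefault(p_term, g_term) != g_term:
                            break
                    elif p_term != g_term:
                        break
                else:
                    extended.append(candidate)
        solutions = extended
    return solutions

def select(variables, pattern, graph):
    # Project each solution onto the distinguished variables.
    return {tuple(b[v] for v in variables) for b in match_bgp(pattern, graph)}

def ask(pattern, graph):
    # True if the pattern has at least one match.
    return bool(match_bgp(pattern, graph))

def construct(template, pattern, graph):
    # Instantiate the template once per solution and collect the triples.
    result = set()
    for b in match_bgp(pattern, graph):
        result |= {tuple(b.get(t, t) for t in tp) for tp in template}
    return result

G_rd = {("resDogs", "dir", "tarantino"), ("resDogs", "act", "tarantino")}
P_dir = [("?m", "dir", "tarantino")]

movies = select(["?m"], P_dir, G_rd)     # SELECT form
exists = ask(P_dir, G_rd)                # ASK form
triples = construct(P_dir, P_dir, G_rd)  # CONSTRUCT form
```

A query is thus a pair of a head (variables, nothing, or a template) and a body pattern, which is the notation used on the slide.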
Story: Incomplete Data Source
An incomplete data source of Reservoir Dogs, $G_{dbp} = (G^a_{dbp}, G^i_{dbp})$:
$G^a_{dbp} = \{ (resDogs, dir, tarantino) \}$
$G^i_{dbp} = \{ (resDogs, dir, tarantino), (resDogs, act, tarantino) \}$
Story: Completeness Statement
From $(G^a_{dbp}, G^i_{dbp})$, we can say that DBpedia is complete for movies directed by Tarantino:
$C_{dir} = Compl((?m, dir, tarantino) \mid \emptyset)$
However, it is not complete for actors in movies directed by Tarantino, so the following statement is not satisfied:
$C_{act} = Compl((?m, act, ?a) \mid (?m, dir, tarantino))$
Story: Query Completeness
Consequently, when we ask for all movies directed by Tarantino over DBpedia:
$Q_{dir} = (\{ ?m \}, \{ (?m, dir, tarantino) \})$
the query completeness $Compl(Q_{dir})$ holds.
However, if we ask for all movies directed by and starring Tarantino:
$Q_{dir+act} = (\{ ?m \}, \{ (?m, dir, tarantino), (?m, act, tarantino) \})$
the query completeness $Compl(Q_{dir+act})$ does not hold.
Incomplete Data Source
Definition (Incomplete Data Source)
An incomplete data source is a pair of graphs $G = (G^a, G^i)$, where $G^a \subseteq G^i$.
We call $G^a$ the available graph and $G^i$ the ideal graph.
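As a rough illustration of this definition, the pair of graphs can be modelled as a small Python class that rejects pairs where the available graph is not contained in the ideal one. The class name `IncompleteDataSource` and the `missing` helper are hypothetical and chosen only for this sketch; triples are plain tuples.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IncompleteDataSource:
    available: frozenset  # G^a: the triples actually stored
    ideal: frozenset      # G^i: the triples that hold in the real world

    def __post_init__(self):
        # The definition requires G^a to be a subset of G^i.
        if not self.available <= self.ideal:
            raise ValueError("the available graph must be a subset of the ideal graph")

    def missing(self):
        # Triples that are true but not stored.
        return self.ideal - self.available

G_dbp = IncompleteDataSource(
    available=frozenset({("resDogs", "dir", "tarantino")}),
    ideal=frozenset({("resDogs", "dir", "tarantino"),
                     ("resDogs", "act", "tarantino")}),
)
```

In practice the ideal graph is never materialised; it is a conceptual device, and the sketch stores it only to make the running example concrete.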
Completeness Statement
Definition (Completeness Statement)
Let $P_1$ be a non-empty basic graph pattern (BGP) and $P_2$ a BGP.
A completeness statement is defined as $Compl(P_1 \mid P_2)$, where we call $P_1$ the pattern and $P_2$ the condition of the statement.
Satisfaction of Completeness Statements
To a statement $C = Compl(P_1 \mid P_2)$, we associate the CONSTRUCT query $Q_C = (P_1, P_1 \cup P_2)$.
Then we say that $C$ is satisfied by an incomplete data source $G = (G^a, G^i)$, written $G \models C$, if
$\llbracket Q_C \rrbracket_{G^i} \subseteq G^a$,
where $\llbracket Q \rrbracket_G$ denotes the evaluation of $Q$ over $G$.
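The satisfaction check can be sketched directly from this definition: evaluate the associated CONSTRUCT query over the ideal graph and test whether the result is contained in the available graph. The code below is a minimal sketch under simplifying assumptions (triples as tuples, variables as strings starting with `?`, a naive matcher); the function names are invented for this example.

```python
def is_var(term):
    return term.startswith("?")

def match_bgp(pattern, graph):
    """Return every variable binding that maps all triple patterns into graph."""
    solutions = [{}]
    for tp in pattern:
        extended = []
        for binding in solutions:
            for triple in graph:
                candidate = dict(binding)
                for p_term, g_term in zip(tp, triple):
                    if is_var(p_term):
                        if candidate.setdefault(p_term, g_term) != g_term:
                            break
                    elif p_term != g_term:
                        break
                else:
                    extended.append(candidate)
        solutions = extended
    return solutions

def construct(template, pattern, graph):
    result = set()
    for b in match_bgp(pattern, graph):
        result |= {tuple(b.get(t, t) for t in tp) for tp in template}
    return result

def satisfies(statement, available, ideal):
    """Check G |= Compl(P1 | P2) via Q_C = (P1, P1 ∪ P2) evaluated over the ideal graph."""
    pattern, condition = statement
    return construct(pattern, list(pattern) + list(condition), ideal) <= available

G_a = {("resDogs", "dir", "tarantino")}
G_i = {("resDogs", "dir", "tarantino"), ("resDogs", "act", "tarantino")}

C_dir = ([("?m", "dir", "tarantino")], [])
C_act = ([("?m", "act", "?a")], [("?m", "dir", "tarantino")])

dir_ok = satisfies(C_dir, G_a, G_i)
act_ok = satisfies(C_act, G_a, G_i)
```

On the running example, `dir_ok` is true and `act_ok` is false, because the acting triple for Reservoir Dogs is in the ideal graph but not in the available one.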
Completeness Statements in RDF
$C_{act} = Compl((?m, act, ?a) \mid (?m, dir, tarantino))$
lv:dataset a void:Dataset;
c:hasComplStmt lv:csAct.
lv:csAct c:hasPattern [c:subject [c:varName "m"];
c:predicate s:actor;
c:object [c:varName "a"]];
c:hasCondition [c:subject [c:varName "m"];
c:predicate s:director;
c:object lmdb:Quentin_Tarantino].
Query Completeness
Definition (Query Completeness)
Let $Q$ be a query. We write $Compl(Q)$ to say that $Q$ is complete.
An incomplete data source $G = (G^a, G^i)$ satisfies $Compl(Q)$, written $G \models Compl(Q)$, if
$\llbracket Q \rrbracket_{G^i} = \llbracket Q \rrbracket_{G^a}$.
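Query completeness can be illustrated the same way: run the query over both graphs and compare the answers. The sketch below assumes set semantics for SELECT results and uses made-up helper names; it is meant only to replay the two Tarantino queries from the story.

```python
def is_var(term):
    return term.startswith("?")

def match_bgp(pattern, graph):
    """Return every variable binding that maps all triple patterns into graph."""
    solutions = [{}]
    for tp in pattern:
        extended = []
        for binding in solutions:
            for triple in graph:
                candidate = dict(binding)
                for p_term, g_term in zip(tp, triple):
                    if is_var(p_term):
                        if candidate.setdefault(p_term, g_term) != g_term:
                            break
                    elif p_term != g_term:
                        break
                else:
                    extended.append(candidate)
        solutions = extended
    return solutions

def select(variables, pattern, graph):
    return {tuple(b[v] for v in variables) for b in match_bgp(pattern, graph)}

def query_complete(query, available, ideal):
    """G |= Compl(Q) iff Q returns the same answers over G^i and G^a."""
    variables, pattern = query
    return select(variables, pattern, ideal) == select(variables, pattern, available)

G_a = {("resDogs", "dir", "tarantino")}
G_i = {("resDogs", "dir", "tarantino"), ("resDogs", "act", "tarantino")}

Q_dir = (["?m"], [("?m", "dir", "tarantino")])
Q_dir_act = (["?m"], [("?m", "dir", "tarantino"), ("?m", "act", "tarantino")])

dir_complete = query_complete(Q_dir, G_a, G_i)
dir_act_complete = query_complete(Q_dir_act, G_a, G_i)
```

This check needs the ideal graph, which is not available in practice; that is why the entailment problem is phrased in terms of completeness statements instead.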
Completeness Entailment
Problem Definition (Completeness Entailment)
Let $\mathcal{C}$ be a set of completeness statements and $Q$ a query.
We say that $\mathcal{C}$ entails the completeness of $Q$, written $\mathcal{C} \models Compl(Q)$, if every incomplete data source satisfying $\mathcal{C}$ also satisfies $Compl(Q)$.
Intuition: Completeness Entailment
Consider the set $\mathcal{C}_{dir,act} = \{ C_{dir}, C_{act} \}$ of completeness statements and the query $Q_{dir+act} = (\{ ?m \}, P_{dir+act})$, where
$P_{dir+act} = \{ (?m, dir, tarantino), (?m, act, tarantino) \}$.
Freezing the variables of $P_{dir+act}$ yields
$\tilde{P}_{dir+act} = \{ (\tilde{m}, dir, tarantino), (\tilde{m}, act, tarantino) \}$.
Therefore,
$\llbracket Q_{C_{dir}} \rrbracket_{\tilde{P}_{dir+act}} \cup \llbracket Q_{C_{act}} \rrbracket_{\tilde{P}_{dir+act}} = \{ (\tilde{m}, dir, tarantino), (\tilde{m}, act, tarantino) \} = \tilde{P}_{dir+act}$.
Thus,
$\mathcal{C}_{dir,act} \models Compl(Q_{dir+act})$.
Prototypical Graph
$\tilde{P}_{dir+act} = \{ (\tilde{m}, dir, tarantino), (\tilde{m}, act, tarantino) \}$
Definition (Prototypical Graph)
Let $Q = (W, P)$ be a query.
The freeze mapping $\widetilde{id}$ is defined as a mapping from each variable $?v$ in $P$ to a new IRI $\tilde{v}$.
Instantiating the graph pattern $P$ with $\widetilde{id}$ yields the graph $\tilde{P} := \widetilde{id}\,P$, which we call the prototypical graph of $Q$.
Transfer Operator
$\llbracket Q_{C_{dir}} \rrbracket_{\tilde{P}_{dir+act}} \cup \llbracket Q_{C_{act}} \rrbracket_{\tilde{P}_{dir+act}}$
Definition (Transfer Operator)
For any set $\mathcal{C}$ of completeness statements and a graph $G$, we define the transfer operator $T_{\mathcal{C}}$, which computes the union of the evaluations over $G$ of the CONSTRUCT queries of all statements in $\mathcal{C}$:
$T_{\mathcal{C}}(G) = \bigcup_{C \in \mathcal{C}} \llbracket Q_C \rrbracket_G$
Completeness Entailment Theorem
$\tilde{P}_{dir+act} = T_{\mathcal{C}_{dir,act}}(\tilde{P}_{dir+act})$
Theorem (Completeness of Basic Queries)
Let $\mathcal{C}$ be a set of completeness statements and $Q = (W, P)$ a basic query. Then,
$\mathcal{C} \models Compl(Q)$ if and only if $\tilde{P} = T_{\mathcal{C}}(\tilde{P})$.
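The theorem turns entailment into a fixpoint test that needs no ideal graph: freeze the query pattern, apply the transfer operator, and compare. The following is a minimal sketch of that test for basic queries, not the authors' implementation. It assumes triples as tuples, variables as strings starting with `?`, and represents the fresh IRI for `?v` by the string `~v` (which must not already occur as a constant); all function names are invented here.

```python
def is_var(term):
    return term.startswith("?")

def match_bgp(pattern, graph):
    """Return every variable binding that maps all triple patterns into graph."""
    solutions = [{}]
    for tp in pattern:
        extended = []
        for binding in solutions:
            for triple in graph:
                candidate = dict(binding)
                for p_term, g_term in zip(tp, triple):
                    if is_var(p_term):
                        if candidate.setdefault(p_term, g_term) != g_term:
                            break
                    elif p_term != g_term:
                        break
                else:
                    extended.append(candidate)
        solutions = extended
    return solutions

def construct(template, pattern, graph):
    result = set()
    for b in match_bgp(pattern, graph):
        result |= {tuple(b.get(t, t) for t in tp) for tp in template}
    return result

def freeze(pattern):
    """Prototypical graph: replace each variable ?v by the fresh constant ~v."""
    return {tuple("~" + t[1:] if is_var(t) else t for t in tp) for tp in pattern}

def transfer(statements, graph):
    """T_C(G): union of the CONSTRUCT queries of all statements, evaluated over G."""
    result = set()
    for pattern, condition in statements:
        result |= construct(pattern, list(pattern) + list(condition), graph)
    return result

def entails(statements, query):
    """C |= Compl(Q) iff the prototypical graph is a fixpoint of T_C."""
    _, pattern = query
    frozen = freeze(pattern)
    return transfer(statements, frozen) == frozen

C_dir = ([("?m", "dir", "tarantino")], [])
C_act = ([("?m", "act", "?a")], [("?m", "dir", "tarantino")])
Q_dir_act = (["?m"], [("?m", "dir", "tarantino"), ("?m", "act", "tarantino")])

both = entails([C_dir, C_act], Q_dir_act)
only_dir = entails([C_dir], Q_dir_act)
```

With both statements the frozen pattern is reproduced in full, so `both` is true; with `C_dir` alone the acting triple is not transferred and `only_dir` is false.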
Query Class: DISTINCT Queries
Give us all Oscar-winning things:
$Q_{awd} = (W_{awd}, P_{awd})^d = (\{ ?m \}, \{ (?m, award, oscar), (?m, award, ?aw) \})^d$
Complete for all Oscar-winning things:
$C_{os} = Compl((?m, award, oscar) \mid \emptyset)$
Does $\{ C_{os} \} \models Compl(Q_{awd})$ hold?
Query Class: OPT Queries
Give us all movies and their awards, if any:
$Q_{maw} = (\{ ?m, ?aw \}, ((?m, a, Movie) \text{ OPT } (?m, award, ?aw)))$
Complete for all movies and their awards:
$C_{aw} = Compl((?m, a, Movie), (?m, award, ?aw) \mid \emptyset)$
Does $\{ C_{aw} \} \models Compl(Q_{maw})$ hold?
Query Class: Queries under RDFS Semantics
Give us all films:
$Q_{film} = (\{ ?m \}, \{ (?m, a, Film) \})$
Complete for all movies:
$C_{movie} = Compl((?m, a, Movie) \mid \emptyset)$
Films are the same as movies:
$S_{fm} = \{ (Film, subclass, Movie), (Movie, subclass, Film) \}$
Does $\{ C_{movie} \} \models Compl(Q_{film})$ hold with respect to $S_{fm}$?
Federated Completeness Statements
Timestamped Completeness Statements
Conclusions
Completeness statements can now be represented in RDF.
We know how completeness statements can entail query completeness for different query classes and under different settings of completeness statements.
Future Work
Completeness statements for queries with negation
Completeness statements as session annotations for RDF streams
Statistical completeness reasoning
Publications
Fariz Darari, Werner Nutt, Giuseppe Pirrò, Simon Razniewski: Completeness Statements about RDF Data Sources and Their Use for Query Answering. ISWC 2013.
Fariz Darari, Radityo Eko Prasojo, Werner Nutt: CORNER: A Completeness Reasoner for SPARQL Queries Over RDF Data Sources. ESWC Posters and Demos 2014.
Fariz Darari, Simon Razniewski, Werner Nutt: Bridging the Semantic Gap between RDF and SPARQL using Completeness Statements. ISWC Posters and Demos 2014.
Fariz Darari, Radityo Eko Prasojo, Werner Nutt: Expressing No-Value Information in RDF. ISWC Posters and Demos 2015.
The latest results (timestamped statements and efficient completeness reasoning with 1 million statements) have been submitted to a journal.
$Compl((myDaSePresentation, slide, ?s) \mid \emptyset)$
Thank You!

Managing Completeness of Web Data

  • 1. Managing Completeness of Web Data Fariz Darari PhD Supervisor: Werner Nutt Supported by the project MAGIC, funded by the province of Bolzano Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 1 / 38
  • 2. About Us Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 2 / 38
  • 3. Research Group Sorted by distance to Werner’s office :) Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 3 / 38
  • 4. Bozen-Bolzano Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 4 / 38
  • 5. Motivation Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 5 / 38
  • 6. Completeness statements are already there Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 6 / 38
  • 7. However . . . Completeness statements are available but only in natural language Unclear what data completeness & query completeness mean No techniques to check whether data completeness entails query completeness Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 7 / 38
  • 8. Solution Ideas Completeness statements are available but only in natural language Solution: RDF-ize completeness statements Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 8 / 38
  • 9. Solution Ideas Completeness statements are available but only in natural language Solution: RDF-ize completeness statements Unclear what data completeness & query completeness mean Solution: Formalize data completeness & query completeness Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 8 / 38
  • 10. Solution Ideas Completeness statements are available but only in natural language Solution: RDF-ize completeness statements Unclear what data completeness & query completeness mean Solution: Formalize data completeness & query completeness No techniques to check whether data completeness entails query completeness Solution: Develop techniques to check whether data completeness entails query completeness Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 8 / 38
  • 11. Solutions Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 9 / 38
  • 12. Background: RDF Grd = { (resDogs, dir, tarantino), (resDogs, act, tarantino) } Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 10 / 38
  • 13. Background: SPARQL SELECT Qsdir = ({ ?m }, { (?m, dir, tarantino) }) ASK Qadir = ({ }, { (?m, dir, tarantino) }) CONSTRUCT Qcdir = ({ (?m, dir, tarantino) }, { (?m, dir, tarantino) }) Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 11 / 38
  • 14. Story: Incomplete Data Source An incomplete data source of Reservoir Dogs, Gdbp = (Ga dbp, Gi dbp): Ga dbp = {(resDogs, dir, tarantino)} Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 12 / 38
  • 15. Story: Incomplete Data Source An incomplete data source of Reservoir Dogs, Gdbp = (Ga dbp, Gi dbp): Gi dbp = {(resDogs, dir, tarantino), (resDogs, act, tarantino)} Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 13 / 38
  • 16. Story: Completeness Statement Ga dbp = {(resDogs, dir, tarantino)} Gi dbp = {(resDogs, dir, tarantino), (resDogs, act, tarantino)} From (Ga dbp, Gi dbp), we can say that DBpedia is complete for movies directed by Tarantino: Cdir = Compl((?m, dir, tarantino) | ∅) Fariz Darari (unibz) Managing Completeness of Web Data Oct 20, 2015 14 / 38
  • 17. Story: Completeness Statement. $G^a_{dbp} = \{ (\mathit{resDogs}, \mathit{dir}, \mathit{tarantino}) \}$, $G^i_{dbp} = \{ (\mathit{resDogs}, \mathit{dir}, \mathit{tarantino}), (\mathit{resDogs}, \mathit{act}, \mathit{tarantino}) \}$. From $(G^a_{dbp}, G^i_{dbp})$, we can say that DBpedia is complete for movies directed by Tarantino: $C_{dir} = \mathit{Compl}((?m, \mathit{dir}, \mathit{tarantino}) \mid \emptyset)$. However, it is not complete for actors in movies directed by Tarantino: $C_{act} = \mathit{Compl}((?m, \mathit{act}, ?a) \mid (?m, \mathit{dir}, \mathit{tarantino}))$.
  • 18. Story: Query Completeness. $G^a_{dbp} = \{ (\mathit{resDogs}, \mathit{dir}, \mathit{tarantino}) \}$, $G^i_{dbp} = \{ (\mathit{resDogs}, \mathit{dir}, \mathit{tarantino}), (\mathit{resDogs}, \mathit{act}, \mathit{tarantino}) \}$. Consequently, when we ask for all movies directed by Tarantino over DBpedia, $Q_{dir} = (\{ ?m \}, \{ (?m, \mathit{dir}, \mathit{tarantino}) \})$, the query completeness $\mathit{Compl}(Q_{dir})$ is obtained.
  • 19. Story: Query Completeness. $G^a_{dbp} = \{ (\mathit{resDogs}, \mathit{dir}, \mathit{tarantino}) \}$, $G^i_{dbp} = \{ (\mathit{resDogs}, \mathit{dir}, \mathit{tarantino}), (\mathit{resDogs}, \mathit{act}, \mathit{tarantino}) \}$. However, if we ask for all movies directed by and starring Tarantino, $Q_{dir+act} = (\{ ?m \}, \{ (?m, \mathit{dir}, \mathit{tarantino}), (?m, \mathit{act}, \mathit{tarantino}) \})$, the query completeness $\mathit{Compl}(Q_{dir+act})$ is not obtained.
  • 20. Incomplete Data Source. Definition (Incomplete Data Source): An incomplete data source is a pair of two graphs $G = (G^a, G^i)$, where $G^a \subseteq G^i$. We call $G^a$ the available graph and $G^i$ the ideal graph.
  • 21. Completeness Statement. Definition (Completeness Statement): Let $P_1$ be a non-empty BGP and $P_2$ a BGP. A completeness statement is defined as $\mathit{Compl}(P_1 \mid P_2)$, where we call $P_1$ the pattern and $P_2$ the condition of the statement.
  • 22. Satisfaction of Completeness Statements. To a statement $C = \mathit{Compl}(P_1 \mid P_2)$, we associate the CONSTRUCT query $Q_C = (P_1, P_1 \cup P_2)$. Then, we say: $C$ is satisfied by an incomplete data source $G = (G^a, G^i)$, written $G \models C$, if $[\![Q_C]\!]_{G^i} \subseteq G^a$.
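The satisfaction condition can be checked mechanically. The following Python sketch assumes triples are tuples of strings, variables start with `?`, and a statement $\mathit{Compl}(P_1 \mid P_2)$ is encoded as a pair of lists `(P1, P2)`; the function names are hypothetical and chosen only for this illustration.

```python
def is_var(term):
    """A term is a variable if it starts with '?' (assumed convention)."""
    return term.startswith("?")

def match_bgp(pattern, graph):
    """Return every mapping (dict) that sends all triple patterns into graph."""
    solutions = [{}]
    for tp in pattern:
        extended = []
        for mu in solutions:
            for triple in graph:
                nu = dict(mu)
                if all(nu.setdefault(p, v) == v if is_var(p) else p == v
                       for p, v in zip(tp, triple)):
                    extended.append(nu)
        solutions = extended
    return solutions

def construct(template, pattern, graph):
    """Evaluate the CONSTRUCT query (template, pattern) over graph."""
    return {tuple(mu.get(t, t) for t in tp)
            for mu in match_bgp(pattern, graph) for tp in template}

def satisfies(statement, available, ideal):
    """G |= C: the CONSTRUCT query Q_C = (P1, P1 u P2), evaluated over the
    ideal graph, must return only triples present in the available graph."""
    pattern, condition = statement
    return construct(pattern, pattern + condition, ideal) <= available

# The DBpedia story from the slides.
G_a = {("resDogs", "dir", "tarantino")}
G_i = {("resDogs", "dir", "tarantino"), ("resDogs", "act", "tarantino")}

C_dir = ([("?m", "dir", "tarantino")], [])
C_act = ([("?m", "act", "?a")], [("?m", "dir", "tarantino")])

dir_ok = satisfies(C_dir, G_a, G_i)
act_ok = satisfies(C_act, G_a, G_i)
```

With these inputs, `dir_ok` is true and `act_ok` is false, which matches the story: the source is complete for movies directed by Tarantino but not for the actors in them.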
  • 23. Completeness Statements in RDF. $C_{act} = \mathit{Compl}((?m, \mathit{act}, ?a) \mid (?m, \mathit{dir}, \mathit{tarantino}))$: lv:dataset a void:Dataset; c:hasComplStmt lv:csAct. lv:csAct c:hasPattern [c:subject [c:varName "m"]; c:predicate s:actor; c:object [c:varName "a"]]; c:hasCondition [c:subject [c:varName "m"]; c:predicate s:director; c:object lmdb:Quentin_Tarantino].
  • 24. Query Completeness. Definition (Query Completeness): Let $Q$ be a query. We write $\mathit{Compl}(Q)$ to say that $Q$ is complete. An incomplete data source $G = (G^a, G^i)$ satisfies $\mathit{Compl}(Q)$, written $G \models \mathit{Compl}(Q)$, if $[\![Q]\!]_{G^i} = [\![Q]\!]_{G^a}$.
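Query completeness compares the answers over the ideal and the available graph. The Python sketch below is an illustration under assumptions: triples are tuples of strings, variables start with `?`, answers are compared as bags (via `collections.Counter`), and the helper names are hypothetical. Comparing sets instead of bags would give the DISTINCT reading.

```python
from collections import Counter

def is_var(term):
    """A term is a variable if it starts with '?' (assumed convention)."""
    return term.startswith("?")

def match_bgp(pattern, graph):
    """Return every mapping (dict) that sends all triple patterns into graph."""
    solutions = [{}]
    for tp in pattern:
        extended = []
        for mu in solutions:
            for triple in graph:
                nu = dict(mu)
                if all(nu.setdefault(p, v) == v if is_var(p) else p == v
                       for p, v in zip(tp, triple)):
                    extended.append(nu)
        solutions = extended
    return solutions

def evaluate(query, graph):
    """Evaluate Q = (W, P) over graph, returning a bag of answer tuples."""
    variables, pattern = query
    return Counter(tuple(mu[v] for v in variables)
                   for mu in match_bgp(pattern, graph))

def is_complete(query, available, ideal):
    """G |= Compl(Q): the answers over G^i and G^a coincide."""
    return evaluate(query, ideal) == evaluate(query, available)

G_a = {("resDogs", "dir", "tarantino")}
G_i = {("resDogs", "dir", "tarantino"), ("resDogs", "act", "tarantino")}

Q_dir = (["?m"], [("?m", "dir", "tarantino")])
Q_dir_act = (["?m"], [("?m", "dir", "tarantino"), ("?m", "act", "tarantino")])

dir_complete = is_complete(Q_dir, G_a, G_i)
dir_act_complete = is_complete(Q_dir_act, G_a, G_i)
```

On the story's data, `dir_complete` is true and `dir_act_complete` is false, because the acting triple is missing from the available graph.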
  • 25. Completeness Entailment Problem. Definition (Completeness Entailment): Let $\mathcal{C}$ be a set of completeness statements and $Q$ a query. We say that $\mathcal{C}$ entails the completeness of $Q$, written $\mathcal{C} \models \mathit{Compl}(Q)$, if any incomplete data source satisfying $\mathcal{C}$ also satisfies $\mathit{Compl}(Q)$.
  • 26. Intuition: Completeness Entailment. Consider the set $\mathcal{C}_{dir,act} = \{ C_{dir}, C_{act} \}$ of completeness statements and the query $Q_{dir+act} = (\{ ?m \}, P_{dir+act})$, where $P_{dir+act} = \{ (?m, \mathit{dir}, \mathit{tarantino}), (?m, \mathit{act}, \mathit{tarantino}) \}$.
  • 30. Intuition: Completeness Entailment. Consider the set $\mathcal{C}_{dir,act} = \{ C_{dir}, C_{act} \}$ of completeness statements and the query $Q_{dir+act} = (\{ ?m \}, P_{dir+act})$. $\tilde{P}_{dir+act} = \{ (\tilde{m}, \mathit{dir}, \mathit{tarantino}), (\tilde{m}, \mathit{act}, \mathit{tarantino}) \}$. Therefore, $[\![Q_{C_{dir}}]\!]_{\tilde{P}_{dir+act}} \cup [\![Q_{C_{act}}]\!]_{\tilde{P}_{dir+act}} = \{ (\tilde{m}, \mathit{dir}, \mathit{tarantino}), (\tilde{m}, \mathit{act}, \mathit{tarantino}) \} = \tilde{P}_{dir+act}$. Thus, $\mathcal{C}_{dir,act} \models \mathit{Compl}(Q_{dir+act})$.
  • 31. Prototypical Graph. $\tilde{P}_{dir+act} = \{ (\tilde{m}, \mathit{dir}, \mathit{tarantino}), (\tilde{m}, \mathit{act}, \mathit{tarantino}) \}$. Definition (Prototypical Graph): Let $Q = (W, P)$ be a query. The freeze mapping $\tilde{id}$ is defined as a mapping from each variable $?v$ in $P$ to a new IRI $\tilde{v}$. Instantiating the graph pattern $P$ with $\tilde{id}$ yields the graph $\tilde{P} := \tilde{id}\,P$, which we call the prototypical graph of $Q$.
  • 32. Transfer Operator. $[\![Q_{C_{dir}}]\!]_{\tilde{P}_{dir+act}} \cup [\![Q_{C_{act}}]\!]_{\tilde{P}_{dir+act}}$. Definition (Transfer Operator): For any set $\mathcal{C}$ of completeness statements and a graph $G$, we define the transfer operator $T_{\mathcal{C}}$ that computes the union of the evaluation over $G$ of all CONSTRUCT queries of the statements in $\mathcal{C}$: $T_{\mathcal{C}}(G) = \bigcup_{C \in \mathcal{C}} [\![Q_C]\!]_G$.
  • 33. Completeness Entailment Theorem. $\tilde{P}_{dir+act} = T_{\mathcal{C}_{dir,act}}(\tilde{P}_{dir+act})$. Theorem (Completeness of Basic Queries): Let $\mathcal{C}$ be a set of completeness statements and $Q = (W, P)$ a basic query. Then, $\mathcal{C} \models \mathit{Compl}(Q)$ if and only if $\tilde{P} = T_{\mathcal{C}}(\tilde{P})$.
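The theorem turns entailment checking into a computation on the prototypical graph. The Python sketch below illustrates this under assumptions: triples are tuples of strings, variables start with `?`, a frozen variable `?v` is written `~v`, a statement is a pair of lists `(P1, P2)`, and all function names are hypothetical. It is not the authors' reasoner.

```python
def is_var(term):
    """A term is a variable if it starts with '?' (assumed convention)."""
    return term.startswith("?")

def match_bgp(pattern, graph):
    """Return every mapping (dict) that sends all triple patterns into graph."""
    solutions = [{}]
    for tp in pattern:
        extended = []
        for mu in solutions:
            for triple in graph:
                nu = dict(mu)
                if all(nu.setdefault(p, v) == v if is_var(p) else p == v
                       for p, v in zip(tp, triple)):
                    extended.append(nu)
        solutions = extended
    return solutions

def construct(template, pattern, graph):
    """Evaluate the CONSTRUCT query (template, pattern) over graph."""
    return {tuple(mu.get(t, t) for t in tp)
            for mu in match_bgp(pattern, graph) for tp in template}

def freeze(pattern):
    """Prototypical graph: replace each variable ?v by a fresh constant ~v."""
    return {tuple("~" + t[1:] if is_var(t) else t for t in tp)
            for tp in pattern}

def transfer(statements, graph):
    """Transfer operator T_C(G): union of all Q_C evaluated over G."""
    result = set()
    for pattern, condition in statements:
        result |= construct(pattern, pattern + condition, graph)
    return result

def entails_completeness(statements, query_pattern):
    """C |= Compl(Q) for a basic query Q = (W, P) iff ~P = T_C(~P)."""
    frozen = freeze(query_pattern)
    return transfer(statements, frozen) == frozen

C_dir = ([("?m", "dir", "tarantino")], [])
C_act = ([("?m", "act", "?a")], [("?m", "dir", "tarantino")])
P_dir_act = [("?m", "dir", "tarantino"), ("?m", "act", "tarantino")]

both = entails_completeness([C_dir, C_act], P_dir_act)
only_dir = entails_completeness([C_dir], P_dir_act)
```

With both statements the check succeeds, as in the worked intuition; with $C_{dir}$ alone the acting triple is not transferred, so the check fails. The projection $W$ plays no role in this test.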
  • 34. Query Class: DISTINCT Queries. Give us all Oscar-winning things: $Q_{awd} = (W_{awd}, P_{awd})^d = (\{ ?m \}, \{ (?m, \mathit{award}, \mathit{oscar}), (?m, \mathit{award}, ?aw) \})^d$. Complete for all Oscar-winning things: $C_{os} = \mathit{Compl}((?m, \mathit{award}, \mathit{oscar}) \mid \emptyset)$. Does $\{ C_{os} \} \models \mathit{Compl}(Q_{awd})$ hold?
  • 35. Query Class: OPT Queries. Give us all movies, and their awards, if any: $Q_{maw} = (\{ ?m, ?aw \}, ((?m, a, \mathit{Movie}) \text{ OPT } (?m, \mathit{award}, ?aw)))$. Complete for all movies and their awards: $C_{aw} = \mathit{Compl}((?m, a, \mathit{Movie}), (?m, \mathit{award}, ?aw) \mid \emptyset)$. Does $\{ C_{aw} \} \models \mathit{Compl}(Q_{maw})$ hold?
  • 36. Query Class: Queries under RDFS Semantics. Give us all films: $Q_{film} = (\{ ?m \}, \{ (?m, a, \mathit{Film}) \})$. Complete for all movies: $C_{movie} = \mathit{Compl}((?m, a, \mathit{Movie}) \mid \emptyset)$. Films are the same as movies: $S_{fm} = \{ (\mathit{Film}, \mathit{subclass}, \mathit{Movie}), (\mathit{Movie}, \mathit{subclass}, \mathit{Film}) \}$. Does $\{ C_{movie} \} \models \mathit{Compl}(Q_{film})$ hold with respect to $S_{fm}$?
  • 37. Federated Completeness Statements
  • 38. Timestamped Completeness Statements
  • 40. Conclusions. Completeness statements can now be represented in RDF. We know how completeness statements can entail query completeness in different query classes and different settings of completeness statements.
  • 41. Future Work. Completeness statements for queries with negation; completeness statements as session annotations for RDF streams; statistical completeness reasoning.
  • 42. Publications. Fariz Darari, Werner Nutt, Giuseppe Pirrò, Simon Razniewski: Completeness Statements about RDF Data Sources and Their Use for Query Answering. ISWC 2013. Fariz Darari, Radityo Eko Prasojo, Werner Nutt: CORNER: A Completeness Reasoner for SPARQL Queries Over RDF Data Sources. ESWC Posters and Demos 2014. Fariz Darari, Simon Razniewski, Werner Nutt: Bridging the Semantic Gap between RDF and SPARQL using Completeness Statements. ISWC Posters and Demos 2014. Fariz Darari, Radityo Eko Prasojo, Werner Nutt: Expressing No-Value Information in RDF. ISWC Posters and Demos 2015. The latest results (timestamped statements and efficient completeness reasoning with 1 million statements) have been submitted to a journal.
  • 43. $\mathit{Compl}((\mathit{myDaSePresentation}, \mathit{slide}, ?s) \mid \emptyset)$. Thank You!