A Role for Provenance in Quality
Assessment


Chris Baillie, Pete Edwards, and Edoardo Pignotti
c.baillie@abdn.ac.uk
Overview

• Motivation

• Evaluating Data Quality

• A Role for Provenance

• Future work




Motivation

• “we don’t know whether the information we find [on the Web]
  is accurate or not. So we have to teach people how to assess
  what they’ve found”
       Vint Cerf, 2010


• The Web of documents has become the Web of documents,
  services, data, and people.

• Anyone can publish anything, so we need a way to evaluate
  quality.

• We are investigating these issues within the Internet of Things
    - Sensors are now at the centre of many applications


Example Scenario




Evaluating Data Quality
Entity (and context)
- To evaluate quality, we must examine the context around data

Quality Scores
- Quality is a multi-dimensional construct:
    - Accuracy
    - Timeliness
    - Relevance

F(E, R) = Q

WIQA Framework
- Examines data content, context, and external ratings (Bizer et al. 2009)

Data Requirements
- Furber and Hepp (2011) use rules to identify quality problems
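
One way to read F(E, R) = Q is as a function that applies a set of data requirements R to an entity E (with its context) and returns one quality score Q per dimension. The following is a minimal sketch under that reading, not the framework's actual implementation. The `Observation` fields, the clamping to [0, 1], and the thresholds 100 and 300 are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Observation:
    """Hypothetical entity E: a reading plus the context needed to judge it."""
    distance_from_route: float  # assumed context value
    age_seconds: float          # assumed context value: time since measurement


# A requirement (one element of R) maps an entity to a raw score.
Requirement = Callable[[Observation], float]


def clamp(value: float) -> float:
    """Keep a score inside [0, 1]; this range is an assumption of the sketch."""
    return max(0.0, min(1.0, value))


def assess(entity: Observation, requirements: Dict[str, Requirement]) -> Dict[str, float]:
    """F(E, R) = Q: apply each requirement to the entity, one score per dimension."""
    return {name: clamp(rule(entity)) for name, rule in requirements.items()}


# Illustrative requirements; the thresholds 100 and 300 are placeholders.
REQUIREMENTS: Dict[str, Requirement] = {
    "relevance": lambda o: 1 - o.distance_from_route / 100,
    "timeliness": lambda o: 1 - o.age_seconds / 300,
}

scores = assess(Observation(distance_from_route=25.0, age_seconds=60.0), REQUIREMENTS)
print(scores)
```

Keeping each dimension as a separate named rule means a new dimension, such as accuracy, can be added without touching `assess`.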




Representing Sensor Observations


• Linked Data: “recommended best practice for exposing,
  sharing, and connecting pieces of data using URIs and RDF”
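
To make the idea of describing an observation with URIs and RDF concrete, here is a small sketch that builds subject-predicate-object triples for a single observation and prints them in N-Triples form. The `http://example.org/` namespace, the property names `observedBy` and `hasValue`, and the sample identifiers and value are hypothetical and do not come from the slides; only the `rdf:type` URI is standard.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # placeholder namespace, not from the slides


def describe_observation(obs_id: str, sensor_id: str, value: str) -> List[Triple]:
    """Return triples for one observation; property names are hypothetical."""
    obs = EX + "observation/" + obs_id
    return [
        (obs, RDF_TYPE, EX + "Observation"),
        (obs, EX + "observedBy", EX + "sensor/" + sensor_id),
        (obs, EX + "hasValue", value),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialise to N-Triples: URIs in angle brackets, anything else as a plain literal.

    Literal escaping is not handled, so values must not contain quotes or backslashes.
    """
    lines = []
    for s, p, o in triples:
        obj = f"<{o}>" if o.startswith("http://") else '"' + o + '"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


triples = describe_observation("1", "iphone-42", "57.15,-2.09")
print(to_ntriples(triples))
```

A real deployment would more likely use an RDF library and an established vocabulary for sensor data; plain tuples are used here only to keep the sketch self-contained.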




Performing Quality Assessment




R_relevance = 1 - (E distanceFromRoute X) / 100

    CONSTRUCT {
      _:b0 a QualityScore .
      _:b0 score ?qs .
      _:b0 dqm:ruleViolation _:b1 .
      _:b1 a DataRequirementViolation .
      _:b1 dqm:affectedInstance ?instance .
    } WHERE {
      ?instance a Observation .
      ?instance distanceFromRoute ?distance .
      LET (?qs := (1 - (?distance / 100))) .
    }
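
The relevance rule above can be mirrored outside SPARQL to check what it produces. The sketch below is an illustration only: it takes a mapping from hypothetical observation identifiers to their distance from the route and returns one score record per observation, linked back to the affected instance as the CONSTRUCT template does. The dictionary keys are assumed names, not the dqm vocabulary.

```python
from typing import Dict, List


def relevance_rule(distances: Dict[str, float]) -> List[dict]:
    """One QualityScore per observation, with score = 1 - distance / 100."""
    results = []
    for instance, distance in distances.items():
        results.append({
            "type": "QualityScore",
            "score": 1 - distance / 100,   # same expression as the LET clause
            "affectedInstance": instance,  # which observation the score describes
        })
    return results


# Hypothetical observations: on the route, half way out, and beyond the threshold.
scores = relevance_rule({"obs1": 0.0, "obs2": 50.0, "obs3": 150.0})
for record in scores:
    print(record["affectedInstance"], record["score"])
```

As written, the expression yields a negative score for any distance above 100, so a consumer of these scores may want to clamp them to a fixed range.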



Quality Assessment Results




Observation Provenance
• Provenance is a critical part of observation context

• Describes the entities, agents, and activities involved in
  data creation:
    - How was the observation value measured?
    - Who controlled the sensing process?
    - How has the observation been transformed since it was
      created?


• The W3C PROV-O model provides a linked data representation
  of provenance
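
The three questions above can be answered by walking a graph of provenance relations. The sketch below stores the relations from the deck's example graph as plain triples and follows `wasGeneratedBy` and `used` links back to the original sources. The property names are PROV-O terms; attaching `wasAssociatedWith` to the sensing process is an assumption about the diagram's layout, and the helper names `objects` and `lineage` are hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Relations taken from the deck's example provenance graph.
PROVENANCE: List[Triple] = [
    ("Observation 2", "wasGeneratedBy", "Map matching"),
    ("Map matching", "used", "Observation 1"),
    ("Observation 1", "wasGeneratedBy", "Sensing Process"),
    ("Sensing Process", "wasAssociatedWith", "Chris"),  # assumed attachment
    ("Sensing Process", "used", "iPhoneSensor"),
]


def objects(subject: str, predicate: str) -> List[str]:
    """All objects linked from `subject` by `predicate`."""
    return [o for s, p, o in PROVENANCE if s == subject and p == predicate]


def lineage(entity: str) -> List[str]:
    """Follow wasGeneratedBy/used links from an entity back to its sources."""
    chain: List[str] = []
    for activity in objects(entity, "wasGeneratedBy"):
        chain.append(activity)
        for source in objects(activity, "used"):
            chain.append(source)
            chain.extend(lineage(source))
    return chain


# How has the observation been transformed, and how was it measured?
print(lineage("Observation 2"))
# Who controlled the sensing process?
print(objects("Sensing Process", "wasAssociatedWith"))
```

The recursion assumes the graph is acyclic, which holds for this example because each entity is generated by an activity that only uses earlier entities.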
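Observation Provenance
[Figure: provenance graph]
- Entity "Observation 2" wasGeneratedBy Activity "Map matching"
- Activity "Map matching" used Entity "Observation 1"
- Entity "Observation 1" wasGeneratedBy Activity "Sensing Process"
- Activity "Sensing Process" wasAssociatedWith Agent "Chris"
- Activity "Sensing Process" used Entity "iPhoneSensor"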
Quality Score Provenance
Work To Date
• Developed a Quality Assessment Framework that enables:
    - Linked data representation of sensor observations
    - Definition of quality requirements using SPARQL rules
    - Generation of quality scores via reasoning



Future Work
• Implementation of quality rules that examine provenance
• Investigation of quality score re-use
Any questions?




Come and see the IRP demo (D9) to see quality assessment in action.
Implementation
[Figure: system architecture]
- Observation Triple Store
- Reasoner (SPIN)
- Quality Rules: Relevance Rule, Timeliness Rule, Accuracy Rule, Availability Rule
- Apache Tomcat: Observation Service, Quality Service


Editor's Notes

  1. In this talk I will outline why the need for quality assessment exists, describe how quality is perceived, outline our approach to quality assessment, provide an example scenario, and outline our future work.
  2. We don’t know whether information is accurate, so we need to assess it. The Web has evolved and is an open platform. The Web is big, so we need a smaller platform for evaluation.
  3. Consider mobile phones providing passenger information regarding the location of buses. Sometimes we get lucky and observations land right on the bus route. However, there are many different sources of low-quality data: inaccurate GPS readings; malicious users, such as someone playing with the app while at home; and people who make mistakes, such as someone on the wrong bus.
  4. Animate this: ObservationValue -> [Motivate SSN here] Observation + foi -> disruption report
  5. DataRequirement1 -> wasAttributedTo -> Agent