> LOP – Capturing and Linking
Open Provenance on LOD Cycle
Rogers R. de Mendonça, Jonas F. S. M. De La Cerda, Kelli F. de Cordeiro
Sérgio M. S. da Cruz, Maria Cláudia Cavalcanti, Maria Luiza M. Campos
5th International Workshop on
Semantic Web Information Management
SWIM 2013
New York, USA – June 23, 2013
>Outline
Introduction
– Provenance
– Linked Open Data Lifecycle
An Approach for Linked Open Provenance Capture
– Data Preparation and Transformation Process
– Data Interlinking Process
– Linked Open Provenance Architecture
– Usage Scenario
Conclusion
– Contributions
– Future Works
>Increase of the Web of Data
What about
data reliability and quality?
>Provenance
Information about the history of the data:
– Where did the data come from?
– Who designed the publishing process?
– Who executed the publishing process?
– Which operations were applied to the data?
Importance to the Web of Data:
– Support quality and reliability assessment of the
published data
>Semantic Web Stack
[Figure: W3C® Semantic Web Stack, annotated with Provenance]
>Linked Open Provenance (LOP)
Provenance data available according to LOD principles:
1. Use URIs as names for things
2. Use HTTP URIs, so that people can look up those
names
3. When someone looks up a URI, provide useful
information, using the standards (RDF, SPARQL)
4. Include links to other URIs, so that they can discover
more things
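As a minimal sketch of what these principles mean for provenance, the Python snippet below names a data row and the process that generated it with HTTP URIs and serializes the statements as N-Triples. The `http://example.org/provenance/` base URI and the function names are hypothetical, and attaching `dcterms:title` directly to the process is a simplification for illustration.

```python
# Hypothetical base URI; a real deployment would use a domain it controls.
BASE = "http://example.org/provenance/"
OPMV = "http://purl.org/net/opmv/ns#"
DCTERMS = "http://purl.org/dc/terms/"


def uri(value):
    """Wrap an absolute URI in N-Triples angle brackets."""
    return f"<{value}>"


def literal(text):
    """Quote a plain literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def provenance_triples(row_id, process_id, process_title):
    """Describe which process generated a row, using HTTP URIs as names."""
    row = uri(BASE + "row/" + row_id)
    process = uri(BASE + "process/" + process_id)
    return [
        (row, uri(OPMV + "wasGeneratedBy"), process),
        (process, uri(DCTERMS + "title"), literal(process_title)),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


print(to_ntriples(provenance_triples(
    "42", "merge-1", "Merge CNPq Research Groups x Lattes Projects")))
```

Because both the row and the process are HTTP URIs, other datasets can link to them, which is what principle 4 asks for.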
>Related Works
Ontologies / Vocabularies
– PROV-O (PROV-DM)
http://www.w3.org/TR/prov-o/
– OPMV (OPM)
http://purl.org/net/opmv/ns
– Cogs (ETL)
http://vocab.deri.ie/cogs
– Dublin Core Metadata Terms, FOAF
>Related Works
Use of provenance to support quality and reliability
assessment of published data
– Provenance Information in the Web of Data (HARTIG,
2009)
– Managing the life-cycle of linked data with the LOD2
stack. (AUER et al, 2012)
– Linked Data Quality Assessment and Fusion
(MENDES et al, 2012)
Focus on metadata about the source and access of the
data
>Linked Open Data Lifecycle
[Figure: LOD2 lifecycle with the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration]
>Quality Phase
[Figure: LOD2 lifecycle with the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration]
– Quality: Quality assessment
>Interlinking Phase
[Figure: LOD2 lifecycle with the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration]
– Interlinking: Create and maintain links
– Quality: Quality assessment
>Extraction Phase
[Figure: LOD2 lifecycle with the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration]
– Extraction: Extract and triplify data
– Interlinking: Create and maintain links
– Quality: Quality assessment
>Extension of Extraction Phase
[Figure: LOD2 lifecycle with the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration, plus a Preparation step added to Extraction]
– Interlinking: Create and maintain links
– Quality: Quality assessment
>Extension: Preparation Before Triplification
[Figure: LOD2 lifecycle with the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration, plus a Preparation step added to Extraction]
– Extraction and Preparation: Extract, prepare and triplify data
– Interlinking: Create and maintain links
– Quality: Quality assessment
>Data Publishing and Interlinking Process
Extraction Phase
>Data Preparation and Transformation Process
[Figure: Data Preparation and Transformation Process, from Heterogeneous Data Sources through the steps Extract, Clean, Conform, Pre-Integrate, Triplify]
ETL (Extraction-Transformation-Loading) approach:
– Foundation of DW systems
– Its techniques and tools have been developed and
refined over many years in challenging BI scenarios
– It is very advantageous to inherit the potential of
these techniques and tools to publish LOD and LOP
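The five steps named in the figure can be sketched as a chain of small functions. This is only an illustration in plain Python, not the Kettle workflow used by the authors; the record fields (`id`, `name`), the cleaning rules and the `http://example.org/resource/` base URI are assumptions.

```python
def extract(sources):
    """Gather raw records from every (already loaded) source."""
    return [dict(record) for source in sources for record in source]


def clean(records):
    """Strip whitespace from strings and drop records with an empty name."""
    cleaned = []
    for r in records:
        r = {k: v.strip() if isinstance(v, str) else v for k, v in r.items()}
        if r.get("name"):
            cleaned.append(r)
    return cleaned


def conform(records):
    """Normalise values to one convention (here: lower-case identifiers)."""
    return [{**r, "id": r["id"].lower()} for r in records]


def pre_integrate(records):
    """Merge records that share an identifier into a single record."""
    merged = {}
    for r in records:
        merged.setdefault(r["id"], {}).update(r)
    return list(merged.values())


def triplify(records, base="http://example.org/resource/"):
    """Emit one (subject, predicate, object) tuple per non-id field."""
    return [(base + r["id"], key, value)
            for r in records for key, value in r.items() if key != "id"]


def run_pipeline(sources):
    """Run Extract, Clean, Conform, Pre-Integrate and Triplify in order."""
    data = extract(sources)
    for step in (clean, conform, pre_integrate):
        data = step(data)
    return triplify(data)


sources = [
    [{"id": "G1", "name": " GRECO "}],
    [{"id": "g1", "name": "GRECO", "area": "KE"}, {"id": "x", "name": "  "}],
]
print(run_pipeline(sources))
```

Each function takes parameters and produces intermediate results, which is where provenance can be captured.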
>Data Preparation and Transformation Process
[Figure: Data Preparation and Transformation Process, from Heterogeneous Data Sources through the steps Extract, Clean, Conform, Pre-Integrate, Triplify]
Use of a workflow to have:
– Systematization of the publishing process
– Monitoring and management of the several tasks
– Facilities for reusing the process
Pentaho Data Integration (a.k.a. Kettle)
– Open source, large community of users, extensible
>Data Publishing and Interlinking Process
Extraction Phase
Interlinking Phase
>Data Interlinking Process
[Figure: Data Interlinking Process with the components Web Data Access, Schema Mappings, Identity Resolution, Quality Evaluator]
– Web Data Access: extracts data from its original sources
– Schema Mappings: matches corresponding terms of
multiple vocabularies
– Identity Resolution: finds and links similar resources on
different datasets
– Quality Evaluator: evaluates data quality based on a set
of rules
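To make the identity resolution step concrete, the sketch below links resources from two datasets whose labels are similar enough. It uses Python's standard `difflib` as a stand-in for a real link discovery tool; the threshold value, the example URIs and the choice of `owl:sameAs` as the link predicate are assumptions.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity in the range 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_resources(dataset_a, dataset_b, threshold=0.9):
    """Return (uri_a, predicate, uri_b, score) for every pair of labels
    whose similarity reaches the threshold."""
    links = []
    for uri_a, name_a in dataset_a.items():
        for uri_b, name_b in dataset_b.items():
            score = similarity(name_a, name_b)
            if score >= threshold:
                links.append((uri_a, "owl:sameAs", uri_b, round(score, 2)))
    return links


dataset_a = {"http://a.example/p1": "Linked Data BR"}
dataset_b = {
    "http://b.example/x": "linked data br",
    "http://b.example/y": "Emergency Management",
}
print(link_resources(dataset_a, dataset_b))
```

The threshold used and the score obtained for each link are exactly the kind of parameters and results that are worth keeping as provenance.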
>Provenance Opportunity
[Figure: Data Preparation and Transformation Process (Heterogeneous Data Sources, Extract, Clean, Conform, Pre-Integrate, Triplify) and Data Interlinking Process (Web Data Access, Schema Mappings, Identity Resolution, Quality Evaluator)]
All steps need heavy parameterization and produce a
lot of results
– Employed parameter values and techniques as well
as results obtained are all provenance data
>Linked Open Provenance Architecture
>Data Interlinking Scenarios
>Implementation of PGA
[Figure: Provenance Gathering Agent, Provenance Data Staging Database, RDF Triple Store]
The PGA wraps the ETL process and
stores provenance in data staging
tables to be further extracted,
triplified and loaded to the triple store
by other specific steps, developed
through Kettle API and Linked Open
Data frameworks
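The wrapping idea can be illustrated with a small sketch. It is not the authors' PGA (which is a Scala web service around Kettle); it is a Python stand-in that wraps one ETL step and records its parameters and row counts in an SQLite staging table. The table name `step_provenance`, its columns and the example step are all hypothetical.

```python
import json
import sqlite3
from datetime import datetime, timezone


def open_staging(path=":memory:"):
    """Open the staging database and create the provenance table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS step_provenance ("
        "step TEXT, parameters TEXT, rows_in INTEGER, "
        "rows_out INTEGER, finished_at TEXT)"
    )
    return conn


def with_provenance(conn, step, **parameters):
    """Wrap an ETL step so that every run is recorded in the staging table."""
    def wrapped(rows):
        result = step(rows, **parameters)
        conn.execute(
            "INSERT INTO step_provenance VALUES (?, ?, ?, ?, ?)",
            (step.__name__, json.dumps(parameters, sort_keys=True),
             len(rows), len(result),
             datetime.now(timezone.utc).isoformat()),
        )
        conn.commit()
        return result
    return wrapped


def drop_short_names(rows, min_length=3):
    """Example cleaning step: keep rows whose name is long enough."""
    return [r for r in rows if len(r["name"]) >= min_length]


conn = open_staging()
clean = with_provenance(conn, drop_short_names, min_length=4)
kept = clean([{"name": "GRECO"}, {"name": "abc"}])
print(conn.execute(
    "SELECT step, parameters, rows_in, rows_out FROM step_provenance"
).fetchall())
```

A later step would read these staging rows, triplify them and load them into the triple store, as described above.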
>Implementation of PGA
[Figure: Web Data Access, Schema Mappings, Identity Resolution]
Provenance Gathering Agent was
implemented as a web service
written in Scala (www.scala-lang.org)
>Use Case Scenario
CNPq = Brazilian governmental organization
responsible for fostering scientific research
RNP = Brazilian governmental organization
that finances research projects
>Use Case Scenario – First Part
>Querying Linked Open Provenance
SELECT ?group_name ?project_name ?researcher_uri ?process_name
FROM NAMED <http://linkgraph.provenance.br>
FROM NAMED <http://datagraph.provenance.br>
FROM NAMED <http://www.cnpq.br>
FROM NAMED <http://lattes.cnpq.br>
WHERE
{
  GRAPH <http://linkgraph.provenance.br> {
    ?row_uri provprop:cnpqResearchGroup ?group_uri .
    ?row_uri provprop:lattesProject ?project_uri .
    ?row_uri provprop:lattesResearcher ?researcher_uri . }
  GRAPH <http://datagraph.provenance.br> {
    ?row_uri opmv:wasGeneratedBy ?process_uri .
    ?process_uri provprop:composition ?process_def_uri .
    ?process_def_uri dcterms:title ?process_name . }
  GRAPH <http://www.cnpq.br> {
    ?group_uri cnpq:project ?project_uri .
    ?group_uri foaf:name ?group_name . }
  GRAPH <http://lattes.cnpq.br> {
    ?project_uri foaf:name ?project_name .
    ?researcher_uri foaf:name ?researcher_name . }
}
– Gets researcher’s groups, projects and researchers
from data graphs of domain dataset
– Also gets the integration process from provenance
graphs of Linked Open Provenance dataset
Data that were in different data sources of the CNPq
organization are now integrated in the Web of Data.
>Querying Linked Open Provenance
Query results (group_name | project_name | researcher_uri | process_name):
1. "GRECO - Grupo Engenharia do Conhecimento"@pt | "LinkedDataBR - Exposição, compartilhamento e conexão de recursos de dados abertos na Web (Linked Open Data)"@pt | http://lattes.cnpq.br/resource/Researcher/K4781460T3 | "Merge CNPq Research Groups x Lattes Projects"
2. "GRECO - Grupo Engenharia do Conhecimento"@pt | "Núcleo de Pesquisa de Sistemas Computacionais Complexos para a Gestão de Emergências"@pt | http://lattes.cnpq.br/resource/Researcher/K4717449A7 | "Merge CNPq Research Groups x Lattes Projects"
3. "GRECO - Grupo Engenharia do Conhecimento"@pt | "Identificação e Análise de Redes Sociais Complexas"@pt | http://lattes.cnpq.br/resource/Researcher/K4761314U5 | "Merge CNPq Research Groups x Lattes Projects"
>Use Case Scenario – Second Part
>Use Case Scenario – Provenance Evaluation
At the end of the execution of both processes, a
SPARQL query could be used to ask: “On which
projects does a researcher work?”
The result would include projects declared in the CNPq
dataset and in the RNP dataset
If the projects returned by CNPq diverge from those
returned by RNP, it is possible to investigate the cause
by querying and evaluating LOP data
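A first step in such an investigation is to see which projects appear in only one of the two result sets. The sketch below assumes the project URIs for one researcher have already been retrieved from each dataset; the function name and the example URIs are hypothetical.

```python
def diverging_projects(cnpq_projects, rnp_projects):
    """Report the projects that appear in only one of the two datasets."""
    cnpq, rnp = set(cnpq_projects), set(rnp_projects)
    return {
        "only_in_cnpq": sorted(cnpq - rnp),
        "only_in_rnp": sorted(rnp - cnpq),
    }


# Hypothetical project URIs returned for the same researcher.
cnpq = ["http://example.org/project/A", "http://example.org/project/B"]
rnp = ["http://example.org/project/B", "http://example.org/project/C"]
print(diverging_projects(cnpq, rnp))
```

The URIs reported here are the ones whose provenance (generating process, parameters, sources) would then be looked up in the LOP data.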
>Conclusion - Contributions
New strategy to provide provenance for data and links
of the Web of Data
LOD cycle is extended with a systematic data
preparation and transformation process, supported by
an ETL workflow framework
Provenance data is available according to LOD
principles (Linked Open Provenance)
>Conclusion – Future works
Development of provenance query interface
– Take advantage of LOP and support its exploration
Development / evolution of a provenance ontology
– Today, we are using a combination of vocabularies
Investigation in the area of Big Data
– Fine-grained provenance generates large volumes of
data
>Thank You !
LOP – Capturing and Linking Open
Provenance on LOD Cycle
Rogers R. de Mendonça (1), rogers@ufrj.br
Jonas F. S. M. De La Cerda (2), jonas.ferreira@uniriotec.br
Kelli F. de Cordeiro (1), kelli@ufrj.br
Sérgio M. S. da Cruz (3), serra@ufrrj.br
Maria Cláudia Cavalcanti (2), yoko@ime.eb.br
Maria Luiza M. Campos (1), mluiza@ppgi.ufrj.br
(1) Federal University of Rio de Janeiro - UFRJ
(2) Military Institute of Engineering - IME
(3) Federal Rural University of Rio de Janeiro - UFRRJ

Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with Platformless
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 

LOP – Capturing and Linking Open Provenance on LOD Cycle

  • 1. > LOP – Capturing and Linking Open Provenance on LOD Cycle. Rogers R. de Mendonça, Jonas F. S. M. De La Cerda, Kelli F. de Cordeiro, Sérgio M. S. da Cruz, Maria Cláudia Cavalcanti, Maria Luiza M. Campos. 5th International Workshop on Semantic Web Information Management (SWIM 2013), New York, USA – June 23, 2013
  • 2. >Outline Introduction – Provenance – Linked Open Data Lifecycle An Approach for Linked Open Provenance Capture – Data Preparation and Transformation Process – Data Interlinking Process – Linked Open Provenance Architecture – Usage Scenario Conclusion – Contributions – Future Works
  • 3. >Increase of the Web of Data What about data reliability and quality ?
  • 4. >Provenance Information about the history of the data: – Where did the data come from? – Who designed the publishing process? – Who executed the publishing process? – Which operations were applied to the data? Importance to the Web of Data: – Support quality and reliability assessment of the published data
  • 6. >Linked Open Provenance (LOP) Provenance data available according to LOD principles: 1. Use URIs as names for things 2. Use HTTP URIs, so that people can look up those names 3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL) 4. Include links to other URIs, so that they can discover more things
  • 7. >Related Works Ontologies / Vocabularies – PROV-O (PROV-DM) http://www.w3.org/TR/prov-o/ – OPMV (OPM) http://purl.org/net/opmv/ns – Cogs (ETL) http://vocab.deri.ie/cogs – Dublin Core Metadata Terms, FOAF
  • 8. >Related Works Use of provenance to support quality and reliability assessment of published data – Provenance Information in the Web of Data (HARTIG, 2009) – Managing the life-cycle of linked data with the LOD2 stack (AUER et al., 2012) – Linked Data Quality Assessment and Fusion (MENDES et al., 2012) Focus on metadata about the source and access of the data
  • 9. >Linked Open Data Lifecycle [Figure: LOD2 lifecycle with the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration]
  • 11. >Interlinking Phase Create and maintain links; quality assessment [Figure: LOD2 lifecycle]
  • 12. >Extraction Phase Extract and triplify data [Figure: LOD2 lifecycle, with the Interlinking callouts: create and maintain links; quality assessment]
  • 13. >Extension of Extraction Phase [Figure: LOD2 lifecycle with an added Preparation phase]
  • 14. >Extension: Preparation Before Triplification Extract, prepare and triplify data [Figure: LOD2 lifecycle with an added Preparation phase]
  • 15. >Data Publishing and Interlinking Process
  • 16. >Data Publishing and Interlinking Process Extraction Phase
  • 17. >Data Preparation and Transformation Process Steps over heterogeneous data sources: Extract, Clean, Conform, Pre-Integrate, Triplify ETL (Extraction-Transformation-Loading) approach: – Foundation of DW systems – Its techniques and tools have been developed and refined over many years in challenging BI scenarios – It is very advantageous to inherit the potential of these techniques and tools to publish LOD and LOP
  • 18. >Data Preparation and Transformation Process Steps over heterogeneous data sources: Extract, Clean, Conform, Pre-Integrate, Triplify Use of a workflow to have: – Systematization of the publishing process – Monitoring and management of the several tasks – Facilities for reusing the process Pentaho Data Integration (a.k.a. Kettle) – Open source, large community of users, extensible
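The five preparation steps named on slides 17 and 18 can be illustrated with a minimal Python sketch. This is not the authors' Kettle workflow: the function names, the column renames, and the `http://example.org/` URIs are all assumptions made up for illustration, and the "sources" are plain lists of dictionaries that stand in for real heterogeneous data sources.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]
Triple = Tuple[str, str, str]

# Hypothetical base URI and property namespace, used only for illustration.
BASE = "http://example.org/resource/"
VOCAB = "http://example.org/vocab/"


def extract(sources: Iterable[List[Record]]) -> List[Record]:
    """Extract: copy the rows of every (already loaded) source into one list."""
    return [dict(row) for source in sources for row in source]


def clean(rows: List[Record]) -> List[Record]:
    """Clean: trim whitespace and drop rows that have no identifier."""
    cleaned = [{k: v.strip() for k, v in row.items()} for row in rows]
    return [row for row in cleaned if row.get("id")]


def conform(rows: List[Record]) -> List[Record]:
    """Conform: map source-specific column names onto one shared schema."""
    renames = {"nome": "name", "title": "name"}  # assumed column mappings
    return [{renames.get(k, k): v for k, v in row.items()} for row in rows]


def pre_integrate(rows: List[Record]) -> List[Record]:
    """Pre-integrate: merge rows that share the same identifier."""
    merged: Dict[str, Record] = {}
    for row in rows:
        merged.setdefault(row["id"], {}).update(row)
    return list(merged.values())


def triplify(rows: List[Record]) -> List[Triple]:
    """Triplify: emit one (subject, predicate, object) triple per attribute."""
    return [
        (BASE + row["id"], VOCAB + key, value)
        for row in rows
        for key, value in row.items()
        if key != "id"
    ]


def run_pipeline(sources: Iterable[List[Record]]) -> List[Triple]:
    """Chain the five steps in the order given on the slides."""
    return triplify(pre_integrate(conform(clean(extract(sources)))))
```

Keeping each step as a separate function mirrors the workflow idea on slide 18: every step can be monitored, parameterized, and reused on its own.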
  • 19. >Data Publishing and Interlinking Process Extraction Phase Interlinking Phase
  • 20. >Data Interlinking Process Components: Web Data Access, Schema Mappings, Identity Resolution, Quality Evaluator
  • 21. >Data Interlinking Process Web Data Access: extracts data from its original sources
  • 22. >Data Interlinking Process Schema Mappings: matches corresponding terms of multiple vocabularies
  • 23. >Data Interlinking Process Identity Resolution: finds and links similar resources on different datasets
  • 24. >Data Interlinking Process Quality Evaluator: evaluates data quality based on a set of rules
  • 25. >Provenance Opportunity [Figure: Data Preparation and Transformation Process (Extract, Clean, Conform, Pre-Integrate, Triplify over heterogeneous data sources) and Data Interlinking Process (Web Data Access, Schema Mappings, Identity Resolution, Quality Evaluator)] All steps need heavy parameterization and produce a lot of results – Employed parameter values and techniques, as well as the results obtained, are all provenance data
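The observation that parameter values, techniques, and results are all provenance data can be sketched as a wrapper that records them for every step call. The decorator, the log structure, and the toy identity-resolution step below are hypothetical and only illustrate the idea; they are not the Provenance Gathering Agent described in the slides.

```python
import time
from difflib import SequenceMatcher
from typing import Any, Callable, Dict, List, Tuple

# In-memory provenance log; a real agent would persist these records.
PROVENANCE_LOG: List[Dict[str, Any]] = []


def with_provenance(step_name: str) -> Callable:
    """Wrap a step so that each call records its parameters and result sizes."""
    def decorator(func: Callable) -> Callable:
        def wrapper(data: list, **params: Any) -> list:
            started = time.time()
            result = func(data, **params)
            PROVENANCE_LOG.append({
                "step": step_name,
                "technique": func.__name__,
                # Only explicitly passed keyword parameters are recorded.
                "parameters": dict(params),
                "input_count": len(data),
                "output_count": len(result),
                "started_at": started,
                "ended_at": time.time(),
            })
            return result
        return wrapper
    return decorator


@with_provenance("Identity Resolution")
def link_similar(pairs: List[Tuple[str, str]], threshold: float = 0.9) -> list:
    """Keep candidate pairs whose label similarity reaches the threshold."""
    return [
        (a, b)
        for a, b in pairs
        if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold
    ]
```

Calling `link_similar(candidates, threshold=0.9)` returns the accepted links and appends one record to `PROVENANCE_LOG`, so the threshold that produced a given link can be inspected later.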
  • 26. >Linked Open Provenance Architecture
  • 28. >Implementation of PGA [Figure: Provenance Gathering Agent, Provenance Data, Staging Database, RDF Triple, Triple Store]
  • 29. >Implementation of PGA The PGA wraps the ETL process and stores provenance data in staging tables, to be further extracted, triplified and loaded to the triple store by other specific steps, developed through the Kettle API and Linked Open Data frameworks [Figure: Provenance Data, Staging Database, RDF Triple, Triple Store]
  • 30. >Implementation of PGA [Figure: Web Data Access, Schema Mappings, Identity Resolution] The Provenance Gathering Agent was implemented as a web service written in Scala (www.scala-lang.org)
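The staging-then-triplify flow described for the PGA can be sketched with an in-memory SQLite table. The table layout, the function names, and the `http://example.org/prov/` namespace are assumptions for illustration only; the slides do not show the actual staging schema, and the real implementation uses the Kettle API and a Scala web service rather than Python.

```python
import sqlite3
from typing import List
from urllib.parse import quote

PROV = "http://example.org/prov/"  # hypothetical namespace, not from the slides


def create_staging(conn: sqlite3.Connection) -> None:
    """Create an assumed staging table for captured provenance records."""
    conn.execute(
        "CREATE TABLE provenance_staging "
        "(process TEXT, step TEXT, parameter TEXT, value TEXT)"
    )


def stage(conn: sqlite3.Connection, process: str, step: str,
          parameter: str, value: str) -> None:
    """Store one captured parameter value for a step of a process."""
    conn.execute(
        "INSERT INTO provenance_staging VALUES (?, ?, ?, ?)",
        (process, step, parameter, value),
    )


def triplify_staging(conn: sqlite3.Connection) -> List[str]:
    """Turn every staged row into one N-Triples style line."""
    rows = conn.execute(
        "SELECT process, step, parameter, value "
        "FROM provenance_staging ORDER BY rowid"
    )
    lines = []
    for process, step, parameter, value in rows:
        subject = f"<{PROV}process/{quote(process)}/step/{quote(step)}>"
        predicate = f"<{PROV}param/{quote(parameter)}>"
        # Escape backslashes and double quotes inside the literal.
        literal = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{subject} {predicate} "{literal}" .')
    return lines
```

The resulting lines could then be loaded into a triple store by a separate step, which keeps provenance capture decoupled from publication.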
  • 32. >Use Case Scenario CNPq = Brazilian governmental organization responsible for fostering scientific research; RNP = Brazilian governmental organization that finances research projects
  • 33. >Use Case Scenario – First Part
  • 34. >Use Case Scenario – First Part
  • 35. >Querying Linked Open Provenance Gets research groups, projects and researchers from the data graphs of the domain dataset:
      SELECT ?group_name ?project_name ?researcher_uri ?process_name
      FROM NAMED <http://linkgraph.provenance.br>
      FROM NAMED <http://datagraph.provenance.br>
      FROM NAMED <http://www.cnpq.br>
      FROM NAMED <http://lattes.cnpq.br>
      WHERE {
        GRAPH <http://linkgraph.provenance.br> {
          ?row_uri provprop:cnpqResearchGroup ?group_uri .
          ?row_uri provprop:lattesProject ?project_uri .
          ?row_uri provprop:lattesResearcher ?researcher_uri .
        }
        GRAPH <http://datagraph.provenance.br> {
          ?row_uri opmv:wasGeneratedBy ?process_uri .
          ?process_uri provprop:composition ?process_def_uri .
          ?process_def_uri dcterms:title ?process_name .
        }
        GRAPH <http://www.cnpq.br> {
          ?group_uri cnpq:project ?project_uri .
          ?group_uri foaf:name ?group_name .
        }
        GRAPH <http://lattes.cnpq.br> {
          ?project_uri foaf:name ?project_name .
          ?researcher_uri foaf:name ?researcher_name .
        }
      }
      Data that were in different data sources of the CNPq organization are now integrated in the Web of Data.
  • 36. >Querying Linked Open Provenance Same query as on slide 35, with the annotation: Also gets the integration process from the provenance graphs of the Linked Open Provenance dataset
  • 37. >Querying Linked Open Provenance Query result:
      group_name | project_name | researcher_uri | process_name
      "GRECO - Grupo Engenharia do Conhecimento"@pt | "LinkedDataBR - Exposição, compartilhamento e conexão de recursos de dados abertos na Web (Linked Open Data)"@pt | http://lattes.cnpq.br/resource/Researcher/K4781460T3 | "Merge CNPq Research Groups x Lattes Projects"
      "GRECO - Grupo Engenharia do Conhecimento"@pt | "Núcleo de Pesquisa de Sistemas Computacionais Complexos para a Gestão de Emergências"@pt | http://lattes.cnpq.br/resource/Researcher/K4717449A7 | "Merge CNPq Research Groups x Lattes Projects"
      "GRECO - Grupo Engenharia do Conhecimento"@pt | "Identificação e Análise de Redes Sociais Complexas"@pt | http://lattes.cnpq.br/resource/Researcher/K4761314U5 | "Merge CNPq Research Groups x Lattes Projects"
  • 38. >Use Case Scenario – Second Part
  • 39. >Use Case Scenario – Second Part
  • 40. >Use Case Scenario – Provenance Evaluation At the end of the execution of both processes, a SPARQL query could be used to ask: “On which projects does a researcher work?” The result would include projects declared in the CNPq dataset and in the RNP dataset. If the projects returned by CNPq diverge from those returned by RNP, it is possible to investigate the cause by querying and evaluating LOP data
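The divergence check described on slide 40 can be sketched as a set comparison followed by a provenance lookup. The function names and the `generated_by` mapping (project identifier to the process that produced it) are hypothetical; in the approach presented here that information would come from SPARQL queries over the LOP data rather than from Python dictionaries.

```python
from typing import Dict, Set


def diverging_projects(cnpq: Set[str], rnp: Set[str]) -> Dict[str, Set[str]]:
    """Return the projects that only one of the two datasets declares."""
    return {"only_in_cnpq": cnpq - rnp, "only_in_rnp": rnp - cnpq}


def processes_to_inspect(divergence: Dict[str, Set[str]],
                         generated_by: Dict[str, str]) -> Set[str]:
    """Look up which publishing processes generated the diverging projects."""
    projects = divergence["only_in_cnpq"] | divergence["only_in_rnp"]
    return {generated_by[p] for p in projects if p in generated_by}
```

The returned process names point to the steps whose parameters and results are worth examining first when the two datasets disagree.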
  • 41. >Conclusion – Contributions New strategy to provide provenance for data and links of the Web of Data. The LOD cycle is extended with a systematic data preparation and transformation process, supported by an ETL workflow framework. Provenance data is available according to LOD principles (Linked Open Provenance)
  • 42. >Conclusion – Future works Development of provenance query interface – Take advantage of LOP and support its exploration Development / evolution of a provenance ontology – Today, we are using a combination of vocabularies Investigation in the area of Big Data – Fine-grained provenance generates large volumes of data
  • 43. >Thank You ! LOP – Capturing and Linking Open Provenance on LOD Cycle Rogers R. de Mendonça 1 rogers@ufrj.br Jonas F. S. M. De La Cerda 2 jonas.ferreira@uniriotec.br Kelli F. de Cordeiro 1 kelli@ufrj.br Sérgio M. S. da Cruz 3 serra@ufrrj.br Maria Cláudia Cavalcanti 2 yoko@ime.eb.br Maria Luiza M. Campos 1 mluiza@ppgi.ufrj.br 1 Federal University of Rio de Janeiro - UFRJ 2 Military Institute of Engineering - IME 3 Federal Rural University of Rio de Janeiro - UFRRJ