Gagg: A Graph Aggregation Operator
June 2nd 2015
Fadi Maali*, Stephane Campinas, Stefan Decker
ESWC2015
* Funded by the Irish Research Council
The Famous LOD Cloud
http://lod-cloud.net/
The Famous LOD Cloud - COLOURED
http://lod-cloud.net/
The Famous LOD Cloud
(from a different angle)
Graph Aggregation
Condenses a large graph into a structurally
similar but smaller graph by collapsing vertices
and edges
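A minimal Python sketch of this idea, assuming the graph is given as a list of (source, target) edges and a hypothetical `group_of` mapping says which group each vertex collapses into. The names and the toy data are illustrative only and do not come from the talk.

```python
from collections import Counter

def aggregate(edges, group_of):
    """Collapse vertices into groups and merge the edges between groups.

    Returns (vertex_counts, edge_counts): how many original vertices fall
    into each group, and how many original edges connect each group pair.
    """
    vertex_counts = Counter(group_of.values())
    edge_counts = Counter((group_of[s], group_of[t]) for s, t in edges)
    return vertex_counts, edge_counts

# Toy data (hypothetical): five vertices assigned to two groups.
group_of = {"a": "media", "b": "media",
            "c": "crossdomain", "d": "crossdomain", "e": "crossdomain"}
edges = [("a", "c"), ("b", "c"), ("b", "d"), ("e", "a")]

vertex_counts, edge_counts = aggregate(edges, group_of)
print(dict(vertex_counts))
print(dict(edge_counts))
```

The result has one vertex per group and one edge per connected group pair, so it keeps the shape of the original graph at a smaller size.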
Graph Aggregation - Schema Discovery
Introducing RDF Graph Summary with application to Assisted SPARQL Formulation
Graph Aggregation - Requirements
[Figure: example VoID graph. A :linkset node points to :dbpedia (subjectsTarget) and :bbc-music (objectsTarget). Nodes carry triples counts (23k, 1.2b, 20m), a subject (:crossdomain, :media) and a license (:open-license, :cc-by-sa, :closed-license, :bbc-terms).]
Graph Aggregation Methods
1. Custom Code
error prone, time, efficiency…
2. SPARQL
error prone, time, efficiency…
3. Graph Databases
expressivity, optimisation…
4. Gagg, a first-class operator
Operational Semantics
In-memory evaluation algorithm
Experimental evaluation
Gagg: Two-step Aggregation
● Relation & measure
● Subject dimension(s) & measure
● Object dimension(s) & measure
Uses aggregation functions and a template similar to CONSTRUCT queries
Graph Aggregation - Requirements
[Figure: the example graph again, with the relation and its measure marked. They are matched by:]
?l a void:Linkset ;
   void:subjectsTarget ?s ;
   void:objectsTarget ?o ;
   void:triples ?m .
Graph Aggregation - Requirements
[Figure: the example graph again. The subject dimension and measure are matched by:]
?s dct:subject ?sd ;
   void:triples ?sm .
Graph Aggregation - Requirements
[Figure: the example graph again. The object dimension and measure are matched by:]
?o dct:subject ?od ;
   void:triples ?om .
Graph Aggregation - Definition
Q = (D, M, E, N, R, f)
D: subject dimensions (variables ?x ?sd)
M: subject measure (variables ?x ?sm)
E: object dimensions (variables ?y ?od)
N: object measure (variables ?y ?om)
R: relation query (variables ?x ?m ?p ?y)
f: reduce function
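Graph Aggregation - Grouped Graph
[Figure: grouped graph in which a crossdomain group is connected to a media group by a linksTo edge; bags of measures (10k, 33k, 3k, 1.2M, ...) are attached.]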
Graph Aggregation - Evaluation
● Build a binding table with columns ?x ?m ?p ?y ?sd ?sm ?od ?om
● Build the Grouped Graph: an O(|B|) algorithm, where |B| is the size of the binding table B
● Apply the reduction function
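The three steps can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the Jena implementation from the talk: the binding table is assumed to be already built as a list of tuples (x, m, p, y, sd, sm, od, om), the function name `gagg` and the dictionary layout of the grouped graph are hypothetical, and each subject or object is assumed to contribute its measure once per group.

```python
from collections import defaultdict

def gagg(binding_table, reduce_fn):
    """Evaluate a graph aggregation over an already-built binding table.

    Each row is a tuple (x, m, p, y, sd, sm, od, om). A single pass over
    the rows builds the grouped graph, so that step is linear in the
    number of rows.
    """
    subj = defaultdict(dict)   # sd -> {x: sm}
    obj = defaultdict(dict)    # od -> {y: om}
    rel = defaultdict(list)    # (sd, p, od) -> [m, ...]

    for x, m, p, y, sd, sm, od, om in binding_table:
        subj[sd][x] = sm       # keyed by x so each subject counts once
        obj[od][y] = om        # keyed by y so each object counts once
        rel[(sd, p, od)].append(m)

    # Reduction step: apply the reduce function to every bag of measures.
    return (
        {sd: reduce_fn(list(ms.values())) for sd, ms in subj.items()},
        {od: reduce_fn(list(ms.values())) for od, ms in obj.items()},
        {edge: reduce_fn(ms) for edge, ms in rel.items()},
    )

# Toy binding table (hypothetical values).
rows = [
    ("d1", 10, "linksTo", "d3", "crossdomain", 100, "media", 50),
    ("d1", 5,  "linksTo", "d4", "crossdomain", 100, "media", 70),
    ("d2", 3,  "linksTo", "d3", "crossdomain", 200, "media", 50),
]

subjects, objects, relations = gagg(rows, sum)
print(subjects, objects, relations)
```

Using dictionaries keyed by the dimension values keeps the grouping pass to one lookup per row, which is where the linear bound comes from in this sketch.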
Graph Aggregation - Experiment Setup
● Extended in-memory Apache Jena using SSE
● Benchmark data: BSBM and SP2B
● Aggregation queries: Type Summary and Bibliometrics
Graph Aggregation - Experiment Results
Type Summary on BSBM data

Data size (#triples)  fullSPARQL  3SPARQLs  reduced  Gagg
5k                          0.08      0.06     0.01  0.03
190k                        9.84      1.25     0.42  0.55
370k                       31.88      2.82     1.00  1.13
1.8M                      454.07     13.48     4.37  5.61
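To make the gap between the approaches easier to read, the short Python snippet below derives the ratio of the fullSPARQL time to the Gagg time at each data size. The timings are copied from the table above (the slide does not state the unit); the variable names are illustrative.

```python
# Timings copied from the results table; unit as reported on the slide.
timings = {
    "5k":   {"fullSPARQL": 0.08,   "3SPARQLs": 0.06,  "reduced": 0.01, "Gagg": 0.03},
    "190k": {"fullSPARQL": 9.84,   "3SPARQLs": 1.25,  "reduced": 0.42, "Gagg": 0.55},
    "370k": {"fullSPARQL": 31.88,  "3SPARQLs": 2.82,  "reduced": 1.00, "Gagg": 1.13},
    "1.8M": {"fullSPARQL": 454.07, "3SPARQLs": 13.48, "reduced": 4.37, "Gagg": 5.61},
}

# Ratio of the single full SPARQL query to the Gagg operator.
speedup = {size: round(t["fullSPARQL"] / t["Gagg"], 1)
           for size, t in timings.items()}

for size, ratio in speedup.items():
    print(f"{size}: fullSPARQL takes about {ratio}x the Gagg time")
```

On these numbers the ratio grows with the data size, from under 3x at 5k triples to roughly 81x at 1.8M triples.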
Conclusion
Graph aggregation as a first-class operator
- Easier for users to express
- Easier for engines to support and optimise
- Easier for further research and study
Further Questions:
- Syntax and effect on SPARQL
- Distributed implementation
PREFIX : <http://example.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
CONSTRUCT {
  _:b0 a ?t1 ; :count COUNT(?sub.s) .
  _:b1 a ?t2 ; :count COUNT(?obj.o) .
  _:b2 a rdf:Statement ; rdf:predicate ?p ; rdf:subject _:b0 ;
       rdf:object _:b1 ; :count ?prop_count
} WHERE {
  GRAPH_AGGREGATION {
    ?s ?p ?o
    {?s a ?t1} GROUP BY ?t1 AS ?sub
    {?o a ?t2} GROUP BY ?t2 AS ?obj
  }
}
SELECT ?t1 ?count_s ?subId ?t2 ?count_o ?objId ?p (COUNT(*) AS ?rel_count) {
  {
    SELECT ?t1 ?count_s ?subId ?t2 ?count_o ?objId ?p {
      ?s a ?t1 . ?s ?p ?o . ?o a ?t2 .
      {
        # Number of distinct subjects per subject type
        SELECT ?t1 ?subId (COUNT(DISTINCT ?s) AS ?count_s) {
          ?s a ?t1 . ?s ?p ?o . ?o a ?t2 .
          BIND (iri(CONCAT(str(?t1), "_s")) AS ?subId)
        } GROUP BY ?t1 ?subId
      }
      {
        # Number of distinct objects per object type
        SELECT ?t2 ?objId (COUNT(DISTINCT ?o) AS ?count_o) {
          ?s a ?t1 . ?s ?p ?o . ?o a ?t2 .
          BIND (iri(CONCAT(str(?t2), "_o")) AS ?objId)
        } GROUP BY ?t2 ?objId
      }
    }
  }
# ?p is projected, so it must also appear in GROUP BY
} GROUP BY ?t1 ?count_s ?t2 ?count_o ?subId ?objId ?p
