BigDataEurope - Supporting the
Variety Dimension of Big Data
Mohamed Nadjib MAMI - Fraunhofer IAIS
ICWE17 - 06.06.2017
Big Data Europe - the Project
◎ Funded by the EU Horizon 2020 programme
◎ Coordination & Support Action (CSA) project
o Show the societal value of Big Data in 7 domains
o Lower the barrier for using Big Data technologies
=> BigDataEurope Platform
Consortium Partners
Consortium of 17 Partners
o Industry, SMEs, universities, research institutes, etc.
Big Data Europe - The Platform
◎ Integrator of Big Data technologies
o Easy to use/get started (plug-and-play)
o Flexible, Customisable
◎ Bundles only open-source solutions
o Data Storage
o Message Passing
o Data Processing
o Data Searching & Publishing
◎ Publicly released in May 2017
BDE Platform - Components (some)
◎ Search/Indexing: Apache Solr, Elasticsearch
◎ Data Processing: Apache Spark, Apache Flink
◎ Data Acquisition: Apache Flume
◎ Message Passing: Apache Kafka
◎ Data Storage: Apache Hadoop, Apache Cassandra, Apache Hive, PostGIS
◎ Semantic Components: Strabon, Sextant, GeoTriples, Silk, SEMAGROW, LIMES, 4Store, OpenLink Virtuoso
BDE Platform - Architecture
◎ Support Layer: Init Daemon, GUIs, Base Setup
◎ App Layer: Traffic Forecast, Satellite Image Analysis, Real-time Stream Monitoring, ...
◎ Platform Layer: Spark, Flink, Kafka, Semantic Layer (Ontario, SANSA, Semagrow), ...
◎ Resource Management Layer (Swarm)
◎ Hardware Layer: Premises, Cloud (AWS, GCE, MS Azure, …)
◎ Data Layer: Hadoop, NOSQL Store, Cassandra, Elasticsearch, ..., RDF Store
o Semantic Data Lake (Unified View)
BDE Platform - Hardware & Virtualization
◎ Docker used for packaging and deploying applications
◎ Based on containers:
o A lightweight environment to make a piece of
software run in isolation
❖ Shares the host operating system kernel (unlike
VMs)
❖ Reduces conflicts, e.g., between software versions
◎ Docker Compose: creates multi-container applications
BDE Platform - Resource Management
◎ Swarm (mode) used for managing, scheduling and
orchestrating Docker containers in multi-node clusters
◎ It provides:
o Scalability and fault tolerance
o Container interlinking
o Log-based monitoring
◎ Separates hardware management from software management
◎ Based on Services
o Service: the Swarm execution unit, running a Docker image
BDE Platform - Support Layer
◎ Init Daemon: orchestrates the initialization process of
the components (containers of Docker Compose):
o Components report their initialization progress
o It validates whether a specific component can start
o It specifies the dependencies between services
o It indicates where human interaction is required
◎ Examples:
o Wait for data to load into HDFS before starting a Spark job
o Wait for the Spark Master to start successfully before starting a Worker
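
The dependency checks described above can be sketched in a few lines of Python. This is only an illustrative model, not the init daemon's actual interface: the class name `InitDaemon`, its methods, and the component names are assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class InitDaemon:
    """Toy stand-in for an init daemon (hypothetical API, for illustration)."""

    # component name -> components that must finish first
    dependencies: dict[str, list[str]] = field(default_factory=dict)
    # component name -> "not started" | "running" | "finished"
    status: dict[str, str] = field(default_factory=dict)

    def register(self, name: str, depends_on: list[str] | None = None) -> None:
        """Declare a component and the components it depends on."""
        self.dependencies[name] = list(depends_on or [])
        self.status[name] = "not started"

    def can_start(self, name: str) -> bool:
        """A component may start once all of its dependencies have finished."""
        return all(self.status[dep] == "finished" for dep in self.dependencies[name])

    def report(self, name: str, new_status: str) -> None:
        """Components report their own initialization progress."""
        if new_status == "running" and not self.can_start(name):
            raise RuntimeError(f"{name} cannot start yet")
        self.status[name] = new_status


# Hypothetical pipeline mirroring the two examples on the slide.
daemon = InitDaemon()
daemon.register("hdfs-data-load")
daemon.register("spark-master")
daemon.register("spark-worker", depends_on=["spark-master"])
daemon.register("spark-job", depends_on=["hdfs-data-load", "spark-worker"])
```

In this sketch a worker asks `can_start("spark-worker")` and only proceeds once the master has reported `"finished"`.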
BDE Platform - User Interfaces
Pipeline Builder: creates a step-by-step dependency
pipeline (fed to the init daemon)
[Figure: pipeline of Component 1, Component 2, Component 3]
BDE Platform - User Interfaces
Pipeline Monitor: displays the status (not started, running or finished) of
components in a running pipeline (retrieved from the init daemon)
[Figure: Component 1 - finished; Component 2 - finished; Component 3 - in progress]
BDE Platform - User Interfaces
Swarm UI: clones a Git repository containing a
pipeline, then deploys, controls and monitors it on Swarm
BDE Platform - User Interfaces
Integrator UI: displays the dashboard of each running
component in a unified interface
BDE Platform - Semantic Layer > Ontario
◎ Data Lake or Swamp?
o Repository of data in its original formats
o Structured, semi-structured, unstructured
o Without unified schema
◎ Semantic Data Lake (Ontario)
o Add a Semantic Layer on top of the source datasets
❖ The data is semantically lifted using ontology
terms
❖ Provide a uniform view over nonuniform data
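
As a rough illustration of semantic lifting, the Python sketch below turns rows of a CSV source into triples whose predicates are ontology terms. The column-to-term mapping, the helper name `lift_rows`, the `obs:` identifier prefix and the sample row are all assumptions made for this example; the ontology terms are borrowed from the example query later in this deck.

```python
import csv
import io

# Hypothetical mapping from CSV column names to ontology terms.
COLUMN_TO_TERM = {
    "country": "gho:Country",
    "disease": "gho:Disease",
    "deaths": "eg:incidence",
}


def lift_rows(csv_text, entity_class, id_prefix):
    """Turn each CSV row into triples that use ontology terms as predicates."""
    triples = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for i, row in enumerate(reader, start=1):
        subject = f"{id_prefix}{i}"
        # Type the row with an ontology class.
        triples.append((subject, "a", entity_class))
        # Rename each known column to its ontology term.
        for column, term in COLUMN_TO_TERM.items():
            if column in row:
                triples.append((subject, term, row[column]))
    return triples


# Made-up sample data, for illustration only.
SAMPLE = "country,disease,deaths\nIndia,Tuberculosis,100\n"
triples = lift_rows(SAMPLE, "qb:Observation", "obs:")
```

Once every source is described with the same terms, a single query vocabulary can address all of them, which is the uniform view the slide refers to.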
BDE Platform - Semantic Layer > Ontario
SELECT (COUNT(DISTINCT ?publication) AS ?no_of_publications)
(COUNT(?deaths) AS ?no_of_deaths)
WHERE {
?item a qb:Observation .
?item gho:Country ?country .
?item gho:Disease ?disease .
?item att:unitMeasure gho:Measure .
?item eg:incidence ?deaths .
?country rdfs:label "India" .
?disease rdfs:label "Tuberculosis".
?trial a ct:trials .
?trial ct:condition ?condition .
?trial ct:location ?location .
?trial ct:reference ?publication.
?condition owl:sameAs ?disease .
?location redd:locatedIn ?country .
?publication ct:citation ?citation.
}
[Figure: the same triple patterns, regrouped by subject variable (?item, ?trial, ?condition, ?disease, ?country, ?location, ?publication)]
Query “number of distinct publications and number of
distinct deaths due to the disease Tuberculosis in India”
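
One plausible way to prepare such a query for source-by-source processing is to group its triple patterns by subject into star-shaped groups. The Python sketch below does only that grouping; the function name `group_by_subject` is hypothetical, and whether Ontario decomposes queries exactly this way is an assumption.

```python
from collections import defaultdict

# Triple patterns copied from the query above, as (subject, predicate, object).
PATTERNS = [
    ("?item", "a", "qb:Observation"),
    ("?item", "gho:Country", "?country"),
    ("?item", "gho:Disease", "?disease"),
    ("?item", "att:unitMeasure", "gho:Measure"),
    ("?item", "eg:incidence", "?deaths"),
    ("?country", "rdfs:label", '"India"'),
    ("?disease", "rdfs:label", '"Tuberculosis"'),
    ("?trial", "a", "ct:trials"),
    ("?trial", "ct:condition", "?condition"),
    ("?trial", "ct:location", "?location"),
    ("?trial", "ct:reference", "?publication"),
    ("?condition", "owl:sameAs", "?disease"),
    ("?location", "redd:locatedIn", "?country"),
    ("?publication", "ct:citation", "?citation"),
]


def group_by_subject(patterns):
    """Group triple patterns that share a subject into one star each."""
    groups = defaultdict(list)
    for subject, predicate, obj in patterns:
        groups[subject].append((predicate, obj))
    return dict(groups)


stars = group_by_subject(PATTERNS)
```

Each resulting group touches a single kind of entity, so it can be handed to whichever source describes that entity.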
BDE Platform - Semantic Layer > Ontario
Query processing steps:
1. Query Parsing & Validation
2. Planning
3. Meta-wrapper invocation
Meta-wrappers: Publications, Trials, Conditions, Observations
BDE Platform - Semantic Layer > Ontario
Query processing steps (continued):
4. Wrapper Selection & Query Translation
5. Query Execution
Meta-wrappers: Publications, Observations, Trials, Conditions
Wrappers: XML, CSV, RDF
Query translation example (via mapping rules):
?item gho:Country ?country .
?item gho:Disease ?disease .
...
=>
SELECT country, disease, ...
FROM Observations
Other translation targets shown: [XPath], [SPARQL]
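
The translation step can be sketched in Python as follows, assuming mapping rules are a simple predicate-to-column dictionary. The rule format and the function name `star_to_sql` are hypothetical; Ontario's real mapping rules are not shown in this deck.

```python
# Hypothetical mapping rules: ontology predicate -> column of a tabular source.
MAPPING_RULES = {
    "gho:Country": "country",
    "gho:Disease": "disease",
}


def star_to_sql(patterns, table, rules=MAPPING_RULES):
    """Translate triple patterns sharing one subject into a SELECT over one table."""
    subjects = {subject for subject, _, _ in patterns}
    if len(subjects) != 1:
        raise ValueError("expected a star-shaped group with a single subject")
    columns = []
    for _, predicate, _ in patterns:
        if predicate not in rules:
            raise KeyError(f"no mapping rule for {predicate}")
        columns.append(rules[predicate])
    return f"SELECT {', '.join(columns)} FROM {table}"


sql = star_to_sql(
    [
        ("?item", "gho:Country", "?country"),
        ("?item", "gho:Disease", "?disease"),
    ],
    table="Observations",
)
```

A wrapper for an XML or RDF source would follow the same pattern but emit XPath or SPARQL instead of SQL.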
BDE Platform - Semantic Layer > SANSA
SANSA: a framework for distributed RDF data processing
(http://sansa-stack.net)
◎ Read/write Layer: reads and writes native RDF/OWL data in
distributed storage, e.g., Hadoop, Spark (RDD, DataFrames,
GraphX), Tensors, following different representations &
partitioning schemes, e.g., graphs, tables
◎ Querying Layer: queries distributed RDF using SPARQL
(SPARQL-to-SQL approaches, Virtual Views, Intelligent
Indexing, ...)
BDE Platform - Semantic Layer > SANSA
◎ Inference Layer: Derive new facts from
existing ones, detect inconsistencies,
extract new rules to help in reasoning
◎ Machine Learning Layer: Perform ML
or analytics to gain insights for relevant
trends, predictions or detection of
anomalies from RDF data
o Tensor Factorization for e.g. KB
completion (testing stage)
o Graph Clustering (testing stage)
o Association rule mining (evaluation stage)
o Semantic Decision trees (idea stage)
o Inference in Knowledge Graph
Embeddings (idea stage)
BDE Platform - Semantic Layer > Semagrow
Semagrow: a SPARQL query processing system that federates
multiple remote endpoints
◎ Original Semagrow
o Optimizes queries transparently
o Executes sub-queries in the remote endpoints
o Integrates results dynamically in heterogeneous data
models
o Joins the partial results into the final query answer
◎ Next-gen Semagrow
o Supports different query languages
o Query planner and execution engine adapted
e.g., translate SPARQL to CQL for Cassandra
databases
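
The federation idea above (run sub-queries at separate endpoints, then join the partial results) can be illustrated with a small Python sketch. The two endpoint functions are in-memory stand-ins with made-up bindings, and `join_bindings` is a naive nested-loop join chosen for clarity; none of this reflects Semagrow's actual planner or API.

```python
def join_bindings(left, right):
    """Join two lists of variable bindings on the variables they share."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            # Keep the pair only if every shared variable agrees.
            if all(l_row[var] == r_row[var] for var in shared):
                joined.append({**l_row, **r_row})
    return joined


# Stand-ins for two remote endpoints; the bindings are made up.
def endpoint_a():
    return [
        {"?trial": "t1", "?country": "India"},
        {"?trial": "t2", "?country": "Jordan"},
    ]


def endpoint_b():
    return [
        {"?trial": "t1", "?publication": "p1"},
        {"?trial": "t3", "?publication": "p9"},
    ]


answer = join_bindings(endpoint_a(), endpoint_b())
```

Only the `t1` rows agree on `?trial`, so the final answer contains a single merged binding.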
BDE Showcases (pilots)
SC1 - Open PHACTS discovery platform relating to biological/medical questions
SC2 - Discovery and Linking of Viticulture-relevant information
SC3 - System monitoring in energy production units
SC4 - Short-term traffic flow forecasting
SC5 - Supporting data-intensive climate research
SC6 - Citizens & Researchers Budget on Municipal Level
SC7 - Ingestion of remote sensing images and social sensing data to detect and verify
changes on the Earth surface for security applications
◎ 7 Societal Challenges > 7 pilot implementations
Showcase SC1: Health, demographic
change and wellbeing
◎ SC1 implements the Open PHACTS Discovery Platform
o Integrates and links data from multiple sources:
ChEBI, ChEMBL, the Gene Ontology and UniProt
(Chemistry, Biological, Medical, etc.)
o Explores the relationships between data
(compounds, targets, pathways, diseases and
tissues)
o Data accessed using RESTful-API requests
❖ Translated to SPARQL queries
◎ Technologies used:
o 4Store, Memcached, MySQL, Puelia, Swagger
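
The REST-to-SPARQL translation mentioned above can be pictured with a minimal Python sketch. It is not the Open PHACTS API: the parameter name `label`, the `ex:hasTarget` predicate and the query template are placeholders invented for this example.

```python
from string import Template

# Hypothetical query template; "ex:hasTarget" is a placeholder predicate.
SPARQL_TEMPLATE = Template(
    "SELECT ?target WHERE { "
    '?compound rdfs:label "$label" . '
    "?compound ex:hasTarget ?target . }"
)


def rest_to_sparql(params):
    """Fill the SPARQL template from the parameters of a REST request."""
    label = params["label"]
    # Reject characters that could break out of the quoted literal.
    if '"' in label or "\\" in label:
        raise ValueError("unsupported character in label")
    return SPARQL_TEMPLATE.substitute(label=label)


# Example request parameters (illustrative input).
query = rest_to_sparql({"label": "Aspirin"})
```

The caller works only with request parameters, while the SPARQL stays hidden behind the API layer.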
Showcase SC7: Secure Societies
◎ Detect changes in land cover in satellite images (e.g.,
monitoring critical infrastructures)
◎ Display geo-located events in news sites and social
media (e.g., news articles, social networks)
◎ Three workflows:
o Change detection workflow
o Event detection workflow
o Activation workflow
◎ Technologies used: Apache Spark, Cassandra,
Sextant, Semagrow, Strabon, GeoTriples
Showcase 2 (SC7): Secure Societies
General Architecture of the SC7 Pilot
Showcase 2 (SC7): Secure Societies
Change detection workflow
[Figure labels: area and time interval of interest; Satellite Images; Compare Images]
Showcase 2 (SC7): Secure Societies
Event detection workflow
[Figure labels: Associate names with coordinates; Cluster news into events (associate geo-location)]
Showcase 2 (SC7): Secure Societies
Activation detection workflow
[Figure labels: Areas with changes; Summary of events; Spatiotemporal RDF store]
Showcase 2 (SC7): Secure Societies
[Figure: refugee camps located in Zaatari, Jordan. Labels: News; Tweets; Selected Area; Detected changes]
Thanks & Questions?
For more info...
◎ Project-related: Simon Scerri (scerri@cs.uni-bonn.de)
◎ Ontario: Mohamed Nadjib Mami (mami@cs.uni-bonn.de)
◎ SANSA: Jens Lehmann (jens.lehmann@cs.uni-bonn.de)
◎ Semagrow: Stasinos Konstantopoulos (konstant@iit.demokritos.gr)
◎ Pilots (showcases):
o SC1: Ronald Siebes (rm.siebes@few.vu.nl)
o SC7: George Papadakis (gpapadis@di.uoa.gr)
o All: Ronald Siebes (rm.siebes@few.vu.nl)
◎ GitHub repos: https://github.com/big-data-europe/README
◎ Website: https://big-data-europe.eu
BDE Platform vs. Hadoop Distributions
SFR = Single failure recovery
MFR = Multiple failure recovery
SF = Self healing

 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbuapidays
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 

Último (20)

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 

ICWE2017 BigDataEurope

  • 1. BigDataEurope - Supporting the Variety Dimension of Big Data. Mohamed Nadjib MAMI - Fraunhofer IAIS. ICWE17 - 06.06.2017
  • 2. Big Data Europe - the Project ◎ EU Horizon 2020-programme-funded ◎ Coordination & Support Action (CSA) Project o Show societal value of Big Data to 7 Domains o Lower barrier for using Big Data technologies => BigDataEurope Platform
  • 3. Consortium Partners ◎ Consortium of 17 Partners o Industry, SMEs, universities, research institutes, etc.
  • 4. BDE Europe - The Platform ◎ Integrator of Big Data technologies o Easy to use/get started (plug-and-play) o Flexible, Customisable ◎ Bundles only Open Source solutions o Data Storage o Message Passing o Data Processing o Data Searching & Publishing ◎ Publicly released in May 2017
  • 5. BDE Platform - Components (some) ◎ Search/Indexing: Apache Solr, Elasticsearch ◎ Data Processing: Apache Spark, Apache Flink ◎ Data Acquisition: Apache Flume ◎ Message Passing: Apache Kafka ◎ Data Storage: Apache Hadoop, Apache Cassandra, Apache Hive, PostGIS ◎ Semantic Components: Strabon, Sextant, GeoTriples, Silk, SEMAGROW, LIMES, 4Store, OpenLink Virtuoso
  • 6. BDE Platform - Architecture ◎ Support Layer: Init Daemon, GUIs, Base Setup ◎ App Layer: Traffic Forecast, Satellite Image Analysis, Real-time Stream Monitoring, ... ◎ Platform Layer: Spark, Flink, Kafka, Semantic Layer (Ontario, SANSA, Semagrow), ... ◎ Resource Management Layer (Swarm) ◎ Hardware Layer: Premises, Cloud (AWS, GCE, MS Azure, …) ◎ Data Layer: Hadoop, NOSQL Store, Cassandra, Elasticsearch, RDF Store, ... ◎ Semantic Data Lake (Unified View)
  • 7. BDE Platform - Hardware & Virtualization ◎ Docker used for packaging and deploying applications ◎ Based on containers: o A lightweight environment that makes a piece of software run in isolation ❖ Shares the host operating system kernel (unlike VMs) ❖ Reduces conflicts, e.g., between versions ◎ Docker Compose: creates multi-container applications
  • 8. BDE Platform - Resource Management ◎ Swarm (mode) used for managing, scheduling and orchestrating Docker containers in multi-node clusters ◎ It provides: o Scalability and Fault Tolerance o Container interlinking o Log-based monitoring ◎ Separates hardware management from software management ◎ Based on Services o A Service is the Swarm execution unit, running a Docker Image
  • 9. BDE Platform - Support Layer ◎ Init Daemon: orchestrates the initialization process of the components (containers of Docker Compose): o Components report their initialization progress o It validates whether a specific component can start o It specifies the dependencies between services o It indicates where human interaction is required ◎ Examples: o Wait for data to load into HDFS before starting a Spark job o Wait for the Spark Master to start successfully before starting a Worker
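The Init Daemon behaviour described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the BDE implementation: the class `InitDaemon`, its methods, and the component names in `DEPENDENCIES` are all hypothetical, and the status values are borrowed from the Pipeline Monitor slide (not started, running, finished).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each component maps to the components it waits for.
DEPENDENCIES = {
    "hdfs-data-load": set(),
    "spark-master": set(),
    "spark-worker": {"spark-master"},
    "spark-job": {"hdfs-data-load", "spark-worker"},
}

VALID_STATUSES = ("not started", "running", "finished")


class InitDaemon:
    """Tracks reported progress and decides whether a component may start."""

    def __init__(self, dependencies):
        self.dependencies = dependencies
        self.status = {name: "not started" for name in dependencies}

    def report(self, component, status):
        """Record the initialization progress reported by a component."""
        if status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {status}")
        self.status[component] = status

    def can_start(self, component):
        """A component may start once all of its dependencies have finished."""
        return all(
            self.status[dep] == "finished"
            for dep in self.dependencies[component]
        )

    def start_order(self):
        """One valid start order in which dependencies always come first."""
        return list(TopologicalSorter(self.dependencies).static_order())
```

With this sketch, `can_start("spark-worker")` stays false until `report("spark-master", "finished")` has been called, which mirrors the "wait for the Spark Master before starting a Worker" example.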
  • 10. BDE Platform - User Interfaces ◎ Pipeline Builder: creates a step-by-step dependency pipeline (fed to the init daemon), e.g., Component 1 > Component 2 > Component 3
  • 11. BDE Platform - User Interfaces ◎ Pipeline Monitor: displays the status (not started, running or finished) of components in a running pipeline (retrieved from the init daemon), e.g., Component 1: finished, Component 2: finished, Component 3: in progress
  • 12. BDE Platform - User Interfaces ◎ Swarm UI: clones a Git repository containing a pipeline, then deploys, controls and monitors it on Swarm
  • 13. BDE Platform - User Interfaces ◎ Integrator UI: displays the dashboard of each running component in a unified interface
  • 14. BDE Platform - Semantic Layer > Ontario ◎ Data Lake or Swamp? o Repository of data in its original formats o Structured, semi-structured, unstructured o Without unified schema ◎ Semantic Data Lake (Ontario) o Adds a Semantic Layer on top of the source datasets ❖ The data is semantically lifted using ontology terms ❖ Provides a uniform view over nonuniform data
  • 15. BDE Platform - Semantic Layer > Ontario. Query “number of distinct publications and number of distinct deaths due to the disease Tuberculosis in India”:
      SELECT count(distinct(?publication)) AS ?no_of_publications count(?deaths) AS ?no_of_deaths
      WHERE {
        ?item a qb:Observation .
        ?item gho:Country ?country .
        ?item gho:Disease ?disease .
        ?item att:unitMeasure gho:Measure .
        ?item eg:incidence ?deaths .
        ?country rdfs:label "India" .
        ?disease rdfs:label "Tuberculosis" .
        ?trial a ct:trials .
        ?trial ct:condition ?condition .
        ?trial ct:location ?location .
        ?trial ct:reference ?publication .
        ?condition owl:sameAs ?disease .
        ?location redd:locatedIn ?country .
        ?publication ct:citation ?citation .
      }
    The same triple patterns, regrouped by subject:
      ?item a qb:Observation . ?item gho:Country ?country . ?item gho:Disease ?disease . ?item att:unitMeasure gho:Measure . ?item eg:incidence ?deaths .
      ?trial a ct:trials . ?trial ct:condition ?condition . ?trial ct:location ?location . ?trial ct:reference ?publication .
      ?condition owl:sameAs ?disease .
      ?disease rdfs:label "Tuberculosis" .
      ?country rdfs:label "India" .
      ?location redd:locatedIn ?country .
      ?publication ct:citation ?citation .
  • 16. BDE Platform - Semantic Layer > Ontario ◎ Query processing steps: 1. Query Parsing & Validation 2. Planning 3. Meta-wrapper invocation ◎ Meta-wrappers: Publications, Trials, Conditions, Observations
  • 17. BDE Platform - Semantic Layer > Ontario ◎ Meta-wrappers (Publications, Observations, Trials, Conditions) delegate to format-specific wrappers: Wrapper (XML), Wrapper (CSV), Wrapper (RDF) ◎ 4. Wrapper Selection & Query Translation o Mapping rules translate triple patterns into the source's query language, e.g., "?item gho:Country ?country . ?item gho:Disease ?disease . ..." becomes "SELECT country, disease, ... FROM Observations"; other wrappers produce ... [Xpath] ... or ... [Sparql] ... ◎ 5. Query Execution
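The query translation step on this slide can be illustrated with a minimal Python sketch. It is not Ontario's code: the function `translate_star`, the dictionary layout of `OBSERVATION_MAPPING`, and the column name `incidence` are assumptions made for illustration. Only the example input patterns and the shape of the resulting SELECT statement come from the slide.

```python
# Hypothetical mapping rules for one tabular source: predicate -> column name.
OBSERVATION_MAPPING = {
    "table": "Observations",
    "columns": {
        "gho:Country": "country",
        "gho:Disease": "disease",
        "eg:incidence": "incidence",  # assumed column name
    },
}


def translate_star(patterns, mapping):
    """Translate triple patterns that share one subject into a SQL SELECT.

    `patterns` is a list of (subject, predicate, object) strings. Each
    predicate is looked up in the mapping rules to find its column.
    """
    subjects = {subject for subject, _, _ in patterns}
    if len(subjects) != 1:
        raise ValueError("expected triple patterns sharing a single subject")

    columns = []
    for _, predicate, _ in patterns:
        if predicate not in mapping["columns"]:
            raise KeyError(f"no mapping rule for predicate {predicate}")
        columns.append(mapping["columns"][predicate])

    return f"SELECT {', '.join(columns)} FROM {mapping['table']}"


patterns = [
    ("?item", "gho:Country", "?country"),
    ("?item", "gho:Disease", "?disease"),
]
print(translate_star(patterns, OBSERVATION_MAPPING))
```

A wrapper for XML or RDF sources would follow the same idea with a different target language (XPath or SPARQL); the sketch covers the tabular case only.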
  • 18. BDE Platform - Semantic Layer > SANSA ◎ SANSA: a framework for distributed RDF data processing (http://sansa-stack.net) ◎ Read/write Layer: reads and writes native RDF/OWL data in distributed storage, e.g., Hadoop, Spark (RDD, DataFrames, GraphX), Tensors, following different representations and partitioning schemes, e.g., graphs, tables ◎ Querying Layer: queries distributed RDF using SPARQL (SPARQL-to-SQL approaches, Virtual Views, Intelligent Indexing, ...)
  • 19. BDE Platform - Semantic Layer > SANSA ◎ Inference Layer: derives new facts from existing ones, detects inconsistencies, extracts new rules to help in reasoning ◎ Machine Learning Layer: performs ML or analytics on RDF data to gain insights into relevant trends, predictions or anomaly detection o Tensor Factorization, e.g., for KB completion (testing stage) o Graph Clustering (testing stage) o Association rule mining (evaluation stage) o Semantic Decision trees (idea stage) o Inference in Knowledge Graph Embeddings (idea stage)
  • 20. BDE Platform - Semantic Layer > Semagrow ◎ Semagrow: a SPARQL query processing system that federates multiple remote endpoints ◎ Original Semagrow o Optimizes queries transparently o Executes sub-queries in the remote endpoints o Integrates results dynamically in heterogeneous data models o Joins the partial results into the final query answer ◎ Next-gen Semagrow o Supports different query languages o Query planner and execution engine adapted accordingly, e.g., to translate SPARQL to CQL for Cassandra databases
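The last step listed for the original Semagrow, joining partial results into the final answer, can be sketched as follows. This is a generic hash join over variable bindings, chosen here for illustration; the slides do not say which join strategy Semagrow uses. The function name `join_bindings` and the sample bindings are made up.

```python
def join_bindings(left, right):
    """Hash-join two lists of variable bindings on their shared variables.

    Each binding is a dict mapping variable names to values. All bindings
    within one list are assumed to bind the same set of variables.
    """
    if not left or not right:
        return []

    shared = sorted(set(left[0]) & set(right[0]))

    # Build a hash index over the left-hand results, keyed by shared values.
    index = {}
    for row in left:
        key = tuple(row[var] for var in shared)
        index.setdefault(key, []).append(row)

    # Probe the index with each right-hand result and merge matching rows.
    joined = []
    for row in right:
        key = tuple(row[var] for var in shared)
        for match in index.get(key, []):
            joined.append({**match, **row})
    return joined


# Made-up partial results, as if returned by two remote endpoints.
observations = [
    {"disease": "Tuberculosis", "deaths": "obs-1"},
    {"disease": "Malaria", "deaths": "obs-2"},
]
trials = [{"disease": "Tuberculosis", "trial": "trial-1"}]

print(join_bindings(observations, trials))
```

Only the rows that agree on the shared variable `disease` survive the join, which is how sub-query results from separate endpoints combine into one answer.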
  • 21. BDE Showcases (pilots) ◎ 7 Societal Challenges > 7 pilot implementations o SC1 - Open PHACTS discovery platform relating to biological/medical questions o SC2 - Discovery and Linking of Viticulture-relevant information o SC3 - System monitoring in energy production units o SC4 - Short-term traffic flow forecasting o SC5 - Supporting data-intensive climate research o SC6 - Citizens & Researchers Budget on Municipal Level o SC7 - Ingestion of remote sensing images and social sensing data to detect and verify changes on the Earth's surface for security applications
  • 22. Showcase SC1: Health, demographic change and wellbeing ◎ SC1 implements the Open PHACTS Discovery Platform o Integrates and links data from multiple sources: ChEBI, ChEMBL, the Gene Ontology and UniProt (Chemistry, Biological, Medical, etc.) o Explores the relationships between data (compounds, targets, pathways, diseases and tissues) o Data accessed using RESTful API requests ❖ Translated to SPARQL queries ◎ Technologies used: o 4Store, Memcached, MySQL, Puelia, Swagger
  • 23. Showcase SC7: Secure Societies ◎ Detect changes in land cover in satellite images (e.g., monitoring critical infrastructures) ◎ Display geo-located events in news sites and social media (e.g., news articles, social networks) ◎ Three workflows: o Change detection workflow o Event detection workflow o Activation workflow ◎ Technologies used: Apache Spark, Cassandra, Sextant, Semagrow, Strabon, GeoTriples
  • 24. Showcase 2 (SC7): Secure Societies ◎ General Architecture of the SC7 Pilot
  • 25. Showcase 2 (SC7): Secure Societies ◎ Change detection workflow: area and time interval of interest > Satellite Images > Compare Images
  • 26. Showcase 2 (SC7): Secure Societies ◎ Event detection workflow o Associate names with coordinates o Cluster news into events (associate geo-location)
  • 27. Showcase 2 (SC7): Secure Societies ◎ Activation detection workflow o Areas with changes o Summary of events o Spatiotemporal RDF store
  • 28. Showcase 2 (SC7): Secure Societies ◎ Example: refugee camps located in Zaatari, Jordan ◎ Panels: News, Tweets, Selected Area, Detected changes
  • 29. Thanks & Questions? For more info... ◎ Project-related: Simon Scerri (scerri@cs.uni-bonn.de) ◎ Ontario: Mohamed Nadjib Mami (mami@cs.uni-bonn.de) ◎ SANSA: Jens Lehmann (jens.lehmann@cs.uni-bonn.de) ◎ Semagrow: Stasinos Konstantopoulos (konstant@iit.demokritos.gr) ◎ Pilots (showcases): o SC1: Ronald Siebes (rm.siebes@few.vu.nl) o SC7: George Papadakis (gpapadis@di.uoa.gr) o All: Ronald Siebes (rm.siebes@few.vu.nl) ◎ Github repos: https://github.com/big-data-europe/README ◎ Website: https://big-data-europe.eu
  • 30. BDE Platform vs. Hadoop Distributions ◎ Legend: SFR = Single failure recovery, MFR = Multiple failure recovery, SF = Self healing