Dockerizing a multi-component Open Data app
Athens Docker Meetup, June 2016
Dimitris Negkas, Stergios Tsiafoulis
dimneg@gmail.com, s.tsiafoulis@gmail.com
Description and Scope
LinkedEconomy (http://linkedeconomy.org/)
• is a publicly available web platform and linked data repository.
• Its scope is to transform, curate, aggregate, interlink and publish economic data in machine-readable format, to enable:
  - citizen awareness
  - research with unprecedented data
  - evidence-based policy
Data Sources
• Sources currently used:
  - Transparency – DIAVGEIA
  - Central Electronic Registry of Public Procurement - E-Procurement
  - National Strategic Reference Framework (NSRF)
  - Central Market of Thessaloniki (CMT)
  - e-Prices
  - Fuel Prices
  - Municipality of Athens, Municipality of Thessaloniki
  - Government of Australia
Data growth
• We use OpenLink Virtuoso to store nearly 1 billion triples from 15 different sources.
• We host 27 datasets in CKAN from 15 organizations.
• The data volume increases every month.
Data processing
• Each data source is handled and processed separately, because the available data are not provided uniformly or in a machine-readable format.
• Diavgeia, NSRF and the observatories for product and fuel prices provide a rich API that can be easily queried to obtain machine-readable data in JSON format.
• For E-Procurement, CMT and the Municipalities of Athens and Thessaloniki, no API is available. We have therefore developed a software module that gathers online information automatically and stores it in a machine-readable format.
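The two harvesting paths can be sketched roughly as below. This is only an illustration, not the LinkedEconomy harvester: the payload shape, the field names (`id`, `amount`, `issueDate`) and the semicolon-separated layout of the scraped text are all assumptions made for the example.

```python
import csv
import io
import json


def records_from_api(payload: str) -> list[dict]:
    """Normalize a JSON API response (hypothetical shape) into flat records."""
    data = json.loads(payload)
    return [
        {"id": str(item["id"]), "amount": float(item["amount"]), "date": item["issueDate"]}
        for item in data["decisions"]
    ]


def records_from_scrape(text: str) -> list[dict]:
    """Normalize semicolon-separated text scraped from a page (hypothetical layout)."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [
        # Scraped amounts are assumed to use a decimal comma.
        {"id": row["id"], "amount": float(row["amount"].replace(",", ".")), "date": row["date"]}
        for row in reader
    ]


# Made-up sample inputs, one per harvesting path.
api_payload = '{"decisions": [{"id": 1, "amount": "1200.50", "issueDate": "2016-05-30"}]}'
scraped_text = "id;amount;date\nA-7;99,90;2016-05-31\n"

records = records_from_api(api_payload) + records_from_scrape(scraped_text)
print(json.dumps(records, indent=2))
```

Both paths end in the same record structure, so later steps do not need to know whether a source offered an API.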
General Architecture
• Process model
  - Open economic data related to public budgeting, spending and prices are characterized by high volume, velocity, variety and veracity.
  - We have to build custom components under the common logic of transforming static data into linked open data streams.
Process model: Nucleus
• The nucleus of our approach is semantic modelling, data enrichment and interconnections.
• Data are stored in raw form (as harvested from the sources), in RDF and in JSON formats.
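As a rough sketch of the three storage forms, the snippet below keeps a raw record, parses it to JSON, and serializes it as N-Triples. The `http://example.org/` namespace, the `payment` path and the record fields are hypothetical and are not the project's real vocabulary. Every value is written as a plain string literal, which is a simplification.

```python
import json

BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def to_ntriples(record: dict) -> str:
    """Serialize one flat record as N-Triples, one triple per non-id field."""
    subject = f"<{BASE}payment/{record['id']}>"
    lines = []
    for key, value in record.items():
        if key == "id":
            continue
        # json.dumps gives a double-quoted, escaped string, reused here as a plain literal.
        literal = json.dumps(str(value), ensure_ascii=False)
        lines.append(f"{subject} <{BASE}{key}> {literal} .")
    return "\n".join(lines)


raw = '{"id": "42", "payer": "Municipality of Athens", "amount": 1500.0}'  # as harvested
record = json.loads(raw)          # JSON form
ntriples = to_ntriples(record)    # RDF form (N-Triples)
print(ntriples)
```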
Process model: Data distribution
• Enriched data are distributed through five channels:
  1. Data dumps (CKAN)
  2. SPARQL queries
  3. Web
  4. Social media
  5. Structured inputs to Business Intelligence (BI) systems
• Additionally, data can be further analysed and exchanged with relevant platforms (e.g. SPARQL to R).
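For the SPARQL channel, a client passes the query text in the `query` parameter of an HTTP request, as the SPARQL protocol defines. The sketch below only builds such a request URL and sends nothing. The endpoint address is an assumption (a local Virtuoso on port 8890 with its usual `/sparql` path), and the `format` parameter is a Virtuoso convenience that other endpoints may ignore.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sparql_request_url(endpoint: str, query: str) -> str:
    """Build a GET URL carrying a SPARQL query and a requested result format."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return f"{endpoint}?{urlencode(params)}"


ENDPOINT = "http://localhost:8890/sparql"  # assumed local endpoint, not the public one
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

url = sparql_request_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again shows that the query survives the encoding unchanged.
print(parse_qs(urlsplit(url).query)["query"][0])
```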
Process model: Validation and messenger
• The validation component runs throughout the whole process to safeguard high data quality by detecting errors.
• The messaging component works as an internal messaging and alert system for all components.
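A minimal sketch of how a validation step and an internal messenger could cooperate is shown below. The rules (a non-empty `id`, a non-negative numeric `amount`) and the `Messenger` class are invented for the example. The slides do not describe the actual checks or the alert transport.

```python
from dataclasses import dataclass, field


@dataclass
class Messenger:
    """Hypothetical in-memory alert collector standing in for the messaging component."""
    alerts: list = field(default_factory=list)

    def alert(self, component: str, message: str) -> None:
        self.alerts.append(f"[{component}] {message}")


def validate(record: dict) -> list[str]:
    """Return a list of problems found in one record (empty if the record is fine)."""
    errors = []
    if not record.get("id"):
        errors.append("missing id")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        errors.append(f"invalid amount: {amount!r}")
    return errors


messenger = Messenger()
records = [{"id": "42", "amount": 1500.0}, {"id": "", "amount": -3}]

valid = []
for index, record in enumerate(records):
    problems = validate(record)
    if problems:
        for problem in problems:
            messenger.alert("harvester", f"record {index}: {problem}")
    else:
        valid.append(record)

print(valid)
print(messenger.alerts)
```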
Process flow
Infrastructure
Host | Functionalities / Components | Services / Data sources
VM1 | linkedeconomy.org | apache, php, mysql, drupal
VM2 | SPARQL endpoint, demo site | OLV, apache, php, mysql, drupal
VM3 | Harvester | CouchDB, Lucene, apache, mysql / CKAN (Greek Datasets)
VM4 | Harvester, Messenger | mysql, LinkedEconomy dropbox
VM5 | Storage - Secondary triplestore | CouchDB, OLV, CouchDB-Lucene, docker
VM6 | Harvester | apache, php, mysql, drupal / CKAN (Foreign Datasets)
VM7 | SPARQL endpoint | OLV (Foreign graphs)
VM8 | Management | JIRA, mysql, tomcat
VM9 | Dashboard | front-end, CMS, INSPINIA
VM10 | System administration | VPN, firewalls, etc.
Physical | Storage - Core triplestore | OLV (Greek graphs)
As core infrastructure we use ~okeanos, which is an established cloud-based
service provided for the Greek research and academic community.
LinkedEconomy
CKAN
“Hottest” Prices per municipality
Supermarkets Geoinformation
Application System
[Diagram components: Small Applications (Java, PHP and UNIX scripts), Di@vgeia, KHMDHS, ePrices, fuelPrices, Virtuoso, CouchDB, Drupal, MySQL, CKAN, QGIS]
Dockerize the System
[Diagram components: Di@vgeia, KHMDHS, ePrices, Virtuoso, Drupal, MySQL, QGIS Desktop, CouchDB, QGIS Server, Small Applications, CKAN]
With Compose 2
Docker MySQL
version: '2'
services:
  mysql:
    # Will build the image from your directory
    build: ./mysql-docker/5.6
    container_name: eLodDrupalmySQL
    volumes:
      # Save your data!
      - /mysql_drupal:/var/lib/mysql
    environment:
      - MYSQL_DATABASE=drupalelod
      - MYSQL_ROOT_PASSWORD=eLodmysqlpass
    # Do not use the "always" flag in your development environment!
    restart: on-failure
Docker Drupal
  drupal:
    build: ./docker-drupal
    command:
      - /start.sh
    # Will start this service only after the mysql service has been started
    depends_on:
      - mysql
    container_name: eLodDrupal
    #image: eLodDrupal
    ports:
      - "8081:80"
    volumes:
      - "/data_drupal:/var/www/html"
    # Will link the container with the MySQL container
    links:
      - "mysql"
    environment:
      - MYSQL_DATABASE=drupalelod
      - MYSQL_USER=root
      - MYSQL_PASSWORD=eLodmysqlpass
      - DRUPAL_ADMIN_PW=eLODDR
      - DRUPAL_ADMIN=admin
      - MYSQL_HOST=eLodDrupalmySQL
      - DRUPAL_ADMIN_EMAIL=stetsiafoulis@gmail.com
    restart: on-failure
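Note that `depends_on` only controls start order. It does not wait until MySQL actually accepts connections. One common workaround, sketched here as an assumption rather than as part of the eLOD repository, is to have a start script poll the database port before launching the application. The function name `wait_for_port` is hypothetical.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Poll host:port until a TCP connection succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the server is listening; close it right away.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers refused connections, timeouts and unresolved host names.
            time.sleep(interval)
    return False
```

Inside the Drupal container, a call such as `wait_for_port("eLodDrupalmySQL", 3306)` would block until the MySQL container listens on its default port, or return `False` after 30 seconds.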
Docker Virtuoso
  virtuoso:
    build: ./docker-virtuoso
    container_name: eLodVirtuoso
    ports:
      - "8890:8890"
    volumes:
      - /virtuoso/db:/var/lib/virtuoso/db
    environment:
      - DBA_PASSWORD=eLodVir
      - SPARQL_UPDATE=true
      - DEFAULT_GRAPH=http://localhost:8890/DAV
    restart: on-failure
Docker QGIS
  qgisdesktop:
    #image: kartoza/qgis-desktop:2.14
    build: ./qgis-desktop/2.14
    hostname: qgis-server
    volumes:
      # Wherever you want to mount your data from
      - ./gis:/gis
      # Unix socket for X11
      - "/tmp/.X11-unix:/tmp/.X11-unix"
    links:
      - db:db
    environment:
      - DISPLAY=unix:1
    command: /usr/bin/qgis
Build the system
• Clone the repository from GitHub: https://github.com/stetsiafoulis/eLOD
• Create the host directories that your data volumes will be mounted from.
• Run docker-compose up -d, and that's it!
Why Docker?
o Portable
o Lightweight
o Move to different cloud infrastructures and to physical servers
o Run on virtual machines for development and testing
o Easily scale
o Easy delivery and deployment
o Run anywhere (regardless of host distro, physical or cloud)
o Run anything
What’s Next ??
Scaling per Source
[Diagram: sources Di@vgeia and KHMDHS, with the component set (Virtuoso, Drupal, MySQL, CouchDB, QGIS Server, QGIS Desktop, Small Applications) shown twice]
Run Small Apps through the Docker API
[Diagram: Small Applications]
Next Steps - Swarm
[Diagram components: Virtuoso, Drupal, MySQL, CouchDB, QGIS Server]
Cluster management
Scaling
State reconciliation
Multi-host networking
Service discovery
Load balancing
Next Steps - Consul
Health Checking
Service Discovery
Multi Datacenter support
Any Questions ??
Appendix - Data Sources links
• LinkedEconomy (http://linkedeconomy.org/)
• linkedeconomy@gmail.com
• Sources currently used:
  - Transparency - DIAVGEIA: https://diavgeia.gov.gr
  - Central Electronic Registry of Public Procurement - E-Procurement (KHMDHS): http://www.eprocurement.gov.gr
  - National Strategic Reference Framework (NSRF): https://www.espa.gr/en
  - Central Market of Thessaloniki (CMT): http://www.kath.gr/
  - e-Prices: http://www.e-prices.gr/
  - Fuel Prices: http://www.fuelprices.gr/
  - Municipality of Athens: https://www.cityofathens.gr/khe/proypologismos
  - Municipality of Thessaloniki: http://www.thessaloniki.gr/portal/page/portal/DioikitikesYpiresies/GenDnsiDioikOikonYpiresion/DnsiDiafanEksipirDimoton/TmimaDiafaneias/AnoiktiDdiathesiDedomenon/DimosiefsiEktelesisProipologismou/ektelesi-proypologismou
  - Government of Australia: http://data.gov.au/

Mais conteúdo relacionado

Mais procurados

Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22
Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22
Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22Ajeet Singh Raina
 
Orchestrating Least Privilege by Diogo Monica
Orchestrating Least Privilege by Diogo Monica Orchestrating Least Privilege by Diogo Monica
Orchestrating Least Privilege by Diogo Monica Docker, Inc.
 
Docker 1.5
Docker 1.5Docker 1.5
Docker 1.5rajdeep
 
Docker Networking Tip - Load balancing options
Docker Networking Tip - Load balancing optionsDocker Networking Tip - Load balancing options
Docker Networking Tip - Load balancing optionsSreenivas Makam
 
DCUS17 : Docker networking deep dive
DCUS17 : Docker networking deep diveDCUS17 : Docker networking deep dive
DCUS17 : Docker networking deep diveMadhu Venugopal
 
Cloning Running Servers with Docker and CRIU by Ross Boucher
Cloning Running Servers with Docker and CRIU by Ross BoucherCloning Running Servers with Docker and CRIU by Ross Boucher
Cloning Running Servers with Docker and CRIU by Ross BoucherDocker, Inc.
 
Docker Networking Overview
Docker Networking OverviewDocker Networking Overview
Docker Networking OverviewSreenivas Makam
 
Docker summit : Docker Networking Control-plane & Data-Plane
Docker summit : Docker Networking Control-plane & Data-PlaneDocker summit : Docker Networking Control-plane & Data-Plane
Docker summit : Docker Networking Control-plane & Data-PlaneMadhu Venugopal
 
Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)DoiT International
 
Consul and docker swarm cluster
Consul and docker swarm clusterConsul and docker swarm cluster
Consul and docker swarm clusterEueung Mulyana
 
Driving containerd operations with gRPC
Driving containerd operations with gRPCDriving containerd operations with gRPC
Driving containerd operations with gRPCDocker, Inc.
 
Docker network Present in VietNam DockerDay 2015
Docker network Present in VietNam DockerDay 2015Docker network Present in VietNam DockerDay 2015
Docker network Present in VietNam DockerDay 2015Van Phuc
 
Kubernetes Networking
Kubernetes NetworkingKubernetes Networking
Kubernetes NetworkingCJ Cullen
 
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...KubeAcademy
 
Docker for PHP Developers - ZendCon 2016
Docker for PHP Developers - ZendCon 2016Docker for PHP Developers - ZendCon 2016
Docker for PHP Developers - ZendCon 2016Chris Tankersley
 
Docker Multi Host Networking, Rachit Arora, IBM
Docker Multi Host Networking, Rachit Arora, IBMDocker Multi Host Networking, Rachit Arora, IBM
Docker Multi Host Networking, Rachit Arora, IBMNeependra Khare
 
Load Balancing 101
Load Balancing 101Load Balancing 101
Load Balancing 101HungWei Chiu
 
Load Balancing Applications with NGINX in a CoreOS Cluster
Load Balancing Applications with NGINX in a CoreOS ClusterLoad Balancing Applications with NGINX in a CoreOS Cluster
Load Balancing Applications with NGINX in a CoreOS ClusterKevin Jones
 

Mais procurados (19)

Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22
Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22
Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22
 
Docker Intro
Docker IntroDocker Intro
Docker Intro
 
Orchestrating Least Privilege by Diogo Monica
Orchestrating Least Privilege by Diogo Monica Orchestrating Least Privilege by Diogo Monica
Orchestrating Least Privilege by Diogo Monica
 
Docker 1.5
Docker 1.5Docker 1.5
Docker 1.5
 
Docker Networking Tip - Load balancing options
Docker Networking Tip - Load balancing optionsDocker Networking Tip - Load balancing options
Docker Networking Tip - Load balancing options
 
DCUS17 : Docker networking deep dive
DCUS17 : Docker networking deep diveDCUS17 : Docker networking deep dive
DCUS17 : Docker networking deep dive
 
Cloning Running Servers with Docker and CRIU by Ross Boucher
Cloning Running Servers with Docker and CRIU by Ross BoucherCloning Running Servers with Docker and CRIU by Ross Boucher
Cloning Running Servers with Docker and CRIU by Ross Boucher
 
Docker Networking Overview
Docker Networking OverviewDocker Networking Overview
Docker Networking Overview
 
Docker summit : Docker Networking Control-plane & Data-Plane
Docker summit : Docker Networking Control-plane & Data-PlaneDocker summit : Docker Networking Control-plane & Data-Plane
Docker summit : Docker Networking Control-plane & Data-Plane
 
Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)
 
Consul and docker swarm cluster
Consul and docker swarm clusterConsul and docker swarm cluster
Consul and docker swarm cluster
 
Driving containerd operations with gRPC
Driving containerd operations with gRPCDriving containerd operations with gRPC
Driving containerd operations with gRPC
 
Docker network Present in VietNam DockerDay 2015
Docker network Present in VietNam DockerDay 2015Docker network Present in VietNam DockerDay 2015
Docker network Present in VietNam DockerDay 2015
 
Kubernetes Networking
Kubernetes NetworkingKubernetes Networking
Kubernetes Networking
 
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...
 
Docker for PHP Developers - ZendCon 2016
Docker for PHP Developers - ZendCon 2016Docker for PHP Developers - ZendCon 2016
Docker for PHP Developers - ZendCon 2016
 
Docker Multi Host Networking, Rachit Arora, IBM
Docker Multi Host Networking, Rachit Arora, IBMDocker Multi Host Networking, Rachit Arora, IBM
Docker Multi Host Networking, Rachit Arora, IBM
 
Load Balancing 101
Load Balancing 101Load Balancing 101
Load Balancing 101
 
Load Balancing Applications with NGINX in a CoreOS Cluster
Load Balancing Applications with NGINX in a CoreOS ClusterLoad Balancing Applications with NGINX in a CoreOS Cluster
Load Balancing Applications with NGINX in a CoreOS Cluster
 

Semelhante a Dockerizing a multi-component Open Data app

OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware
 
OCCIware@OW2con 2016
OCCIware@OW2con 2016OCCIware@OW2con 2016
OCCIware@OW2con 2016Marc Dutoo
 
OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OW2
 
(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...
(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...
(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...Amazon Web Services
 
Dataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayDataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayJosef Adersberger
 
Big Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioBig Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioCHAKER ALLAOUI
 
CargoChain Brochure - Technology
CargoChain Brochure - TechnologyCargoChain Brochure - Technology
CargoChain Brochure - TechnologyCargoChain
 
Getting insights from IoT data with Apache Spark and Apache Bahir
Getting insights from IoT data with Apache Spark and Apache BahirGetting insights from IoT data with Apache Spark and Apache Bahir
Getting insights from IoT data with Apache Spark and Apache BahirLuciano Resende
 
Technology Overview
Technology OverviewTechnology Overview
Technology OverviewLiran Zelkha
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dataconomy Media
 
Data integration
Data integrationData integration
Data integrationBallerina
 
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016StampedeCon
 
OSDC 2019 | Democratizing Data at Go-JEK by Maulik Soneji
OSDC 2019 | Democratizing Data at Go-JEK by Maulik SonejiOSDC 2019 | Democratizing Data at Go-JEK by Maulik Soneji
OSDC 2019 | Democratizing Data at Go-JEK by Maulik SonejiNETWAYS
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...confluent
 
Hadoop Integration with Microstrategy
Hadoop Integration with Microstrategy Hadoop Integration with Microstrategy
Hadoop Integration with Microstrategy snehal parikh
 

Semelhante a Dockerizing a multi-component Open Data app (20)

Linked Data and Semantic Web Application Development by Peter Haase
Linked Data and Semantic Web Application Development by Peter HaaseLinked Data and Semantic Web Application Development by Peter Haase
Linked Data and Semantic Web Application Development by Peter Haase
 
OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...
 
OCCIware@OW2con 2016
OCCIware@OW2con 2016OCCIware@OW2con 2016
OCCIware@OW2con 2016
 
OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...
 
(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...
(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...
(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...
 
Dataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayDataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice Way
 
Sdmx9 webservices
Sdmx9 webservicesSdmx9 webservices
Sdmx9 webservices
 
Big Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioBig Data to SMART Data : Process Scenario
Big Data to SMART Data : Process Scenario
 
Jacob Keecheril
Jacob KeecherilJacob Keecheril
Jacob Keecheril
 
CargoChain Brochure - Technology
CargoChain Brochure - TechnologyCargoChain Brochure - Technology
CargoChain Brochure - Technology
 
Getting insights from IoT data with Apache Spark and Apache Bahir
Getting insights from IoT data with Apache Spark and Apache BahirGetting insights from IoT data with Apache Spark and Apache Bahir
Getting insights from IoT data with Apache Spark and Apache Bahir
 
Technology Overview
Technology OverviewTechnology Overview
Technology Overview
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
 
Data integration
Data integrationData integration
Data integration
 
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
 
OSDC 2019 | Democratizing Data at Go-JEK by Maulik Soneji
OSDC 2019 | Democratizing Data at Go-JEK by Maulik SonejiOSDC 2019 | Democratizing Data at Go-JEK by Maulik Soneji
OSDC 2019 | Democratizing Data at Go-JEK by Maulik Soneji
 
Ss eb29
Ss eb29Ss eb29
Ss eb29
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
 
Intro to web dev
Intro to web devIntro to web dev
Intro to web dev
 
Hadoop Integration with Microstrategy
Hadoop Integration with Microstrategy Hadoop Integration with Microstrategy
Hadoop Integration with Microstrategy
 

Último

2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdfAndrey Devyatkin
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldRoberto Pérez Alcolea
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolsosttopstonverter
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxRTS corp
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...OnePlan Solutions
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxAndreas Kunz
 
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...Bert Jan Schrijver
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shardsChristopher Curtin
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfRTS corp
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfmaor17
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesKrzysztofKkol1
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxRTS corp
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 

Último (20)

2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository world
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration tools
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdf
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 

Dockerizing a multi-component Open Data app

  • 1. Dockerizing a multi-component Open Data app. Athens Docker Meetup, June 2016. Dimitris Negkas, Stergios Tsiafoulis. dimneg@gmail.com, s.tsiafoulis@gmail.com
  • 2. Description and Scope: LinkedEconomy (http://linkedeconomy.org/) is a publicly available web platform and linked data repository. Its scope is to transform, curate, aggregate, interlink and publish economic data in machine-readable format, to enable citizen awareness, research with unprecedented data, and evidence-based policy.
  • 3. Data Sources. Sources currently used: Transparency (DIAVGEIA); Central Electronic Registry of Public Procurement (E-Procurement); National Strategic Reference Framework (NSRF); Central Market of Thessaloniki (CMT); e-Prices; Fuel Prices; Municipality of Athens, Municipality of Thessaloniki; Government of Australia.
  • 4. Data growth: we use OpenLink Virtuoso for 15 different sources totalling nearly 1B triples; we host 27 datasets in CKAN from 15 organizations; the data volume increases accordingly each month.
  • 5. Data processing: Each data source is handled and processed separately, because the available data are neither provided uniformly nor always in machine-readable format. Diavgeia, “NSRF” and the Observatories for product and fuel prices provide a rich API that can easily be queried to obtain machine-readable data in JSON format. For E-Procurement, “CMT” and the “Municipalities of Athens and Thessaloniki” there is no API available, so we have developed a software module that gathers the online information automatically and stores it in a machine-readable format.
  • 6. General Architecture: Process model. Open economic data related to public budgeting, spending and prices are characterized by high volume, velocity, variety and veracity. We have to build custom components under the common logic of transforming static data into linked open data streams.
  • 7. Process model: Nucleus. The nucleus of our approach is semantic modelling, data enrichment and interconnections. Data are stored in raw form (as harvested from the sources) and in RDF and JSON formats.
  • 8. Process model: Data distribution. Enriched data are distributed through five channels: 1. data dumps (CKAN), 2. SPARQL queries, 3. web, 4. social media, 5. structured inputs to Business Intelligence (BI) systems. Additionally, data can be further analysed and exchanged with relevant platforms (e.g. SPARQL to R).
  • 9. Process model: Validation and messenger. The validation component runs throughout the whole process in order to safeguard high data quality by detecting errors. The messaging component works as an internal messaging and alert system for all components.
  • 11. Infrastructure
        Host     | Functionalities / Components    | Services / Data sources
        VM1      | linkedeconomy.org               | apache, php, mysql, drupal
        VM2      | SPARQL endpoint, demo site      | OLV, apache, php, mysql, drupal
        VM3      | Harvester                       | CouchDB, Lucene, apache, mysql / CKAN (Greek Datasets)
        VM4      | Harvester, Messenger            | mysql, LinkedEconomy dropbox
        VM5      | Storage - Secondary triplestore | CouchDB, OLV, CouchDB-Lucene, docker
        VM6      | Harvester                       | apache, php, mysql, drupal / CKAN (Foreign Datasets)
        VM7      | SPARQL endpoint                 | OLV (Foreign graphs)
        VM8      | Management                      | JIRA, mysql, tomcat
        VM9      | Dashboard                       | front-end, CMS, INSPINIA
        VM10     | System administration           | VPN, firewalls, etc.
        Physical | Storage - Core triplestore      | OLV (Greek graphs)
        As core infrastructure we use ~okeanos, an established cloud-based service provided for the Greek research and academic community.
  • 13. CKAN
  • 14. “Hottest” Prices per municipality
  • 16. Application System (diagram labels): Small Applications (Java, PHP and UNIX scripts), Di@vgeia, KHMDHS, Virtuoso, CouchDB, Drupal, MySql, ePrices, CKAN, fuelPrices, QGIS
  • 17. Dockerize the System (diagram labels): Di@vgeia, KHMDHS, ePrices, Virtuoso, Drupal, MySql, QGIS Desktop, CouchDB, QGIS Server, Small Applications, CKAN
  • 19. Docker MySQL
      version: '2'
      services:
        mysql:
          build: ./mysql-docker/5.6          # will build the image from your directory
          container_name: eLodDrupalmySQL
          volumes:
            - /mysql_drupal:/var/lib/mysql   # save your data!
          environment:
            - MYSQL_DATABASE=drupalelod
            - MYSQL_ROOT_PASSWORD=eLodmysqlpass
          restart: on-failure                # do not use "always" in your development environment!
  • 20. Docker Drupal
        drupal:
          build: ./docker-drupal
          command:
            - /start.sh
          depends_on:                        # will start the service only after the MySQL service
            - mysql
          container_name: eLodDrupal
          #image: eLodDrupal
          ports:
            - "8081:80"
          volumes:
            - "/data_drupal:/var/www/html"
          links:                             # will link the container with the MySQL container
            - "mysql"
          environment:
            - MYSQL_DATABASE=drupalelod
            - MYSQL_USER=root
            - MYSQL_PASSWORD=eLodmysqlpass
            - DRUPAL_ADMIN_PW=eLODDR
            - DRUPAL_ADMIN=admin
            - MYSQL_HOST=eLodDrupalmySQL
            - DRUPAL_ADMIN_EMAIL=stetsiafoulis@gmail.com
          restart: on-failure
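Note that `depends_on` only controls the order in which Compose starts the containers; it does not wait until MySQL is actually accepting connections. A start script such as `/start.sh` therefore usually has to poll the database itself. The following is a minimal sketch of such a check, not the authors' script: the function name `wait_for_port` and the timeout values are hypothetical, and port 3306 is assumed because it is MySQL's default.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not resolvable or not listening yet: wait and retry.
            time.sleep(interval)
    return False


# Usage inside the Drupal container (host name from MYSQL_HOST on the slide,
# port 3306 assumed):
#     if not wait_for_port("eLodDrupalmySQL", 3306):
#         raise SystemExit("MySQL did not become reachable in time")
```

A successful TCP connect only shows that the port is open; a stricter check would also attempt a real MySQL login.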
  • 21. Docker Virtuoso
        virtuoso:
          build: ./docker-virtuoso
          container_name: eLodVirtuoso
          ports:
            - "8890:8890"
          volumes:
            - /virtuoso/db:/var/lib/virtuoso/db
          environment:
            - DBA_PASSWORD=eLodVir
            - SPARQL_UPDATE=true
            - DEFAULT_GRAPH=http://localhost:8890/DAV
          restart: on-failure
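Once the Virtuoso container is up, its SPARQL endpoint can be queried over HTTP through the published port. The sketch below only builds the request URL. It assumes the endpoint path `/sparql` (Virtuoso's usual default) and the `format` parameter that Virtuoso accepts for choosing the result type; `default-graph-uri` comes from the SPARQL protocol. The helper name `build_sparql_url` and the sample query are illustrative, not taken from the slides.

```python
from urllib.parse import urlencode


def build_sparql_url(endpoint: str, query: str, default_graph: str = "") -> str:
    """Build a GET URL for a SPARQL query that asks for JSON results."""
    params = {"query": query, "format": "application/sparql-results+json"}
    if default_graph:
        params["default-graph-uri"] = default_graph
    return f"{endpoint}?{urlencode(params)}"


# Assumed endpoint: host port 8890 is published in the Compose service above.
ENDPOINT = "http://localhost:8890/sparql"
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"

url = build_sparql_url(ENDPOINT, QUERY, default_graph="http://localhost:8890/DAV")
print(url)
```

The resulting URL can be fetched with `urllib.request.urlopen(url)` or pasted into a browser while the container is running.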
  • 22. Docker QGIS
        qgisdesktop:
          #image: kartoza/qgis-desktop:2.14
          build: ./qgis-desktop/2.14
          hostname: qgis-server
          volumes:
            # Wherever you want to mount your data from
            - ./gis:/gis
            # Unix socket for X11
            - "/tmp/.X11-unix:/tmp/.X11-unix"
          links:
            - db:db
          environment:
            - DISPLAY=unix:1
          command: /usr/bin/qgis
  • 23. Build the system: clone the repository from GitHub (https://github.com/stetsiafoulis/eLOD); create the directories where you are going to link your data; run docker-compose up -d, and that's it!
  • 24. Why Docker? Portable; lightweight; move to different cloud infrastructures and to physical servers; run on virtual machines for development and testing; easily scale; easy delivery and deployment; run anywhere (regardless of host distro, physical, cloud or not); run anything.
  • 26. Scaling per Source (diagram labels): Di@vgeia and KHMDHS, each with its own set of Virtuoso, Drupal, MySql, QGIS Desktop, CouchDB, QGIS Server and Small Applications
  • 27. Run Small Apps through Docker API Small Applications
  • 28. Next Steps - Swarm (diagram labels: Virtuoso, Drupal, MySql, CouchDB, QGIS Server): cluster management, scaling, state reconciliation, multi-host networking, service discovery, load balancing
  • 29. Next Steps - Consul: health checking, service discovery, multi-datacenter support
  • 31. Appendix - Data Sources links
        LinkedEconomy (http://linkedeconomy.org/), linkedeconomy@gmail.com
        Sources currently used:
        - Transparency - DIAVGEIA: https://diavgeia.gov.gr
        - Central Electronic Registry of Public Procurement - E-Procurement (KHMDHS): http://www.eprocurement.gov.gr
        - National Strategic Reference Framework (NSRF): https://www.espa.gr/en
        - Central Market of Thessaloniki (CMT): http://www.kath.gr/
        - e-Prices: http://www.e-prices.gr/
        - Fuel Prices: http://www.fuelprices.gr/
        - Municipality of Athens: https://www.cityofathens.gr/khe/proypologismos
        - Municipality of Thessaloniki: http://www.thessaloniki.gr/portal/page/portal/DioikitikesYpiresies/GenDnsiDioikOikonYpiresion/DnsiDiafanEksipirDimoton/Tmima Diafaneias/AnoiktiDdiathesiDedomenon/DimosiefsiEktelesisProipologismou/ektelesi-proypologismou
        - Government of Australia: http://data.gov.au/

Editor's Notes

  1. Open economic data related to public budgeting, spending and prices are characterized by high volume, velocity, variety and veracity.
  2. 10 virtual machines with memory and storage capacities that span from 2GB to 8GB RAM and 20GB to 100GB respectively, as well as a non-commodity (physical) server of 12 CPUs, 64GB RAM and a storage capacity of more than 4TB.
  3. This map shows which municipalities are the most expensive for a specific product (e.g. milk, fruit or petrol). The color scale indicates the price of the product in each municipality: the redder, the more expensive.
  4. We also use QGIS to display geoinformation about supermarkets and other POIs on the map.
  5. The system consists of the CKAN data portal, Drupal, Virtuoso, MySQL databases, QGIS Server, CouchDB and many scripts of different technologies and scope. We use this system of apps to process information from different data sources. As mentioned before, the system runs on the cloud-based infrastructure ~okeanos. In some cases we need to move the system to, or back it up on, a different cloud or physical infrastructure. This is where Docker helped us: we achieved that fairly easily and with little effort.
  6. We started to dockerize the services one by one, until we decided to use the new Compose version 2 file format. Compose creates the entire system with a single command: docker-compose up -d. It also creates an internal network and attaches the containers to it automatically.
  7. Restart policies:
       no: Do not automatically restart the container when it exits. This is the default.
       on-failure[:max-retries]: Restart only if the container exits with a non-zero exit status. Optionally, limit the number of restart retries the Docker daemon attempts.
       always: Always restart the container regardless of the exit status. When you specify always, the Docker daemon will try to restart the container indefinitely. The container will also always start on daemon startup, regardless of the current state of the container.
       unless-stopped: Always restart the container regardless of the exit status, but do not start it on daemon startup if the container has been put to a stopped state before.
     An ever increasing delay (double the previous delay, starting at 100 milliseconds) is added before each restart to prevent flooding the server. This means the daemon will wait for 100 ms, then 200 ms, 400, 800, 1600, and so on until either the on-failure limit is hit, or when you docker stop or docker rm -f the container. If a container is successfully restarted (the container is started and runs for at least 10 seconds), the delay is reset to its default value of 100 ms. You can specify the maximum amount of times Docker will try to restart the container when using the on-failure policy. The default is that Docker will try forever to restart the container. The number of (attempted) restarts for a container can be obtained via docker inspect. For example, to get the number of restarts for container “my-container”;
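The doubling delay described in this note can be worked out directly. The snippet below is only an arithmetic illustration of the 100 ms, 200 ms, 400 ms, ... sequence quoted above; the function name `restart_delays` is hypothetical and is not part of Docker.

```python
from typing import List


def restart_delays(attempts: int, initial_ms: int = 100) -> List[int]:
    """Delay before each of the first `attempts` restarts, doubling every time."""
    return [initial_ms * 2 ** i for i in range(attempts)]


delays = restart_delays(5)
print(delays)       # [100, 200, 400, 800, 1600]
print(sum(delays))  # 3100 ms waited in total before the fifth restart
```

With on-failure:5, for example, a container that keeps failing immediately would spend roughly three seconds in back-off before Docker gives up, ignoring the time the container itself runs.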
  8. Swarm features:
       Cluster management integrated with Docker Engine: Use the Docker Engine CLI to create a Swarm of Docker Engines where you can deploy application services. You don’t need additional orchestration software to create or manage a Swarm.
       Decentralized design: Instead of handling differentiation between node roles at deployment time, the Docker Engine handles any specialization at runtime. You can deploy both kinds of nodes, managers and workers, using the Docker Engine. This means you can build an entire Swarm from a single disk image.
       Declarative service model: Docker Engine uses a declarative approach to let you define the desired state of the various services in your application stack. For example, you might describe an application comprised of a web front end service with message queueing services and a database backend.
       Scaling: For each service, you can declare the number of tasks you want to run. When you scale up or down, the swarm manager automatically adapts by adding or removing tasks to maintain the desired state.
       Desired state reconciliation: The swarm manager node constantly monitors the cluster state and reconciles any differences between the actual state and your expressed desired state. For example, if you set up a service to run 10 replicas of a container, and a worker machine hosting two of those replicas crashes, the manager will create two new replicas to replace the ones that crashed. The swarm manager assigns the new replicas to workers that are running and available.
       Multi-host networking: You can specify an overlay network for your services. The swarm manager automatically assigns addresses to the containers on the overlay network when it initializes or updates the application.
       Service discovery: Swarm manager nodes assign each service in the swarm a unique DNS name and load balance running containers. You can query every container running in the swarm through a DNS server embedded in the swarm.
       Load balancing: You can expose the ports for services to an external load balancer. Internally, the swarm lets you specify how to distribute service containers between nodes.
       Secure by default: Each node in the swarm enforces TLS mutual authentication and encryption to secure communications between itself and all other nodes. You have the option to use self-signed root certificates or certificates from a custom root CA.
       Rolling updates: At rollout time you can apply service updates to nodes incrementally. The swarm manager lets you control the delay between service deployment to different sets of nodes. If anything goes wrong, you can roll back a task to a previous version of the service.
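The desired state reconciliation example in this note (10 replicas, a worker holding two of them crashes) can be illustrated with a toy model. This is a sketch of the idea only, not Swarm's scheduler: the `reconcile` function, the worker names and the "least loaded worker first" placement rule are all assumptions made for illustration.

```python
from typing import Dict, List


def reconcile(desired: int, placement: Dict[str, int], healthy: List[str]) -> Dict[str, int]:
    """Return a new worker -> replica-count map that matches the desired total."""
    # Replicas on workers that are no longer healthy are lost.
    new = {worker: placement.get(worker, 0) for worker in healthy}
    missing = desired - sum(new.values())
    # Scale up: place each missing replica on the least loaded healthy worker.
    while missing > 0 and new:
        target = min(new, key=new.get)
        new[target] += 1
        missing -= 1
    # Scale down: remove surplus replicas from the most loaded worker.
    while missing < 0:
        target = max(new, key=new.get)
        new[target] -= 1
        missing += 1
    return new


before = {"worker1": 4, "worker2": 4, "worker3": 2}
# worker3 crashes, taking its two replicas with it.
after = reconcile(10, before, healthy=["worker1", "worker2"])
print(after)  # {'worker1': 5, 'worker2': 5}
```

The two lost replicas are recreated on the remaining workers, so the service is back at 10 replicas without any operator action.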
  9. What is Consul? Consul has multiple components, but as a whole, it is a tool for discovering and configuring services in your infrastructure. It provides several key features:
       Service Discovery: Clients of Consul can provide a service, such as api or mysql, and other clients can use Consul to discover providers of a given service. Using either DNS or HTTP, applications can easily find the services they depend upon.
       Health Checking: Consul clients can provide any number of health checks, either associated with a given service ("is the webserver returning 200 OK"), or with the local node ("is memory utilization below 90%"). This information can be used by an operator to monitor cluster health, and it is used by the service discovery components to route traffic away from unhealthy hosts.
       Key/Value Store: Applications can make use of Consul's hierarchical key/value store for any number of purposes, including dynamic configuration, feature flagging, coordination, leader election, and more. The simple HTTP API makes it easy to use.
       Multi Datacenter: Consul supports multiple datacenters out of the box. This means users of Consul do not have to worry about building additional layers of abstraction to grow to multiple regions.
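How health checks feed into service discovery can be shown with a small in-memory model. This is a conceptual sketch and does not use Consul's API: the `Instance` class, the `discover` function, the check names and the addresses are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Instance:
    service: str
    address: str
    checks: Dict[str, bool]  # check name -> is the check passing?

    def healthy(self) -> bool:
        # An instance is routable only if every one of its checks passes.
        return all(self.checks.values())


def discover(catalog: List[Instance], service: str) -> List[str]:
    """Return the addresses of healthy instances providing `service`."""
    return [i.address for i in catalog if i.service == service and i.healthy()]


catalog = [
    Instance("mysql", "10.0.0.5:3306", {"tcp": True, "memory<90%": True}),
    Instance("mysql", "10.0.0.6:3306", {"tcp": True, "memory<90%": False}),
    Instance("virtuoso", "10.0.0.7:8890", {"http-200": True}),
]
print(discover(catalog, "mysql"))  # ['10.0.0.5:3306']
```

The second MySQL instance is skipped because one of its checks fails, which is the "route traffic away from unhealthy hosts" behaviour described above.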