SlideShare uma empresa Scribd logo
1 de 24
Baixar para ler offline
Ferry - Share & Deploy Big
Data Applications with Docker
James Horey
• Writing a simple application with Bokeh
• Packaging our application with Docker
• Orchestrating our application with Ferry
Technical material can be found at:
https://github.com/jhorey/pydata
Bokeh
U.S. Census
http://api.census.gov/data/2011/acs5?get=DP03_0062E&for=county:*&in=state:06
Median income All counties California
Download some data
Let’s install Bokeh
$ pip install bokeh
>> Downloading/unpacking bokeh
>> SystemError: Cannot compile 'Python.h'. Perhaps you need
to install python-dev|python-devel.
$ apt-get install python-dev & pip install bokeh
>> "gcc: error trying to exec 'cc1plus': execvp: No such file
or directory
$ apt-get install g++
$ pip install bokeh
RuntimeError: bokeh sample data directory does not exist, please
execute bokeh.sampledata.download()
$ python
>>> import bokeh.sampledata
A simple application
$ python plot.py Kentucky
Louisville
Let’s share
#!/bin/bash
!
# Make sure we have ‘pip’ installed
apt-get install python-pip
!
# Install packages in right order
apt-get —-yes install g++ python-dev
pip install bokeh
!
# Now download the data
python geography.py data/
python population economic Kentucky
data/
!
# Start the web server
python webserver data/
• Your script didn’t work
• Oh, I was supposed to run this as
sudo?
• Ok, it still didn’t work
• I get this funny error
• Oh yeah, I’m running Redhat
• Ok I’m at my desk, just use my
computer
• Encapsulates applications in isolated containers
• Makes it easy and safe to distribute applications
• Easy to get started
Our Dockerfile
Start from a
clean Precise
image
Install stuff
Add our files
Run this when
starting
$ docker build -t ferry/pydata .
$ docker push ferry/pydata
Sharing made simple
$ docker pull ferry/pydata
$ docker run -p 8000:8000 -name p1 —d ferry/pydata
p1
Kernel
Hardware
Sharing made simple
$ docker pull ferry/pydata
$ docker run -p 8000:8000 -name p1 —d ferry/pydata
$ docker run -p 8001:8000 -name p2 —d ferry/pydata
$ docker run -p 8002:8000 -name p3 —d ferry/pydata
p1 p2 p3
Kernel
Hardware
• Containers share basic kernel
and H.W. capabilities
• No virtualization
• Containers are isolated
• Access via port forwarding
You can run these commands now!
• Highly scalable and fault-tolerant
• Great for storing streaming data (sensors,
messages)
CREATE KEYSPACE census WITH REPLICATION = {
'class' : 'SimpleStrategy',
'replication_factor' : 1 };
!
USE census;
!
CREATE TABLE acs_economic_data (
state_cd TEXT,
state_name TEXT,
county_cd TEXT,
county_name TEXT,
median INT,
mean INT,
capita INT,
PRIMARY KEY(count_cd, state_cd)
);
Orchestration
Web DB
Web + DB
• Simple
• Full control
• More work for you
• Simpler Dockerfile
• More extensible
• How to orchestrate?
• Specify the containers that constitute your
application in YAML
• Support for Hadoop, Cassandra, GlusterFS, and
OpenMPI
• It’s a little bit like pip for your Docker-based
runtime environment
Ferry
http://ferry.opencore.io
Our Application
backend:
- storage:
personality: "cassandra"
instances: 1
connectors:
- personality: "ferry/pydata-cassandra"
ports: ["8000:8000"]
# The cassandra-client base comes with the various drivers
# pre-installed.
FROM ferry/cassandra-client
NAME ferry/pydata-cassandra
!
# Place the start scripts in the events directories so they
# are started when the connector is brought up.
ADD ./scripts/startcas.sh /service/runscripts/start/
ADD ./scripts/restartcas.sh /service/runscripts/restart/
RUN chmod a+x /service/runscripts/start/startcas.sh
RUN chmod a+x /service/runscripts/restart/restartcas.sh
+
Easy to share (again)
$ ferry start cassandra.yml
sa-df8d0aa6
$ ferry ps
UUID Storage Compute Connectors Status Base Time
---- ------- ------- ---------- ------ ---- ----
sa-df8d0aa6 se-54ed4e93 se-a5350a8d running cassandra.yml
$ ferry ssh sa-df8d0aa6
root@client-se-a5350a8d:~# ps -eaf | grep python
root 144 1 0 19:49 ? 00:00:00 python /home/ferry/
pydata/bokeh/webserver.py /home/ferry/pydata/data
What’s it doing?
$ ferry start cassandra.yml
Web C* C*
root@client-se-a5350a8d:~# env | grep BACK
BACKEND_STORAGE_TYPE=cassandra
BACKEND_STORAGE_IP=10.1.0.12
Generate!
Config
What’s it doing?
$ ferry start yarn
Client
Y Y
root@client-se-b597cb21:~# env | grep BACK
BACKEND_STORAGE_TYPE=gluster
BACKEND_STORAGE_IP=10.1.0.18
BACKEND_COMPUTE_TYPE=yarn
BACKEND_COMPUTE_IP=10.1.0.15
G G
What’s it doing?
$ ferry stop sa-c6cbb572
Client
Y Y
G G
Next steps
$ ferry share sa-df8d0aa6
w c* c*
Hardware
w c* c*
Hardware
w c* c*
Hardware
Next steps
$ ferry deploy sa-df8d0aa6
w c* c*
Hardware
w
c* c*
Hardware
Hardware Hardware
VPCEC2
S3
• Even simple applications can be complicated to
install and run
• Docker helps quite a bit with this
• Ferry helps build out big data applications
Thank you!
!
James
jlh@opencore.io
!
Ferry
ferry.opencore.io
@open_core_io

Mais conteúdo relacionado

Mais procurados

Fixing Growing Pains With Puppet Data Patterns
Fixing Growing Pains With Puppet Data PatternsFixing Growing Pains With Puppet Data Patterns
Fixing Growing Pains With Puppet Data PatternsMartin Jackson
 
Mitchell Hashimoto, HashiCorp
Mitchell Hashimoto, HashiCorpMitchell Hashimoto, HashiCorp
Mitchell Hashimoto, HashiCorpOntico
 
Hashicorp: Delivering the Tao of DevOps
Hashicorp: Delivering the Tao of DevOpsHashicorp: Delivering the Tao of DevOps
Hashicorp: Delivering the Tao of DevOpsRamit Surana
 
ContainerCon 2016: Finding (and Fixing!) Performance Anomalies in Large Scale...
ContainerCon 2016: Finding (and Fixing!) Performance Anomalies in Large Scale...ContainerCon 2016: Finding (and Fixing!) Performance Anomalies in Large Scale...
ContainerCon 2016: Finding (and Fixing!) Performance Anomalies in Large Scale...Victor Marmol
 
Phoenix for Rails Devs
Phoenix for Rails DevsPhoenix for Rails Devs
Phoenix for Rails DevsDiacode
 
Alfresco Devcon 2019 - Lightning Talk - The Alfresco fat JAR experiment
Alfresco Devcon 2019 - Lightning Talk - The Alfresco fat JAR experimentAlfresco Devcon 2019 - Lightning Talk - The Alfresco fat JAR experiment
Alfresco Devcon 2019 - Lightning Talk - The Alfresco fat JAR experimentAxel Faust
 
Best Practices of Infrastructure as Code with Terraform
Best Practices of Infrastructure as Code with TerraformBest Practices of Infrastructure as Code with Terraform
Best Practices of Infrastructure as Code with TerraformDevOps.com
 
Python Deployment with Fabric
Python Deployment with FabricPython Deployment with Fabric
Python Deployment with Fabricandymccurdy
 
(WEB307) Scalable Site Management Using AWS OpsWorks | AWS re:Invent 2014
(WEB307) Scalable Site Management Using AWS OpsWorks | AWS re:Invent 2014(WEB307) Scalable Site Management Using AWS OpsWorks | AWS re:Invent 2014
(WEB307) Scalable Site Management Using AWS OpsWorks | AWS re:Invent 2014Amazon Web Services
 
Creating and Deploying Static Sites with Hugo
Creating and Deploying Static Sites with HugoCreating and Deploying Static Sites with Hugo
Creating and Deploying Static Sites with HugoBrian Hogan
 
Dockerizing Windows Server Applications by Ender Barillas and Taylor Brown
Dockerizing Windows Server Applications by Ender Barillas and Taylor BrownDockerizing Windows Server Applications by Ender Barillas and Taylor Brown
Dockerizing Windows Server Applications by Ender Barillas and Taylor BrownDocker, Inc.
 
Spark Summit EU talk by William Benton
Spark Summit EU talk by William BentonSpark Summit EU talk by William Benton
Spark Summit EU talk by William BentonSpark Summit
 
Cachopo - Scalable Stateful Services - Madrid Elixir Meetup
Cachopo - Scalable Stateful Services - Madrid Elixir MeetupCachopo - Scalable Stateful Services - Madrid Elixir Meetup
Cachopo - Scalable Stateful Services - Madrid Elixir MeetupAbel Muíño
 
Chef ignited a DevOps revolution – BK Box
Chef ignited a DevOps revolution – BK BoxChef ignited a DevOps revolution – BK Box
Chef ignited a DevOps revolution – BK BoxChef Software, Inc.
 

Mais procurados (18)

YARN Services
YARN ServicesYARN Services
YARN Services
 
Fixing Growing Pains With Puppet Data Patterns
Fixing Growing Pains With Puppet Data PatternsFixing Growing Pains With Puppet Data Patterns
Fixing Growing Pains With Puppet Data Patterns
 
Mitchell Hashimoto, HashiCorp
Mitchell Hashimoto, HashiCorpMitchell Hashimoto, HashiCorp
Mitchell Hashimoto, HashiCorp
 
Hashicorp: Delivering the Tao of DevOps
Hashicorp: Delivering the Tao of DevOpsHashicorp: Delivering the Tao of DevOps
Hashicorp: Delivering the Tao of DevOps
 
ContainerCon 2016: Finding (and Fixing!) Performance Anomalies in Large Scale...
ContainerCon 2016: Finding (and Fixing!) Performance Anomalies in Large Scale...ContainerCon 2016: Finding (and Fixing!) Performance Anomalies in Large Scale...
ContainerCon 2016: Finding (and Fixing!) Performance Anomalies in Large Scale...
 
Phoenix for Rails Devs
Phoenix for Rails DevsPhoenix for Rails Devs
Phoenix for Rails Devs
 
Alfresco Devcon 2019 - Lightning Talk - The Alfresco fat JAR experiment
Alfresco Devcon 2019 - Lightning Talk - The Alfresco fat JAR experimentAlfresco Devcon 2019 - Lightning Talk - The Alfresco fat JAR experiment
Alfresco Devcon 2019 - Lightning Talk - The Alfresco fat JAR experiment
 
Best Practices of Infrastructure as Code with Terraform
Best Practices of Infrastructure as Code with TerraformBest Practices of Infrastructure as Code with Terraform
Best Practices of Infrastructure as Code with Terraform
 
Python Deployment with Fabric
Python Deployment with FabricPython Deployment with Fabric
Python Deployment with Fabric
 
(WEB307) Scalable Site Management Using AWS OpsWorks | AWS re:Invent 2014
(WEB307) Scalable Site Management Using AWS OpsWorks | AWS re:Invent 2014(WEB307) Scalable Site Management Using AWS OpsWorks | AWS re:Invent 2014
(WEB307) Scalable Site Management Using AWS OpsWorks | AWS re:Invent 2014
 
Build Automation 101
Build Automation 101Build Automation 101
Build Automation 101
 
Network automation (NetDevOps) with Ansible
Network automation (NetDevOps) with AnsibleNetwork automation (NetDevOps) with Ansible
Network automation (NetDevOps) with Ansible
 
Creating and Deploying Static Sites with Hugo
Creating and Deploying Static Sites with HugoCreating and Deploying Static Sites with Hugo
Creating and Deploying Static Sites with Hugo
 
Dockerizing Windows Server Applications by Ender Barillas and Taylor Brown
Dockerizing Windows Server Applications by Ender Barillas and Taylor BrownDockerizing Windows Server Applications by Ender Barillas and Taylor Brown
Dockerizing Windows Server Applications by Ender Barillas and Taylor Brown
 
Spark Summit EU talk by William Benton
Spark Summit EU talk by William BentonSpark Summit EU talk by William Benton
Spark Summit EU talk by William Benton
 
Cachopo - Scalable Stateful Services - Madrid Elixir Meetup
Cachopo - Scalable Stateful Services - Madrid Elixir MeetupCachopo - Scalable Stateful Services - Madrid Elixir Meetup
Cachopo - Scalable Stateful Services - Madrid Elixir Meetup
 
Bosh 2.0
Bosh 2.0Bosh 2.0
Bosh 2.0
 
Chef ignited a DevOps revolution – BK Box
Chef ignited a DevOps revolution – BK BoxChef ignited a DevOps revolution – BK Box
Chef ignited a DevOps revolution – BK Box
 

Semelhante a Share & Deploy Big Data Apps with Docker & Ferry

Midwest php 2013 deploying php on paas- why & how
Midwest php 2013   deploying php on paas- why & howMidwest php 2013   deploying php on paas- why & how
Midwest php 2013 deploying php on paas- why & howdotCloud
 
Word press and containers
Word press and containersWord press and containers
Word press and containerswcto2017
 
Extending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with KubernetesExtending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with KubernetesNicola Ferraro
 
Docker Advanced registry usage
Docker Advanced registry usageDocker Advanced registry usage
Docker Advanced registry usageDocker, Inc.
 
Detailed Introduction To Docker
Detailed Introduction To DockerDetailed Introduction To Docker
Detailed Introduction To Dockernklmish
 
20150425 experimenting with openstack sahara on docker
20150425 experimenting with openstack sahara on docker20150425 experimenting with openstack sahara on docker
20150425 experimenting with openstack sahara on dockerWei Ting Chen
 
Orchestrating Docker with Terraform and Consul by Mitchell Hashimoto
Orchestrating Docker with Terraform and Consul by Mitchell Hashimoto Orchestrating Docker with Terraform and Consul by Mitchell Hashimoto
Orchestrating Docker with Terraform and Consul by Mitchell Hashimoto Docker, Inc.
 
Ceph, Docker, Heroku Slugs, CoreOS and Deis Overview
Ceph, Docker, Heroku Slugs, CoreOS and Deis OverviewCeph, Docker, Heroku Slugs, CoreOS and Deis Overview
Ceph, Docker, Heroku Slugs, CoreOS and Deis OverviewLeo Lorieri
 
What's New in Docker - February 2017
What's New in Docker - February 2017What's New in Docker - February 2017
What's New in Docker - February 2017Patrick Chanezon
 
'DOCKER' & CLOUD: ENABLERS For DEVOPS
'DOCKER' & CLOUD:  ENABLERS For DEVOPS'DOCKER' & CLOUD:  ENABLERS For DEVOPS
'DOCKER' & CLOUD: ENABLERS For DEVOPSACA IT-Solutions
 
Docker and Cloud - Enables for DevOps - by ACA-IT
Docker and Cloud - Enables for DevOps - by ACA-ITDocker and Cloud - Enables for DevOps - by ACA-IT
Docker and Cloud - Enables for DevOps - by ACA-ITStijn Wijndaele
 
Ansible+docker (highload++2015)
Ansible+docker (highload++2015)Ansible+docker (highload++2015)
Ansible+docker (highload++2015)Pavel Alexeev
 
Auto Scaling for Multi-Tier Containers Topology
Auto Scaling for Multi-Tier Containers TopologyAuto Scaling for Multi-Tier Containers Topology
Auto Scaling for Multi-Tier Containers TopologyJelastic Multi-Cloud PaaS
 
Cloud Foundry: Hands-on Deployment Workshop
Cloud Foundry: Hands-on Deployment WorkshopCloud Foundry: Hands-on Deployment Workshop
Cloud Foundry: Hands-on Deployment WorkshopManuel Garcia
 
Containerdays Intro to Habitat
Containerdays Intro to HabitatContainerdays Intro to Habitat
Containerdays Intro to HabitatMandi Walls
 
Docker and Microservice
Docker and MicroserviceDocker and Microservice
Docker and MicroserviceSamuel Chow
 
On demand-block-storage-for-docker
On demand-block-storage-for-dockerOn demand-block-storage-for-docker
On demand-block-storage-for-dockerJangseon Ryu
 
The Docker "Gauntlet" - Introduction, Ecosystem, Deployment, Orchestration
The Docker "Gauntlet" - Introduction, Ecosystem, Deployment, OrchestrationThe Docker "Gauntlet" - Introduction, Ecosystem, Deployment, Orchestration
The Docker "Gauntlet" - Introduction, Ecosystem, Deployment, OrchestrationErica Windisch
 

Semelhante a Share & Deploy Big Data Apps with Docker & Ferry (20)

Midwest php 2013 deploying php on paas- why & how
Midwest php 2013   deploying php on paas- why & howMidwest php 2013   deploying php on paas- why & how
Midwest php 2013 deploying php on paas- why & how
 
Word press and containers
Word press and containersWord press and containers
Word press and containers
 
Extending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with KubernetesExtending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with Kubernetes
 
Docker Advanced registry usage
Docker Advanced registry usageDocker Advanced registry usage
Docker Advanced registry usage
 
Detailed Introduction To Docker
Detailed Introduction To DockerDetailed Introduction To Docker
Detailed Introduction To Docker
 
Docker
DockerDocker
Docker
 
20150425 experimenting with openstack sahara on docker
20150425 experimenting with openstack sahara on docker20150425 experimenting with openstack sahara on docker
20150425 experimenting with openstack sahara on docker
 
Orchestrating Docker with Terraform and Consul by Mitchell Hashimoto
Orchestrating Docker with Terraform and Consul by Mitchell Hashimoto Orchestrating Docker with Terraform and Consul by Mitchell Hashimoto
Orchestrating Docker with Terraform and Consul by Mitchell Hashimoto
 
Ceph, Docker, Heroku Slugs, CoreOS and Deis Overview
Ceph, Docker, Heroku Slugs, CoreOS and Deis OverviewCeph, Docker, Heroku Slugs, CoreOS and Deis Overview
Ceph, Docker, Heroku Slugs, CoreOS and Deis Overview
 
What's New in Docker - February 2017
What's New in Docker - February 2017What's New in Docker - February 2017
What's New in Docker - February 2017
 
'DOCKER' & CLOUD: ENABLERS For DEVOPS
'DOCKER' & CLOUD:  ENABLERS For DEVOPS'DOCKER' & CLOUD:  ENABLERS For DEVOPS
'DOCKER' & CLOUD: ENABLERS For DEVOPS
 
Docker and Cloud - Enables for DevOps - by ACA-IT
Docker and Cloud - Enables for DevOps - by ACA-ITDocker and Cloud - Enables for DevOps - by ACA-IT
Docker and Cloud - Enables for DevOps - by ACA-IT
 
Ansible+docker (highload++2015)
Ansible+docker (highload++2015)Ansible+docker (highload++2015)
Ansible+docker (highload++2015)
 
Auto Scaling for Multi-Tier Containers Topology
Auto Scaling for Multi-Tier Containers TopologyAuto Scaling for Multi-Tier Containers Topology
Auto Scaling for Multi-Tier Containers Topology
 
Cloud Foundry: Hands-on Deployment Workshop
Cloud Foundry: Hands-on Deployment WorkshopCloud Foundry: Hands-on Deployment Workshop
Cloud Foundry: Hands-on Deployment Workshop
 
Containerdays Intro to Habitat
Containerdays Intro to HabitatContainerdays Intro to Habitat
Containerdays Intro to Habitat
 
Docker and Microservice
Docker and MicroserviceDocker and Microservice
Docker and Microservice
 
On demand-block-storage-for-docker
On demand-block-storage-for-dockerOn demand-block-storage-for-docker
On demand-block-storage-for-docker
 
Docker Introduction
Docker IntroductionDocker Introduction
Docker Introduction
 
The Docker "Gauntlet" - Introduction, Ecosystem, Deployment, Orchestration
The Docker "Gauntlet" - Introduction, Ecosystem, Deployment, OrchestrationThe Docker "Gauntlet" - Introduction, Ecosystem, Deployment, Orchestration
The Docker "Gauntlet" - Introduction, Ecosystem, Deployment, Orchestration
 

Último

Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 

Último (20)

Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 

Share & Deploy Big Data Apps with Docker & Ferry

  • 1. Ferry - Share & Deploy Big Data Applications with Docker James Horey
  • 2. • Writing a simple application with Bokeh • Packaging our application with Docker • Orchestrating our application with Ferry Technical material can be found at: https://github.com/jhorey/pydata
  • 6. Let’s install Bokeh $ pip install bokeh >> Downloading/unpacking bokeh >> SystemError: Cannot compile 'Python.h'. Perhaps you need to install python-dev|python-devel. $ apt-get install python-dev & pip install bokeh >> "gcc: error trying to exec 'cc1plus': execvp: No such file or directory $ apt-get install g++ $ pip install bokeh RuntimeError: bokeh sample data directory does not exist, please execute bokeh.sampledata.download() $ python >>> import bokeh.sampledata
  • 7. A simple application $ python plot.py Kentucky Louisville
  • 8. Let’s share #!/bin/bash ! # Make sure we have ‘pip’ installed apt-get install python-pip ! # Install packages in right order apt-get —-yes install g++ python-dev pip install bokeh ! # Now download the data python geography.py data/ python population economic Kentucky data/ ! # Start the web server python webserver data/ • Your script didn’t work • Oh, I was supposed to run this as sudo? • Ok, it still didn’t work • I get this funny error • Oh yeah, I’m running Redhat • Ok I’m at my desk, just use my computer
  • 9. • Encapsulates applications in isolated containers • Makes it easy and safe to distribute applications • Easy to get started
  • 10. Our Dockerfile Start from a clean Precise image Install stuff Add our files Run this when starting $ docker build -t ferry/pydata . $ docker push ferry/pydata
  • 11. Sharing made simple $ docker pull ferry/pydata $ docker run -p 8000:8000 -name p1 —d ferry/pydata p1 Kernel Hardware
  • 12. Sharing made simple $ docker pull ferry/pydata $ docker run -p 8000:8000 -name p1 —d ferry/pydata $ docker run -p 8001:8000 -name p2 —d ferry/pydata $ docker run -p 8002:8000 -name p3 —d ferry/pydata p1 p2 p3 Kernel Hardware • Containers share basic kernel and H.W. capabilities • No virtualization • Containers are isolated • Access via port forwarding You can run these commands now!
  • 13. • Highly scalable and fault-tolerant • Great for storing streaming data (sensors, messages) CREATE KEYSPACE census WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }; ! USE census; ! CREATE TABLE acs_economic_data ( state_cd TEXT, state_name TEXT, county_cd TEXT, county_name TEXT, median INT, mean INT, capita INT, PRIMARY KEY(count_cd, state_cd) );
  • 14. Orchestration Web DB Web + DB • Simple • Full control • More work for you • Simpler Dockerfile • More extensible • How to orchestrate?
  • 15. • Specify the containers that constitute your application in YAML • Support for Hadoop, Cassandra, GlusterFS, and OpenMPI • It’s a little bit like pip for your Docker-based runtime environment Ferry http://ferry.opencore.io
  • 16. Our Application backend: - storage: personality: "cassandra" instances: 1 connectors: - personality: "ferry/pydata-cassandra" ports: ["8000:8000"] # The cassandra-client base comes with the various drivers # pre-installed. FROM ferry/cassandra-client NAME ferry/pydata-cassandra ! # Place the start scripts in the events directories so they # are started when the connector is brought up. ADD ./scripts/startcas.sh /service/runscripts/start/ ADD ./scripts/restartcas.sh /service/runscripts/restart/ RUN chmod a+x /service/runscripts/start/startcas.sh RUN chmod a+x /service/runscripts/restart/restartcas.sh +
  • 17. Easy to share (again) $ ferry start cassandra.yml sa-df8d0aa6 $ ferry ps UUID Storage Compute Connectors Status Base Time ---- ------- ------- ---------- ------ ---- ---- sa-df8d0aa6 se-54ed4e93 se-a5350a8d running cassandra.yml $ ferry ssh sa-df8d0aa6 root@client-se-a5350a8d:~# ps -eaf | grep python root 144 1 0 19:49 ? 00:00:00 python /home/ferry/ pydata/bokeh/webserver.py /home/ferry/pydata/data
  • 18. What’s it doing? $ ferry start cassandra.yml Web C* C* root@client-se-a5350a8d:~# env | grep BACK BACKEND_STORAGE_TYPE=cassandra BACKEND_STORAGE_IP=10.1.0.12 Generate! Config
  • 19. What’s it doing? $ ferry start yarn Client Y Y root@client-se-b597cb21:~# env | grep BACK BACKEND_STORAGE_TYPE=gluster BACKEND_STORAGE_IP=10.1.0.18 BACKEND_COMPUTE_TYPE=yarn BACKEND_COMPUTE_IP=10.1.0.15 G G
  • 20. What’s it doing? $ ferry stop sa-c6cbb572 Client Y Y G G
  • 21. Next steps $ ferry share sa-df8d0aa6 w c* c* Hardware w c* c* Hardware w c* c* Hardware
  • 22. Next steps $ ferry deploy sa-df8d0aa6 w c* c* Hardware w c* c* Hardware Hardware Hardware VPCEC2 S3
  • 23. • Even simple applications can be complicated to install and run • Docker helps quite a bit with this • Ferry helps build out big data applications