SlideShare uma empresa Scribd logo
1 de 25
Baixar para ler offline
Clustree production on GKE
Romain Vrignaud - VP Engineering - romain@clustree.com
We
• Full python microservices (~30 / env)
• Elasticsearch
• REST API for synchronous calls
• Rabbitmq for asynchronous calls
Clustree stack
• 12 factor
• Git commit as docker tag
• docker-compose and kubernetes
• develop branch vs master branch
Engineering practices
• ~ 12 people in tech team
• Developers in charge of their app up to production
• Infrastructure team provides:
• Tools
• Guidelines
• Expertise
Organization
• Why GKE ?
• GKE Cluster (15 nodes)
• ~280 pods
• 200GB / 225 GB of memory allocated
• Inside Kubernetes
• All stateless applications for all environments
• All stateful applications for integration environments
• Outside Kubernetes
• staging / production stateful app
• Infrastructure
• Spark
Infrastructure
• Namespaces to isolate environments
• RC everywhere (even for single pods)
• Service discovery
• Secrets
• Volumes
• Jobs
Features used
Production tooling
• 1 heapster with Google Cloud Monitoring sink (not used)
• 1 heapster with an influxdb sink
• telegraf
• prometheus inputs for all nodes
• custom python script to gather cluster wide metrics
• 1 telegraf instance running on each node
• Influxdb 0.10.x and Grafana
Metrics
• 1 fluentd per node to push to Google Cloud Logging (not used)
• 200 MB per node
• 1 logstash per node to push to elasticsearch
• 500 MB per node …
• kubernetes plugin (container name, namespaces, pod, rc, etc..)
• interlaced logs => structured logs !
• OOM pattern detection (ram limits are difficult to find !)
Logging system
• Auto healing cluster
• Pods hooks + nagios + consul + consul-template => failed
• Sentry
• Still need to decide push vs pull monitoring
• pull : prometheus
• push : kapacitor / watcher
• Google Cloud Monitoring ?
• How to monitor kubernetes events ?
Monitoring
• Migration 1.0 -> 1.1 : DNS discovery outage (#18171)
• Loss of 1/3 of node cluster (… yesterday) (#13346)
• Volumes (#14642)
• Memory pressure on nodes
A handful of issues
# references github issues numbers
• private services access outside cluster (#14545)
• No public IP from public Load Balancers
• IAM
• network isolation
• kubectl exec (timeout / TERM) (#12179, #13585)
• node resizing on GKE
A few painful points
• Spawn a new environment in a few minutes (to test a new
feature)
• Super easy rolling-upgrade and rollback
• Fully declarative infrastructure
But a lot of joy !
• Ubernetes
• ScheduledJob
• PetSet
• HPA on custom metrics
• Network policy
The future is bright
• Google-container ML
• Slack
• Github (Read the proposals !)
• GKE support
Great community
• So much to do / discover / learn but really exciting !
• Docker evolutions are way less important for us than kubernetes new
features
• Kubernetes is really a powerful abstraction and enable team autonomy
and velocity
• Still a young project / ecosystem but evolving really quickly
Conclusion
Thank you
• Logstash kubernetes plugin : https://github.com/vaijab/logstash-filter-kubernetes
• Network policy : http://www.projectcalico.org/a-sneak-peek-at-kubernetes-policy/
• Kubernetes proposals : https://github.com/kubernetes/kubernetes/tree/master/docs/proposals
References

Mais conteúdo relacionado

Mais procurados

Understanding Kubernetes
Understanding KubernetesUnderstanding Kubernetes
Understanding KubernetesTu Pham
 
Kubernetes Architecture and Introduction
Kubernetes Architecture and IntroductionKubernetes Architecture and Introduction
Kubernetes Architecture and IntroductionStefan Schimanski
 
Kubernetes in 30 minutes (2017/03/10)
Kubernetes in 30 minutes (2017/03/10)Kubernetes in 30 minutes (2017/03/10)
Kubernetes in 30 minutes (2017/03/10)lestrrat
 
Scaling Microservices with Kubernetes
Scaling Microservices with KubernetesScaling Microservices with Kubernetes
Scaling Microservices with KubernetesDeivid Hahn Fração
 
Kubernetes intro public - kubernetes meetup 4-21-2015
Kubernetes intro   public - kubernetes meetup 4-21-2015Kubernetes intro   public - kubernetes meetup 4-21-2015
Kubernetes intro public - kubernetes meetup 4-21-2015Rohit Jnagal
 
Kubernetes Architecture and Introduction – Paris Kubernetes Meetup
Kubernetes Architecture and Introduction – Paris Kubernetes MeetupKubernetes Architecture and Introduction – Paris Kubernetes Meetup
Kubernetes Architecture and Introduction – Paris Kubernetes MeetupStefan Schimanski
 
Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka
Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka
Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka Mario Ishara Fernando
 
Docker Madison, Introduction to Kubernetes
Docker Madison, Introduction to KubernetesDocker Madison, Introduction to Kubernetes
Docker Madison, Introduction to KubernetesTimothy St. Clair
 
Kubernetes Introduction
Kubernetes IntroductionKubernetes Introduction
Kubernetes IntroductionEric Gustafson
 
Kubernetes Walk Through from Technical View
Kubernetes Walk Through from Technical ViewKubernetes Walk Through from Technical View
Kubernetes Walk Through from Technical ViewLei (Harry) Zhang
 
Kubernetes automation in production
Kubernetes automation in productionKubernetes automation in production
Kubernetes automation in productionPaul Bakker
 
Kubernetes architecture
Kubernetes architectureKubernetes architecture
Kubernetes architectureJanakiram MSV
 
Kubernetes meetup 101
Kubernetes meetup 101Kubernetes meetup 101
Kubernetes meetup 101Jakir Patel
 
Tectonic Summit 2016: Kubernetes 1.5 and Beyond
Tectonic Summit 2016: Kubernetes 1.5 and BeyondTectonic Summit 2016: Kubernetes 1.5 and Beyond
Tectonic Summit 2016: Kubernetes 1.5 and BeyondCoreOS
 
Platform Orchestration with Kubernetes and Docker
Platform Orchestration with Kubernetes and DockerPlatform Orchestration with Kubernetes and Docker
Platform Orchestration with Kubernetes and DockerJulian Strobl
 
Building Clustered Applications with Kubernetes and Docker
Building Clustered Applications with Kubernetes and DockerBuilding Clustered Applications with Kubernetes and Docker
Building Clustered Applications with Kubernetes and DockerSteve Watt
 
Introduction kubernetes 2017_12_24
Introduction kubernetes 2017_12_24Introduction kubernetes 2017_12_24
Introduction kubernetes 2017_12_24Sam Zheng
 

Mais procurados (20)

Understanding Kubernetes
Understanding KubernetesUnderstanding Kubernetes
Understanding Kubernetes
 
Kubernetes Architecture and Introduction
Kubernetes Architecture and IntroductionKubernetes Architecture and Introduction
Kubernetes Architecture and Introduction
 
Kubernetes Basics
Kubernetes BasicsKubernetes Basics
Kubernetes Basics
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
Kubernetes in 30 minutes (2017/03/10)
Kubernetes in 30 minutes (2017/03/10)Kubernetes in 30 minutes (2017/03/10)
Kubernetes in 30 minutes (2017/03/10)
 
Scaling Microservices with Kubernetes
Scaling Microservices with KubernetesScaling Microservices with Kubernetes
Scaling Microservices with Kubernetes
 
Kubernetes intro public - kubernetes meetup 4-21-2015
Kubernetes intro   public - kubernetes meetup 4-21-2015Kubernetes intro   public - kubernetes meetup 4-21-2015
Kubernetes intro public - kubernetes meetup 4-21-2015
 
Kubernetes Architecture and Introduction – Paris Kubernetes Meetup
Kubernetes Architecture and Introduction – Paris Kubernetes MeetupKubernetes Architecture and Introduction – Paris Kubernetes Meetup
Kubernetes Architecture and Introduction – Paris Kubernetes Meetup
 
Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka
Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka
Microservices , Docker , CI/CD , Kubernetes Seminar - Sri Lanka
 
Docker Madison, Introduction to Kubernetes
Docker Madison, Introduction to KubernetesDocker Madison, Introduction to Kubernetes
Docker Madison, Introduction to Kubernetes
 
Introduction to Kubernetes
Introduction to KubernetesIntroduction to Kubernetes
Introduction to Kubernetes
 
Kubernetes Introduction
Kubernetes IntroductionKubernetes Introduction
Kubernetes Introduction
 
Kubernetes Walk Through from Technical View
Kubernetes Walk Through from Technical ViewKubernetes Walk Through from Technical View
Kubernetes Walk Through from Technical View
 
Kubernetes automation in production
Kubernetes automation in productionKubernetes automation in production
Kubernetes automation in production
 
Kubernetes architecture
Kubernetes architectureKubernetes architecture
Kubernetes architecture
 
Kubernetes meetup 101
Kubernetes meetup 101Kubernetes meetup 101
Kubernetes meetup 101
 
Tectonic Summit 2016: Kubernetes 1.5 and Beyond
Tectonic Summit 2016: Kubernetes 1.5 and BeyondTectonic Summit 2016: Kubernetes 1.5 and Beyond
Tectonic Summit 2016: Kubernetes 1.5 and Beyond
 
Platform Orchestration with Kubernetes and Docker
Platform Orchestration with Kubernetes and DockerPlatform Orchestration with Kubernetes and Docker
Platform Orchestration with Kubernetes and Docker
 
Building Clustered Applications with Kubernetes and Docker
Building Clustered Applications with Kubernetes and DockerBuilding Clustered Applications with Kubernetes and Docker
Building Clustered Applications with Kubernetes and Docker
 
Introduction kubernetes 2017_12_24
Introduction kubernetes 2017_12_24Introduction kubernetes 2017_12_24
Introduction kubernetes 2017_12_24
 

Semelhante a Clustree production on GKE with 280 pods across 15 nodes

Kubernetes Manchester - 6th December 2018
Kubernetes Manchester - 6th December 2018Kubernetes Manchester - 6th December 2018
Kubernetes Manchester - 6th December 2018David Stockton
 
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)Tibo Beijen
 
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"Fwdays
 
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...SaltStack
 
Kubernetes for HCL Connections Component Pack - Build or Buy?
Kubernetes for HCL Connections Component Pack - Build or Buy?Kubernetes for HCL Connections Component Pack - Build or Buy?
Kubernetes for HCL Connections Component Pack - Build or Buy?Martin Schmidt
 
Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?
Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?
Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?panagenda
 
Monitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudMonitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudDatadog
 
Load balancing in the SRE way
Load balancing in the SRE wayLoad balancing in the SRE way
Load balancing in the SRE wayShawn Zhu
 
Container orchestration and microservices world
Container orchestration and microservices worldContainer orchestration and microservices world
Container orchestration and microservices worldKarol Chrapek
 
The Kubernetes Operator Pattern - ContainerConf Nov 2017
The Kubernetes Operator Pattern - ContainerConf Nov 2017The Kubernetes Operator Pattern - ContainerConf Nov 2017
The Kubernetes Operator Pattern - ContainerConf Nov 2017Jakob Karalus
 
The Operator Pattern - Managing Stateful Services in Kubernetes
The Operator Pattern - Managing Stateful Services in KubernetesThe Operator Pattern - Managing Stateful Services in Kubernetes
The Operator Pattern - Managing Stateful Services in KubernetesQAware GmbH
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudyJohn Adams
 
How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014Puppet
 
Docker in Production: How RightScale Delivers Cloud Applications
Docker in Production: How RightScale Delivers Cloud ApplicationsDocker in Production: How RightScale Delivers Cloud Applications
Docker in Production: How RightScale Delivers Cloud ApplicationsRightScale
 
Lessons learned with kubernetes in production at PlayPass
Lessons learned with kubernetes in productionat PlayPassLessons learned with kubernetes in productionat PlayPass
Lessons learned with kubernetes in production at PlayPassPeter Vandenabeele
 
Cloud native applications
Cloud native applicationsCloud native applications
Cloud native applicationsreallavalamp
 
Queick: A Simple Job Queue System for Python
Queick: A Simple Job Queue System for PythonQueick: A Simple Job Queue System for Python
Queick: A Simple Job Queue System for PythonRyota Suenaga
 

Semelhante a Clustree production on GKE with 280 pods across 15 nodes (20)

Kubernetes Manchester - 6th December 2018
Kubernetes Manchester - 6th December 2018Kubernetes Manchester - 6th December 2018
Kubernetes Manchester - 6th December 2018
 
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
 
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
Sergey Dzyuban "To Build My Own Cloud with Blackjack…"
 
Vinetalk: The missing piece for cluster managers to enable accelerator sharing
Vinetalk: The missing piece for cluster managers to enable accelerator sharingVinetalk: The missing piece for cluster managers to enable accelerator sharing
Vinetalk: The missing piece for cluster managers to enable accelerator sharing
 
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...
SaltConf14 - Eric johnson, Google - Orchestrating Google Compute Engine with ...
 
Kubernetes for HCL Connections Component Pack - Build or Buy?
Kubernetes for HCL Connections Component Pack - Build or Buy?Kubernetes for HCL Connections Component Pack - Build or Buy?
Kubernetes for HCL Connections Component Pack - Build or Buy?
 
Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?
Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?
Engage 2020 - Kubernetes for HCL Connections Component Pack - Build or Buy?
 
Monitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudMonitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloud
 
Load balancing in the SRE way
Load balancing in the SRE wayLoad balancing in the SRE way
Load balancing in the SRE way
 
Container orchestration and microservices world
Container orchestration and microservices worldContainer orchestration and microservices world
Container orchestration and microservices world
 
The Kubernetes Operator Pattern - ContainerConf Nov 2017
The Kubernetes Operator Pattern - ContainerConf Nov 2017The Kubernetes Operator Pattern - ContainerConf Nov 2017
The Kubernetes Operator Pattern - ContainerConf Nov 2017
 
The Operator Pattern - Managing Stateful Services in Kubernetes
The Operator Pattern - Managing Stateful Services in KubernetesThe Operator Pattern - Managing Stateful Services in Kubernetes
The Operator Pattern - Managing Stateful Services in Kubernetes
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014How to Puppetize Google Cloud Platform - PuppetConf 2014
How to Puppetize Google Cloud Platform - PuppetConf 2014
 
Docker in Production: How RightScale Delivers Cloud Applications
Docker in Production: How RightScale Delivers Cloud ApplicationsDocker in Production: How RightScale Delivers Cloud Applications
Docker in Production: How RightScale Delivers Cloud Applications
 
Lessons learned with kubernetes in production at PlayPass
Lessons learned with kubernetes in productionat PlayPassLessons learned with kubernetes in productionat PlayPass
Lessons learned with kubernetes in production at PlayPass
 
Cloud native applications
Cloud native applicationsCloud native applications
Cloud native applications
 
Ansible docker
Ansible dockerAnsible docker
Ansible docker
 
Queick: A Simple Job Queue System for Python
Queick: A Simple Job Queue System for PythonQueick: A Simple Job Queue System for Python
Queick: A Simple Job Queue System for Python
 
Moby KubeCon 2017
Moby KubeCon 2017Moby KubeCon 2017
Moby KubeCon 2017
 

Último

why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...aditisharan08
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 

Último (20)

why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 

Clustree production on GKE with 280 pods across 15 nodes

  • 1. Clustree production on GKE Romain Vrignaud - VP Engineering - romain@clustree.com
  • 2. We
  • 3. • Full python microservices (~30 / env) • Elasticsearch • REST API for synchronous calls • Rabbitmq for asynchronous calls Clustree stack
  • 4.
  • 5. • 12 factor • Git commit as docker tag • docker-compose and kubernetes • develop branch vs master branch Engineering practices
  • 6.
  • 7. • ~ 12 people in tech team • Developers in charge of their app up to production • Infrastructure team provides: • Tools • Guidelines • Expertise Organization
  • 8. • Why GKE ? • GKE Cluster (15 nodes) • ~280 pods • 200GB / 225 GB of memory allocated • Inside Kubernetes • All stateless applications for all environments • All stateful applications for integration environments • Outside Kubernetes • staging / production stateful app • Infrastructure • Spark Infrastructure
  • 9.
  • 10. • Namespaces to isolate environments • RC everywhere (even for single pods) • Service discovery • Secrets • Volumes • Jobs Features used
  • 12. • 1 heapster with Google Cloud Monitoring sink (not used) • 1 heapster with an influxdb sink • telegraf • prometheus inputs for all nodes • custom python script to gather cluster wide metrics • 1 telegraf instance running on each node • Influxdb 0.10.x and Grafana Metrics
  • 13.
  • 14.
  • 15. • 1 fluentd per node to push to Google Cloud Logging (not used) • 200 MB per node • 1 logstash per node to push to elasticsearch • 500 MB per node … • kubernetes plugin (container name, namespaces, pod, rc, etc..) • interlaced logs => structured logs ! • OOM pattern detection (ram limits are difficult to find !) Logging system
  • 16. • Auto healing cluster • Pods hooks + nagios + consul + consul-template => failed • Sentry • Still need to decide push vs pull monitoring • pull : prometheus • push : kapacitor / watcher • Google Cloud Monitoring ? • How to monitor kubernetes events ? Monitoring
  • 17.
  • 18. • Migration 1.0 -> 1.1 : DNS discovery outage (#18171) • Loss of 1/3 of node cluster (… yesterday) (#13346) • Volumes (#14642) • Memory pressure on nodes A handful of issues # references github issues numbers
  • 19. • private services access outside cluster (#14545) • No public IP from public Load Balancers • IAM • network isolation • kubectl exec (timeout / TERM) (#12179, #13585) • node resizing on GKE A few painful points
  • 20. • Spawn a new environment in a few minutes (to test a new feature) • Super easy rolling-upgrade and rollback • Fully declarative infrastructure But a lot of joy !
  • 21. • Ubernetes • ScheduledJob • PetSet • HPA on custom metrics • Network policy The future is bright
  • 22. • Google-container ML • Slack • Github (Read the proposals !) • GKE support Great community
  • 23. • So much to do / discover / learn but really exciting ! • Docker evolutions are way less important for us than kubernetes new features • Kubernetes is really a powerful abstraction and enable team autonomy and velocity • Still a young project / ecosystem but evolving really quickly Conclusion
  • 25. • Logstash kubernetes plugin : https://github.com/vaijab/logstash-filter-kubernetes • Network policy : http://www.projectcalico.org/a-sneak-peek-at-kubernetes-policy/ • Kubernetes proposals : https://github.com/kubernetes/kubernetes/tree/master/docs/proposals References