From a French monolith to a worldwide platform: a human story
Stan Chollet
Chapter lead Core API
Tribe Scale @ Dailymotion
https://stan.life
President of the Orléans Tech association
Kubernetes & GraphQL trainer
3 billion
video views per month
300 million
unique visitors per month
150 million
videos in our catalogue
Dailymotion, one of the
leading video
destination platforms in
the world
OUR MISSION
transforming our video platform into a
global destination for must-see videos.
Building the best “go-to”
experience where users can get
their daily dose of must-see
videos, and partners can leverage
the latest tools to grow and
monetise their audience.
FROM MONOLITH TO SOA
Our road to a micro-service architecture (SOA)
FROM
• monolith LAMP stack
• hosted on bare-metal
• mono-datacenter (Paris)
• REST API
• fullstack website
TO
• geo-distributed
• apps run in containers (Docker)
• orchestrated on top of Kubernetes
• multiple languages (mainly Python / Golang)
• GraphQL API
• fully API-centric
GRAPHQL - AN ENABLER FOR OUR FRONTEND AND OUR BACKEND
FROM. Monolith PHP: Website (HTML), REST API
TO. GraphQL in front of svc 1 (python), svc 2 (golang), svc 3 (java)
TRIBES ? SQUADS ?
Tribe
Squad Squad
Chapter
Squad
Chapter
Tribe
Squad Squad
Chapter
Chapter
Squad
SOA AS AN ORGANIZATIONAL ENABLER.
SOA (geo-distributed) architecture: GraphQL in front of the Data service, User service and Partner service
Monolith (mono-datacenter): Website (HTML), REST API
Ownership labels in the diagram: product enabler, product tribes, mixed
FIRST STEP
• Built & managed by one team (2 people)
• Deployed in 3 regions on AWS
• Orchestrated on top of Kubernetes
• Apps deployed with custom bash scripts
• Good application monitoring
• Poor infrastructure monitoring
FROM SEPTEMBER 2016 TO JANUARY 2017.
GraphQL
REST
Legacy PHP
Search
python
Kubernetes on AWS
FOUNDATIONS
SECOND STEP
TIME TO SCALE
FROM JANUARY 2017 TO JUNE 2017.
People
• from 2 to ~30 people.
• from 1 to 5 teams
Services
• from 1 to ~15 services.
• from 1 to ~10 languages / technologies
Release
• from an average of 1 deployment per day to more than 10
HUMAN FIRST
• Hired more than 30 people over a couple of months
• Organised training sessions for newcomers
• Optimised and reviewed our on-boarding process
• Optimised the way to work on an SOA stack
• Evangelised (GraphQL + Infrastructure)
FROM 2 TO ~30 PEOPLE.
• Only one dependency on the developer's laptop: docker
• Simplify the technical on-boarding process
• Simplify switching between projects across our 500+ repositories
• Use generic task names to launch code quality checks
• Let developers use the technologies they want
make style
make test
make test-unit
make test-functional
make test-integration
make complexity
make run
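A convention like this only helps if every repository actually exposes the same targets. The following is a minimal sketch, assuming each repository ships a Makefile at its root, of how the presence of the generic task names could be checked. The function name `missing_targets` and the simple rule-matching regex are illustrative assumptions, not tooling described in the talk.

```python
import re

# Generic task names taken from the slide; every repository is assumed to expose them.
REQUIRED_TARGETS = [
    "style", "test", "test-unit", "test-functional",
    "test-integration", "complexity", "run",
]

# Matches a rule line such as "test-unit:" or "style test: deps",
# but not a variable assignment such as "IMAGE := app".
_RULE = re.compile(r"^([A-Za-z0-9_.\- ]+):(?!=)")


def missing_targets(makefile_text, required=REQUIRED_TARGETS):
    """Return the required targets that the Makefile text does not define."""
    defined = set()
    for line in makefile_text.splitlines():
        if line.startswith(("\t", "#")):
            continue  # recipe lines and comments never declare a target
        match = _RULE.match(line)
        if match:
            defined.update(match.group(1).split())
    return [name for name in required if name not in defined]
```

Running such a check in CI would flag a repository that forgot, say, `make complexity` before a newcomer stumbles on it.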
FROM AWS TO GCP
• Worldwide network (subnets can be routed from one region to another)
• Ingress anycast IP, easy to set up
• A managed Kubernetes service with useful features such as node autoscaling
• Connection to Dailymotion’s private network in Paris
• Currently deployed in 3 regions across the world (~80 nodes)
FROM 1 SERVICE TO 10 SERVICES.
NEW HIGHLY SCALABLE HYBRID ARCHITECTURE
Geo-Distributed
for high performance everywhere in the world
Hybrid Infra
on Premise together with Google Cloud
Auto-scaling
adapts to the audience
Google Cloud POP
On Premise POP
CDN
GIVE ROOT ACCESS TO DEVELOPERS 😎
• Implement continuous deployment (except production, which needs human approval)
• Let developers deploy by themselves
• Delegate the deployment workflow to developers through a Jenkinsfile (Pipeline)
• Enforce common interfaces, minimum code quality and deployment guidelines built by the devops team
FROM 1 DEPLOYMENT PER DAY TO MORE THAN 10.
WE ARE LEARNING FROM OUR MISTAKES
STEP #1: First we deployed our applications sequentially, region by region, using bash scripts.
STEP #2: We wanted to manage our clusters from a single API endpoint: Federation. Some API objects were missing in the Federation, which led to mixed deployment methods: some objects in the Federation and others deployed region by region.
STEP #3 (déjà-vu): Now we deploy our applications sequentially, region by region, using Helm.
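The third step can be sketched as a loop over one kube context per region that stops at the first failure. This is only an illustration under stated assumptions: the context names in `REGION_CONTEXTS`, the release and chart arguments, and the choice of `--kube-context` to select a cluster are hypothetical, since the talk does not show the actual pipeline. The `imageTag` value key is borrowed from the command shown later in the deck.

```python
import subprocess

# Hypothetical kube contexts, one per region; the slides only say "3 regions".
REGION_CONTEXTS = ["region-a", "region-b", "region-c"]


def helm_command(release, chart, image_tag, context):
    """Build one `helm upgrade --install` invocation for a single cluster."""
    return [
        "helm", "upgrade", "--install", release, chart,
        "--kube-context", context,
        "--set", f"imageTag={image_tag}",
    ]


def deploy_sequentially(release, chart, image_tag,
                        contexts=REGION_CONTEXTS, run=subprocess.run):
    """Deploy region by region and stop at the first failure.

    Returns the list of contexts that were deployed successfully.
    """
    deployed = []
    for context in contexts:
        result = run(helm_command(release, chart, image_tag, context))
        if result.returncode != 0:
            break  # leave the remaining regions untouched
        deployed.append(context)
    return deployed
```

Passing the runner as a parameter keeps the loop testable without a cluster. Stopping at the first failing region limits the blast radius of a bad release.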
CHARTS EVERYWHERE !
• Manage dependencies between our applications.
• Deploy a complete stack with a single command.
• Help us to manage different environments/regions within a chart.
• Easy to roll back: each deployment has a unique revision id
• Ongoing : Provision a staging environment per pull request
FROM SLA 99,999% TO 99,9999999999999999999999999999999999%
• APM with Open Tracing Specification
• Monitoring / Alerting
• Logging Specification for each service
• Feature Flipping, Progressive rollout, Experimentation (A/B)
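A progressive rollout is often implemented by hashing a stable identifier into a bucket and comparing it with the rollout percentage. The sketch below shows that general technique. It is not taken from the talk, and the names `rollout_bucket` and `is_enabled`, as well as the use of SHA-256, are assumptions for illustration.

```python
import hashlib


def rollout_bucket(flag_name, user_id):
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name, user_id, rollout_percent):
    """Return True when the user falls inside the rollout percentage."""
    return rollout_bucket(flag_name, user_id) < rollout_percent
```

Because the bucket depends only on the flag and the user, a user who sees the feature at 10% keeps seeing it as the percentage grows. Including the flag name in the hash avoids exposing the same users first for every flag.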
HOW WE OPERATE OUR PLATFORM?
WE ARE NOT ROBOTS
BUILD. Software Engineer
• Write code
• Build applications which aren’t easy to operate
SHIP. Release Engineer
• Package & deploy applications
RUN. System Engineer
• Operate infrastructure & app
• Unable to fix applications by themselves
TO
BUILD / SHIP / RUN. Production Engineer
• Can build applications
• Package & deploy applications
• Operate application in production
• Build their applications with “RUN” mindset
• Build tools for software engineers
FROM SOFTWARE / SYSTEM ENGINEER TO PRODUCTION ENGINEER.
helm upgrade --install westeros --reuse-values --set imageTag=30610c5 dailymotion/westeros-gbased-raulicache
BOOM !
WHAT: Bad parameter applied on helm command
• 3 clusters emptied (~ 1 300 containers)
• All our products were unusable
AND: We were down for 19 minutes
• ~10 minutes to be notified
• ~7 minutes to understand
• ~2 minutes to recover the entire architecture from scratch
NOW: Grow up
• Wrap destructive commands
• Improve monitoring
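One way to wrap destructive commands is to validate the argument list before anything reaches the cluster. The slides do not say which parameter was wrong, so the checks below are illustrative assumptions rather than the team's actual wrapper: the `DESTRUCTIVE` set, the `confirmed` flag and the function name `check_helm_args` are all hypothetical.

```python
# Subcommands treated as destructive in this sketch (an assumption, not the speaker's list).
DESTRUCTIVE = {"delete", "uninstall", "rollback"}


def check_helm_args(args, confirmed=False):
    """Return a list of problems found in a helm argument list; empty means OK."""
    if not args or args[0] != "helm":
        return ["not a helm command"]
    problems = []
    if len(args) > 1 and args[1] in DESTRUCTIVE and not confirmed:
        problems.append(f"'{args[1]}' is destructive and needs explicit confirmation")
    for arg in args[1:]:
        # Long dashes pasted from slides or chat tools are not parsed as flags.
        if arg.startswith(("\u2014", "\u2013")):
            problems.append(f"'{arg}' starts with a typographic dash, not '--'")
    for i, arg in enumerate(args):
        if arg == "--set":
            value = args[i + 1] if i + 1 < len(args) else ""
            key, sep, val = value.partition("=")
            if not key or not sep or not val:
                problems.append(f"--set needs key=value, got '{value}'")
    return problems
```

A wrapper script would call this first and only hand the arguments to `helm` when the returned list is empty.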
INFINITE AND BEYOND
• Hybrid architecture (on premises)
• Stateful use cases: manage volume provisioning in the same way we orchestrate applications
• Performance improvements (Service mesh)
• Security: user authentication and auditing, secrets encryption.
• Open Source our GraphQL Engine (Python, performance oriented)
AND NOW ?
Thank you.
https://gazr.io

How we scale up our architecture and organization at Dailymotion
