Winston
Diagnostic and Remediation Engineering (DaRE)
Vinay Shah & Jean-Sebastien Jeannotte
● Introduction
● Internals - How does it work?
● Demo - See it in action!
● Learnings and challenges
● Metrics & Road ahead
● Additional resources
Topics
Introduction
Landscape
Operational load vs. new features
Scale and Growth
Availability
[Diagram labels: Application or Service; Monitoring; Alerting; PagerDuty; Email; Winston]
● Reduce MTTR
● Reduce risk of human errors
● Reduce pager fatigue, provide tier 1 support
● Don’t worry about infrastructure, focus on your business logic
● Best practice for runbook lifecycle management
Business goals
Winston is an event-driven runbook automation platform. It is designed to host and execute runbooks in response to operational events.
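The event-driven model described above can be sketched roughly as follows. This is a minimal illustration in Python and not Winston's actual implementation: `RunbookRegistry`, `register`, `dispatch`, the `disk_full` event type, and the host name are all hypothetical names chosen for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# An event is modelled as a plain dict; a runbook is any callable that
# takes an event and returns a short result string. Both are assumptions.
Event = Dict[str, str]
Runbook = Callable[[Event], str]


class RunbookRegistry:
    """Hypothetical in-memory registry mapping event types to runbooks."""

    def __init__(self) -> None:
        self._runbooks: Dict[str, List[Runbook]] = defaultdict(list)

    def register(self, event_type: str, runbook: Runbook) -> None:
        """Host a runbook so that it runs whenever event_type occurs."""
        self._runbooks[event_type].append(runbook)

    def dispatch(self, event: Event) -> List[str]:
        """Execute every runbook registered for the event's type."""
        return [runbook(event) for runbook in self._runbooks.get(event["type"], [])]


def describe_disk_alert(event: Event) -> str:
    # A real runbook would diagnose or remediate; this one only reports.
    return f"would inspect disk usage on {event['host']}"


registry = RunbookRegistry()
registry.register("disk_full", describe_disk_alert)

results = registry.dispatch({"type": "disk_full", "host": "app-01"})
unmatched = registry.dispatch({"type": "unknown", "host": "app-01"})
```

Events with no registered runbook simply produce an empty result list, so an alert can still follow its usual paging or email route.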
Internals
How is it deployed?
Execution Flow
● One stop portal for all things Winston
● Supports Create, Read, Update, Delete, Execute and Diagnose functionality
● Implements best practices
○ Compliance/Auditing
○ Persistence
○ Security (Authentication/Authorization)
● Self serve & scalable
Winston Studio
● Pack
A group of related automations, typically organized around a discrete service or product
● Action
Set of steps to help with diagnostics or remediations written as code
● Event & event source
External services that are the source of events that trigger a runbook
Terminology
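One way to picture how these three terms relate is the small data model below. It is a sketch under assumptions, not Winston's or StackStorm's real schema: the `Event`, `Action`, and `Pack` classes, the `rules` mapping, and the `alerting` / `broker_down` names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Event:
    source: str  # the external service that emitted the event
    name: str    # what happened, e.g. "broker_down"
    payload: Dict[str, str] = field(default_factory=dict)


@dataclass
class Action:
    name: str
    run: Callable[[Event], str]  # diagnostic or remediation steps, written as code


@dataclass
class Pack:
    """A group of related actions for one service or product."""

    name: str
    actions: Dict[str, Action] = field(default_factory=dict)
    # Maps (event source, event name) to the names of the actions to run.
    rules: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)

    def add_action(self, action: Action, source: str, event_name: str) -> None:
        self.actions[action.name] = action
        self.rules.setdefault((source, event_name), []).append(action.name)

    def handle(self, event: Event) -> List[str]:
        names = self.rules.get((event.source, event.name), [])
        return [self.actions[name].run(event) for name in names]


kafka_pack = Pack(name="kafka")
kafka_pack.add_action(
    Action("describe_restart", lambda e: f"would restart Kafka on {e.payload['host']}"),
    source="alerting",
    event_name="broker_down",
)
outcome = kafka_pack.handle(Event("alerting", "broker_down", {"host": "kafka-07"}))
```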
Demo
Winston Studio
DEMO
● False positives
○ Cassandra ring health
● Diagnostics - correlation could point towards causation, e.g.:
○ Querying Chronos events
○ Querying dependencies upstream and downstream for anomalous behaviour
● Remediation
○ Clean up disk space
○ Restart Kafka process
Sample use cases
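As an illustration of the "clean up disk space" remediation, a runbook action might look something like the sketch below. The deck does not show the real action, so the function names, the age-based policy, and the dry-run default are assumptions made for this example.

```python
import os
import time
from pathlib import Path
from typing import List, Optional


def find_stale_files(
    root: Path, max_age_seconds: float, now: Optional[float] = None
) -> List[Path]:
    """Return regular files under root last modified more than max_age_seconds ago."""
    current = time.time() if now is None else now
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = Path(dirpath) / filename
            if path.is_file() and current - path.stat().st_mtime > max_age_seconds:
                stale.append(path)
    return sorted(stale)


def clean_up(root: Path, max_age_seconds: float, dry_run: bool = True) -> List[Path]:
    """Delete stale files, or only report them when dry_run is True.

    Defaulting to a dry run is an assumption of this sketch: the action
    reports what it would delete unless the caller explicitly opts in.
    """
    stale = find_stale_files(root, max_age_seconds)
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

Calling `clean_up(Path("/var/log/myapp"), 7 * 24 * 3600)` (a hypothetical path) would list files older than a week without touching them; passing `dry_run=False` performs the deletion.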
Learnings & challenges
Common patterns
● Usage
○ Culture of automating the manual and repeatable
○ Noisy signals become more interesting
○ The less control, the more opportunity
● Product
○ Safety is crucial
○ Usability is important
○ Resiliency
Insights
● Don’t reinvent the wheel
● Start simple and iterate
● Allow experimentation
● Pay special attention to the usability of your product
● Push for changing the culture - usage will follow
● Talk to us/others who have gone through some of the pains and learnings
Recommendations to get started
Metrics and road ahead
● Adoption. Adoption. Adoption.
● Usability
○ Polyglot support (Groovy based actions)
○ Deeper Integrations
● Safety
○ Resource isolation (Containers)
○ Rate limiting
The road ahead
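The slides list rate limiting as planned safety work without saying how it would be built. One common approach is a sliding-window limiter that caps how often an action may run; the sketch below assumes that approach, and `ExecutionRateLimiter` and its parameters are hypothetical names.

```python
import time
from collections import deque
from typing import Callable, Deque


class ExecutionRateLimiter:
    """Allow at most max_runs executions within any window_seconds span."""

    def __init__(
        self,
        max_runs: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_runs = max_runs
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._timestamps: Deque[float] = deque()

    def allow(self) -> bool:
        """Record and permit a run if the window has capacity, else refuse it."""
        now = self._clock()
        # Drop executions that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self._window:
            self._timestamps.popleft()
        if len(self._timestamps) < self._max_runs:
            self._timestamps.append(now)
            return True
        return False
```

A remediation such as a process restart could call `allow()` before acting and escalate to a human when it returns `False`, so that a flapping alert does not trigger an unbounded series of restarts.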
● Introducing Winston: http://techblog.netflix.com/2016/08/introducing-winston-event-driven.html
● Stackstorm: https://docs.stackstorm.com/
● Reach out: vshah@netflix.com or jjeannotte@netflix.com
We are hiring
Senior Software Engineer - https://jobs.netflix.com/jobs/860752
Links & resources
Thank you.

Winston - Netflix's event driven auto remediation and diagnostics tool
