Automation is key to letting our incident responders focus on high-leverage decisions, providing a consistent experience to our internal customers, and ensuring our team meets the needs of a rapidly growing Netflix. SIRT is investing significant engineering resources in automating Crisis Management, Incident Response, and Digital Forensics.
The following slides summarize why we decided to invest in these areas, describe the technologies we’re using to automate and orchestrate decisions that don’t matter, and showcase some of the workflows we have successfully automated.
2. Us.
Members of the Security Incident Response Team (SIRT)
Kevin Glisson
Senior Security Engineer
kglisson@netflix.com
Marc Vilanova
Senior Security Engineer
mvilanova@netflix.com
3. About Netflix.
Teams and individual contributors are given a high degree of freedom
● Ownership of the entire stack
● Central teams provide “paved roads”
A lot of everything
● Environments
● Technologies
4. Automation.
Focus on high leverage decisions
● Aggressively eliminate decisions that don’t matter
Consistency is key
● Builds confidence (for everyone)
● Breeds familiarity
6. People Resolve Incidents.
We need help; quickly
● Who do I contact? How do I contact them?
Provide known communication channels
● What is this new message? Can I ignore it? Should I pull the car over?
Set clear expectations
● Why am I here? What do you need me to do?
7. Incident Ramp.
Getting people engaged and oriented
● Similar to other product-based approaches
Leverage existing knowledge and workflows (go to where your customers are)
● In stressful situations, muscle memory is key
21. Memory Forensics Acquisition.
● Spinnaker pipeline that builds and publishes LiME modules to our Artifactory
● Triggers on every unstable foundation AMI build
25. Memory Forensics Analysis.
Technologies
● Python + Volatility Framework (as a library) = sirt-mem-analysis
○ Allows us to run a set of plugins 6x faster than via command line
Work-in-progress / Future work
● Explore Rekall as an alternative to Volatility
● Explore Titus¹ for parallelizing analysis
¹ Netflix Cloud Container Runtime Platform
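The slides do not say why library use is faster than the command line. One plausible reason, offered here only as an assumption, is that the expensive setup (profile, opened image) happens once instead of once per plugin invocation. The sketch below illustrates that pattern in plain Python. `ImageContext`, `load_image`, `run_plugins`, and the example plugins are hypothetical names; none of them is part of Volatility or sirt-mem-analysis.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ImageContext:
    """Hypothetical stand-in for expensive shared state, such as a parsed
    profile plus an opened memory image. Not a Volatility class."""
    image_path: str
    load_count: int = 0


def load_image(image_path: str) -> ImageContext:
    # The costly setup would happen here, once per image (assumption).
    ctx = ImageContext(image_path)
    ctx.load_count += 1
    return ctx


Plugin = Callable[[ImageContext], List[str]]


def run_plugins(ctx: ImageContext, plugins: Dict[str, Plugin]) -> Dict[str, List[str]]:
    """Run every plugin against the same already-loaded context."""
    results: Dict[str, List[str]] = {}
    for name, plugin in plugins.items():
        try:
            results[name] = plugin(ctx)
        except Exception as exc:
            # One failing plugin should not stop the rest of the set.
            results[name] = [f"error: {exc}"]
    return results


# Hypothetical plugins returning canned rows, for illustration only.
def list_processes(ctx: ImageContext) -> List[str]:
    return [f"{ctx.image_path}: pid 1 init"]


def broken_plugin(ctx: ImageContext) -> List[str]:
    raise ValueError("unsupported profile")


ctx = load_image("host.lime")
report = run_plugins(ctx, {"pslist": list_processes, "broken": broken_plugin})
```

Because the plugins only read from the shared context, the same loop could later be fanned out across workers, which is the direction the "parallelizing analysis" item points to.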
28. Key Takeaways.
● Delegation wins the day
○ Through communication with peers/SMEs
○ Through automation
● There is no one “solution”
○ Organizations are radically different; remove decisions and empower people.