An introduction to the Netflix Open Source Software (NetflixOSS) project that explains why Netflix is doing this, how all the parts fit together, and what is planned next. Presented at the inaugural NetflixOSS Meetup on February 6th, 2013, at Netflix headquarters in Los Gatos.
9. Three Questions
Why is Netflix doing this?
How does it all fit together?
What is coming next?
10. Netflix Deconstructed
Content as a Service on a Platform
• Content – Exclusive and Extensive
• Service – Easy to use, Personalized
• Platform – Agile, Reliable, Scalable, Secure, Low cost, Global
• Content and Service – long term strategic barriers to competition
• Platform – enables the business, but doesn’t differentiate Netflix against large competitors
11. Platform Investment
• Cloud platform adapts as new AWS services appear
• By leveraging AWS we can concentrate investment in Netflix unique services
• Layers, top to bottom:
– Netflix
– Cloud Platform (Light Lifting)
– Amazon Web Services (Undifferentiated Heavy Lifting)
12. Platform Evolution
• 2009-2010: Bleeding Edge Innovation
• 2011-2012: Common Pattern
• 2013-2014: Shared Pattern
Netflix ended up several years ahead of the industry, but it’s not a sustainable position
13. Grasping an Opportunity
From one of many diverse platforms to a common platform
If Netflix makes the NetflixOSS platform easier to adopt… NetflixOSS could become a significant platform ecosystem
14. Making it easy to follow
Exploring the wild west each time vs. laying down a shared route
15. Goals
• Establish our solutions as Best Practices / Standards
• Hire, Retain and Engage Top Engineers
• Build up Netflix Technology Brand
• Benefit from a shared ecosystem
18. Open Source Projects
Legend: Github / Techblog, Apache Contributions, Techblog Post, Coming Soon
• Priam – Cassandra as a Service
• Exhibitor – Zookeeper as a Service
• Servo and Autoscaling Scripts
• Astyanax – Cassandra client for Java
• Curator – Zookeeper Patterns
• Genie – Hadoop PaaS
• CassJMeter – Cassandra test suite
• EVCache – Memcached as a Service
• Hystrix – Robust service pattern
• Cassandra – Multi-region EC2 datastore support
• Eureka / Discovery – Service Directory
• RxJava – Reactive Patterns
• Aegisthus – Hadoop ETL for Cassandra
• Archaius – Dynamic Properties Service
• Asgard – AutoScaleGroup based AWS console
• Edda – Config state with history
• Chaos Monkey – Robustness verification
• Explorers
• Governator – Library lifecycle and dependency injection
• Denominator (Announce today)
• Latency Monkey
• Odin – Orchestration for Asgard
• Ribbon – REST Client + mid-tier LB
• Janitor Monkey
• Karyon – Instrumented REST Base Server
• Blitz4j – Async logging
• Bakeries and AMI
19. NetflixOSS Continuous Build and Deployment
• Github – NetflixOSS Source
• Maven Central
• AWS Base AMI
• Jenkins
• Dynaslave – AWS Build Slaves
• Bakery
• AWS Baked AMIs
• Odin – Orchestration API
• Asgard (+ Frigga) – Console
• AWS Account
20. NetflixOSS Services Scope
AWS Account
• Asgard Console
• Archaius Config Service
• Cross region Priam C*
• Explorers Dashboards
• Atlas Monitoring
• Genie Hadoop Services
Multiple AWS Regions
• Eureka Registry
• Exhibitor ZK
• Edda History
• Simian Army
3 AWS Zones
• Application Clusters, Autoscale Groups, Instances
• Priam Cassandra – Persistent Storage
• Evcache Memcached – Ephemeral Storage
21. NetflixOSS Instance Libraries
Initialization
• Baked AMI – Tomcat, Apache, your code
• Governator – Guice based dependency injection
• Archaius – dynamic configuration properties client
• Eureka – service registration client
Service Requests
• Karyon – Base Server for inbound requests
• RxJava – Reactive pattern
• Hystrix/Turbine – dependencies and real-time status
• Ribbon – REST Client for outbound calls
Data Access
• Astyanax – Cassandra client and pattern library
• Evcache – Zone aware Memcached client
• Curator – Zookeeper patterns
Logging
• Blitz4j – non-blocking logging
• Servo – metrics export for autoscaling
• Atlas – high volume instrumentation
22. NetflixOSS Testing and Automation
Test Tools
• CassJmeter – load testing for C*
• Circus Monkey – test rebalancing
Maintenance
• Janitor Monkey
• Efficiency Monkey
• Doctor Monkey
• Howler Monkey
Availability
• Chaos Monkey – Instances
• Chaos Gorilla – Availability Zones
• Chaos Kong – Regions
• Latency Monkey – latency and error injection
Security
• Security Monkey
• Conformity Monkey
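The "latency and error injection" idea listed for Latency Monkey can be illustrated with a small wrapper around a service call. This is only a sketch of the general technique, not Netflix's implementation: the names `with_fault_injection` and `InjectedFailure`, and all of the parameters, are hypothetical and chosen for this example.

```python
import random
import time


class InjectedFailure(RuntimeError):
    """Raised in place of a real response when an error is injected."""


def with_fault_injection(call, *, delay_s, delay_rate, error_rate,
                         rng=None, sleep=time.sleep):
    """Wrap `call` so a fraction of invocations fail or are delayed.

    All names and parameters are assumptions made for this sketch.
    error_rate and delay_rate are probabilities between 0.0 and 1.0.
    """
    rng = rng or random.Random()

    def wrapped(*args, **kwargs):
        # Decide first whether this request fails outright.
        if rng.random() < error_rate:
            raise InjectedFailure("injected error")
        # Otherwise decide whether to slow the request down.
        if rng.random() < delay_rate:
            sleep(delay_s)
        return call(*args, **kwargs)

    return wrapped
```

Wrapping a client call this way lets a team check that callers time out, retry, or fall back sensibly before a real dependency misbehaves. Passing a seeded `random.Random` and a fake `sleep` keeps tests deterministic and fast.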
23. What’s Coming Next?
• Better portability
• Higher availability
• Easier to deploy
• Contributions from end users
• Contributions from vendors
More Features, More Use Cases
24. SPoF - Single Point of Failure
Failure to be Scalable?
Failure to be Portable?
Failure to be Functional?
25. Optimizing for SPoF
Scalable, Portable or Functional
• Eventually Functional – Cloud platforms, Eucalyptus, Cloudstack, OpenStack etc. Playing feature catch-up with AWS, “Next version has X…”
• Eventually Portable – NetflixOSS. Already scalable and functional, “Next version is more portable…”
• Eventually Scalable – CloudFoundry, MongoDB etc. “Next version scales…”
27. Managing Multi-Region Availability
• DNS providers: AWS Route53, DynECT DNS, UltraDNS
• Two regions, each with Regional Load Balancers in front of Zone A, Zone B and Zone C
• Cassandra Replicas in every zone of both regions
What we need is a portable way to manage multiple DNS providers….
28. Denominator
“The next version is more portable…” for DNS
• Use Cases – Edda, Multi-Region Failover
• Common Model – Denominator
• DNS Vendor Plug-in – AWS Route53, DynECT, UltraDNS, Etc…
• API Models (varied and mostly broken):
– AWS Route53: IAM Key Auth, REST
– DynECT: User/pwd, REST
– UltraDNS: User/pwd, SOAP
Currently being built by Adrian Cole (the jClouds guy, he works for Netflix now…)
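The layering on this slide (use cases on top of a common model, with one plug-in per DNS vendor underneath) can be sketched as follows. This is a hypothetical illustration, not Denominator's actual API: `Record`, `DnsProvider`, `InMemoryProvider`, `sync_zone` and `fail_over` are names invented here, and the in-memory class stands in for real plug-ins that would call Route53, DynECT or UltraDNS.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Set


@dataclass(frozen=True)
class Record:
    """Provider-neutral DNS record: the hypothetical common model."""
    name: str
    rtype: str
    ttl: int
    value: str


class DnsProvider(Protocol):
    """Operations each vendor plug-in would implement (assumed set)."""
    def list_records(self, zone: str) -> Set[Record]: ...
    def add_record(self, zone: str, record: Record) -> None: ...
    def remove_record(self, zone: str, record: Record) -> None: ...


class InMemoryProvider:
    """Stand-in plug-in; a real one would wrap a vendor REST or SOAP API."""

    def __init__(self, label: str) -> None:
        self.label = label
        self._zones: Dict[str, Set[Record]] = {}

    def list_records(self, zone: str) -> Set[Record]:
        return set(self._zones.get(zone, set()))

    def add_record(self, zone: str, record: Record) -> None:
        self._zones.setdefault(zone, set()).add(record)

    def remove_record(self, zone: str, record: Record) -> None:
        self._zones.get(zone, set()).discard(record)


def sync_zone(providers: List[DnsProvider], zone: str,
              desired: Set[Record]) -> None:
    """Make every provider hold exactly the desired records for a zone."""
    for provider in providers:
        current = provider.list_records(zone)
        for record in current - desired:
            provider.remove_record(zone, record)
        for record in desired - current:
            provider.add_record(zone, record)


def fail_over(providers: List[DnsProvider], zone: str, name: str,
              target: str, ttl: int = 60) -> None:
    """Repoint `name` at `target` (e.g. another region) on all providers."""
    for provider in providers:
        keep = {r for r in provider.list_records(zone) if r.name != name}
        sync_zone([provider], zone, keep | {Record(name, "CNAME", ttl, target)})
```

With this shape, a multi-region failover use case is written once against `DnsProvider`, and the vendor differences in authentication and transport stay inside each plug-in.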
29. Announcements
• Denominator
– New! Discuss it with Adrian Cole here today
– Techblog and github coming soon
• Next NetflixOSS Meetup
– Five weeks from today, March 13th, at Netflix
– More & bigger announcements, don’t miss it…
30. Functionality and scale now, portability coming
Moving from parts to a platform in 2013
Netflix is fostering an ecosystem
Rapid Evolution - Low MTBIAMSH
(Mean Time Between Idea And Making Stuff Happen)
31. Next - Lightning Talks
Please save your questions for the in-person sessions that follow
Editor's Notes
The genre box shots were chosen because we have rights to use them; we are starting to make specific logos for each project going forward.
Content, delivered by a service running on a platform. However, our much larger competitors also have the same platform advantages.
Over time, AWS adds more features, and our platform layer shrinks and adapts to leverage the new features so that we can concentrate investment in long term Netflix differentiators, and leverage the AWS investment to stay ahead of Amazon retail.
When Netflix first moved to the cloud it was bleeding edge innovation; we figured stuff out and made stuff up from first principles. Over the last two years more large companies have moved to the cloud, and the principles, practices and patterns have become better understood and adopted. At this point there is intense interest in how Netflix runs in the cloud, and several forward-looking organizations are adopting our architectures and starting to use some of the code we have shared. Over the coming years, we want to make it easier for people to share the patterns we use.
If we share the Netflix platform and can build a large ecosystem around it, we get the benefits of larger scale more quickly: we can hire developers who are already familiar with the platform, and we share development of the platform itself with other organizations.
The railroad made it possible for California to be developed quickly. By creating an easy-to-follow path we can create a much bigger ecosystem around the Netflix platform.
We have shared parts of our platform bit by bit through the year, and it is starting to get traction now.