8. How does Netflix Streaming work?
[Diagram: Your CE Device, CDN, and Netflix Services in Amazon Cloud]
9. Device Under the Hood
[Diagram: Your CE Device (User Interface, Netflix Streaming Platform, DRM, encoding, CE integration), CDN, and Netflix Services in Amazon Cloud]
10. User Interface loaded, data retrieved from Netflix Edge Service
[Diagram: Your CE Device (User Interface, Netflix Streaming Platform, DRM, encoding, CE integration), CDN, Edge Services, and Netflix Services in Amazon Cloud]
12. User Interface Loaded
[Diagram: Your CE Device (User Interface, Netflix Streaming Platform, DRM, encoding, CE integration), CDN, Edge Services, and Netflix Services in Amazon Cloud]
15. Obtaining License
[Diagram: Your CE Device (User Interface, Netflix Streaming Platform, DRM, encoding, CE integration), CDN, Edge Services, and Netflix Services in Amazon Cloud; call shown: License]
16. Movie starts streaming
[Diagram: Your CE Device (User Interface, Netflix Streaming Platform, DRM, encoding, CE integration), CDN, Edge Services, and Netflix Services in Amazon Cloud; call shown: PlayData]
18. Periodic "bookmark" calls note place in movie
[Diagram: Your CE Device (User Interface, Netflix Streaming Platform, DRM, encoding, CE integration), CDN, Edge Services, and Netflix Services in Amazon Cloud; call shown: bookmark]
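The call sequence walked through on the preceding slides can be sketched as a small simulation. This is only an illustration: the call names (`ui_data`, `license`, `play_data`, `bookmark`), the `EdgeCallLog` class, and the 60-second bookmark interval are hypothetical, and it assumes all of these calls go to the edge tier while the video bytes come from the CDN.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeCallLog:
    """Hypothetical stand-in for an edge client: records calls in order."""
    calls: list = field(default_factory=list)

    def call(self, name, **params):
        self.calls.append((name, params))


def play_title(edge, title_id, duration_s, bookmark_interval_s=60):
    """Replay the slide sequence for one playback session."""
    edge.call("ui_data")                       # UI loaded, data retrieved from the edge
    edge.call("license", title_id=title_id)    # obtain a license before playback
    edge.call("play_data", title_id=title_id)  # play data; streaming itself is assumed to use the CDN
    # While the movie plays, periodically note the current position.
    position = bookmark_interval_s
    while position < duration_s:
        edge.call("bookmark", title_id=title_id, position_s=position)
        position += bookmark_interval_s
    # Final bookmark at the end of playback.
    edge.call("bookmark", title_id=title_id, position_s=duration_s)
```

For a 150-second title this records three bookmark calls, at 60, 120, and 150 seconds, after the three setup calls.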
19. Edge Services - What we are talking about today
[Diagram: Your CE Device (User Interface, Netflix Streaming Platform, DRM, encoding, CE integration), CDN, Edge Services, and Netflix Services in Amazon Cloud; call shown: bookmark]
20. Edge's lofty mission
• High availability
• Good performance
• Data broker between many services and devices in a global, high-volume, rapidly innovating, highly dynamic service
• Clients and services are constantly changing
21. Edge stats
• Billions of incoming requests per day
• Over 10x as many outgoing service calls as incoming requests
• About 10 device changes per day
• Daily service pushes
• Daily routing changes
26. What is Zuul?
• Open source framework for dynamically reading, writing, and executing filters that act on incoming HTTP requests
• Dynamically compiled filters written in Groovy
  - Any JVM language supported
• Filters share state through a request-scoped context
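The filter idea above can be illustrated with a minimal, language-neutral sketch. It is written in Python rather than Groovy and does not use Zuul's API: the `Filter` base class, the two example filters, the header names, and the route names are all hypothetical. It shows only the core pattern of ordered filters that communicate through a shared per-request context.

```python
class Filter:
    """Hypothetical base class: filters run in ascending `order`."""
    order = 0

    def should_filter(self, ctx):
        return True

    def run(self, ctx):
        raise NotImplementedError


class AuthFilter(Filter):
    order = 1

    def run(self, ctx):
        # Writes its result into the shared context for later filters.
        ctx["authenticated"] = ctx["headers"].get("Authorization") == "secret-token"


class RoutingFilter(Filter):
    order = 2

    def should_filter(self, ctx):
        # Reads state that AuthFilter left in the context.
        return ctx.get("authenticated", False)

    def run(self, ctx):
        is_canary = ctx["headers"].get("X-Canary") == "1"
        ctx["route"] = "api-canary" if is_canary else "api-prod"


def run_filters(filters, headers):
    """Create a fresh request-scoped context and run each applicable filter."""
    ctx = {"headers": headers}
    for f in sorted(filters, key=lambda f: f.order):
        if f.should_filter(ctx):
            f.run(ctx)
    return ctx
```

Because the context is created per request, filters stay stateless and can be added, removed, or reordered without touching each other.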
27. How we use Zuul
• Authentication
• Insights
• Stress testing
• Canary testing
• Dynamic routing
• Service migration
• Load shedding
• Security
• Static response handling
• Active/active traffic management
59. Edge Scripting Tier
• Device teams write scripts for their device
  - Control content, format, endpoints
• Code injected directly into Edge Service at runtime
• Scripts are in production in about 30 seconds
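Runtime injection of device scripts can be sketched as a registry that compiles source text on deploy and swaps the handler in place. This is an assumption-laden illustration, not the Netflix mechanism: `ScriptRegistry`, the `handle(data)` entry point, and the device name are hypothetical, and a real system would need sandboxing and validation before executing submitted code.

```python
class ScriptRegistry:
    """Hypothetical registry mapping a device type to its current script."""

    def __init__(self):
        self._handlers = {}

    def deploy(self, device, source):
        # Compile and load the script without restarting the process.
        namespace = {}
        exec(compile(source, f"<script:{device}>", "exec"), namespace)
        # Each script is assumed to define a `handle(data)` function.
        self._handlers[device] = namespace["handle"]

    def handle(self, device, data):
        # Requests always see the most recently deployed script.
        return self._handlers[device](data)
```

Deploying a new script for the same device replaces the old handler, so the next request picks up the new content, format, or endpoint logic immediately.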
67. Purpose of the Service Layer
• Interface to business logic (our API)
• Shield data consumers from service changes
• Combine and expose business data in a logical and consistent manner
• All Service Layer methods are async using RxJava
  - Hides concurrency and underlying implementation
70. RxJava
• Why?
  - How do you expose an async service as an API?
• Solution to compose async flows and sequences of data
• Rich set of operators to filter and interact with data
71. How RxJava Helps
• Need to hide concurrency from script writers
  - Minimize the "bad things" that on-box consumers of our API can do
• Hide the internal implementation
  - Change concurrency of any given call
  - Switch to non-blocking IO
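The point about hiding concurrency behind an async API can be illustrated with Python's `asyncio` as a stand-in for RxJava. All function names and the returned data are hypothetical. One service method wraps a blocking call in a worker thread and the other is natively non-blocking, yet the caller composes them identically, which is the property that lets the implementation change later.

```python
import asyncio
import time


def _blocking_rating_lookup(video_id):
    # Pretend this is a legacy blocking client.
    time.sleep(0.01)
    return {"video_id": video_id, "rating": 4.5}


async def get_rating(video_id):
    # Implementation detail: runs the blocking call on a worker thread.
    return await asyncio.to_thread(_blocking_rating_lookup, video_id)


async def get_title(video_id):
    # Implementation detail: natively non-blocking.
    await asyncio.sleep(0.01)
    return {"video_id": video_id, "title": f"Title {video_id}"}


async def get_video(video_id):
    # The caller composes both calls the same way and runs them concurrently.
    title, rating = await asyncio.gather(get_title(video_id), get_rating(video_id))
    return {**title, **rating}


def fetch_video(video_id):
    """Synchronous entry point for the sketch."""
    return asyncio.run(get_video(video_id))
```

Swapping `get_rating` to a non-blocking client would change nothing in `get_video`, since callers depend only on the awaitable result.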
81. How Hystrix helps
• Latency and fault tolerance
  - Stop cascading failures. Fallbacks and graceful degradation. Fail fast and rapid recovery.
  - Thread and semaphore isolation with circuit breakers.
• Realtime operations
  - Realtime monitoring and configuration changes. Watch service and property changes take effect immediately as they spread across a fleet.
  - Be alerted, make decisions, effect change, and see results in seconds.
• Concurrency
  - Parallel execution. Concurrency-aware request caching. Automated batching through request collapsing.
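The fail-fast and fallback behavior listed above can be sketched with a minimal circuit breaker. This is not Hystrix's API or implementation: the `CircuitBreaker` class, its thresholds, and the injectable clock are assumptions chosen for illustration, and the sketch leaves out thread isolation, request caching, and request collapsing.

```python
import time


class CircuitBreaker:
    """Hypothetical breaker: opens after consecutive failures, retries after a cooldown."""

    def __init__(self, failure_threshold=3, reset_after_s=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def is_open(self):
        if self.opened_at is None:
            return False
        # After the cooldown, let one trial call through (half-open).
        return self.clock() - self.opened_at < self.reset_after_s

    def call(self, command, fallback):
        if self.is_open():
            return fallback()  # fail fast without touching the dependency
        try:
            result = command()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()  # graceful degradation
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers get the fallback immediately, so a failing dependency stops consuming time and threads upstream, which is how cascading failures are contained.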
100. Scryer - Predictive auto-scaling
• Why?
  - Reactive doesn't work in all cases
  - Reacting is sometimes too late
    - Sunday morning cartoons
  - Reactive overreacts
    - Super Bowl, World Cup, outages
  - Fixed size scaling
• All in all: more reliable and saves money
108. Other Scryer Factors
• Traffic volume analysis
  - At least 4 weeks of data
  - Linear regression based on time of day
  - Correct the prediction based on today's trend
• Instance factors
  - Instance startup time
  - Instance capacity (obtained by squeeze testing)
• Scale (up/down) actions scheduled based on prediction
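One possible reading of these factors is sketched below. It is not Scryer's algorithm. It assumes one traffic sample per week for the same time-of-day slot, fits a least-squares line through them, scales the forecast by a ratio of today's observed traffic to today's earlier forecast, converts the result to an instance count, and schedules the scaling action early enough to cover startup time. All function names and numbers are hypothetical.

```python
import math


def fit_line(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; returns (slope, intercept)."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_rps(weekly_history, today_ratio=1.0):
    """Extrapolate one week ahead, then correct by today's observed/forecast ratio."""
    slope, intercept = fit_line(weekly_history)
    baseline = slope * len(weekly_history) + intercept
    return baseline * today_ratio


def instances_needed(rps, rps_per_instance):
    """Capacity per instance would come from squeeze testing."""
    return math.ceil(rps / rps_per_instance)


def schedule_minute(target_minute, startup_minutes):
    """Start scaling early so instances are ready when the traffic arrives."""
    return target_minute - startup_minutes
```

For example, four weekly samples of 1000, 1100, 1200, and 1300 requests per second extrapolate to about 1400; with today running 10% above forecast and 200 requests per second per instance, the sketch asks for 8 instances.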
124. Building for Failure
• Code failure - Continuous delivery
• Service failure - Fallbacks and redundancy
• Instance and zone failure - Redundancy
• Cloud infrastructure failure - Multiple active regions
• Human failure - Automation
125. Drawbacks of the cloud
• The cause of some failures is difficult to detect
• Huge variability in instance performance that is almost impossible to explain
• Network barriers
  - Multi-tenancy
  - Firewalls
• Very limited access to information and limited ability to fix issues
127. Netflix Culture - Our secret sauce
• Freedom and responsibility
• Highly aligned teams
• Aversion to process
• Design for necessity
• Design for failure
• Engineering teams operating their services
128. Netflix OSS
• Zuul - Smart edge router
• RxJava - Functional reactive libraries
• Hystrix - SOA resiliency
• + a lot more!
129. For more info on Netflix Cloud Technology:
Read our Technology Blog : http://techblog.netflix.com/
Check out our Open Source Cloud Projects : http://netflix.github.io