4. About Me
● Software engineer
○ Netflix Edge Engineering
○ Sun Microsystems + Oracle Corp.
○ Resource scheduling, stream processing, distributed systems
● Author of Fenzo scheduling library
5. Let’s address a few questions
● Why Apache Mesos?
● Why focus on scheduling?
● How to guarantee capacity for various apps?
● What’s needed from the container executor?
13. A few common themes
Large variation in peak to trough resource requirements
● Mantis: 2M - 8M events/sec
● Titus: 10s - 1000s of concurrent jobs
14. A few common themes
Heterogeneous mix of jobs and resources
Resource           Task request     Agent sizes
CPU                1 - 32 CPUs      8 - 32 CPUs
Memory             2 - 200+ GB      32 - 244 GB
Network bandwidth  10 - 1024 Mbps   1024 - 10240 Mbps
Resource affinity based on task type
Task locality
15. A few common themes
Jobs needing high availability of tasks across ephemeral cloud resources
[Diagram: a job with N tasks spread across Host1 (ec2 zone=d), Host2 (ec2 zone=e), and Host3 (ec2 zone=f)]
16. What kind of scheduler do I need?
The scheduler must achieve multiple scheduling objectives:
● Cluster wide optimizations: #servers, heterogeneous mix, security
● User centric optimizations: resource affinity, task locality
[Diagram: the scheduler produces assignments balancing both sets of objectives]
17. Functions of a framework
Framework components:
● API - domain specific
● Resource scheduling - potentially common
● Persistence - environment specific
18. NetflixOSS Fenzo scheduling library
https://github.com/Netflix/Fenzo
● Heterogeneous mix of task and resource sizes
● Autoscaling of Mesos agent clusters
● Customizable scheduling objectives
20. Fenzo scheduling strategy
For each task:
    On each host:
        Validate hard constraints
        Eval fitness and soft constraints
    Until fitness is “good enough”, and a minimum #hosts evaluated
Fitness functions and constraint evaluators are plugins.
Sample plugins: bin packing fitness function, and soft/hard constraint evaluators for resource affinity and task locality
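The loop above can be sketched in Python (illustrative only; Fenzo itself is a Java library, and the names `hard_constraints`, `fitness`, `good_enough`, and `min_hosts_to_eval` here are assumptions, not its API):

```python
# Sketch of the per-task evaluation loop: hosts are filtered by hard
# constraints, scored by a pluggable fitness function, and the search
# stops early once fitness is "good enough" and enough hosts were seen.
def schedule(tasks, hosts, hard_constraints, fitness,
             good_enough=0.9, min_hosts_to_eval=5):
    assignments = {}
    for task in tasks:
        best_host, best_fit = None, -1.0
        evaluated = 0
        for host in hosts:
            # Every hard constraint must pass for the host to be a candidate.
            if not all(c(task, host) for c in hard_constraints):
                continue
            fit = fitness(task, host)  # 0.0 (worst) .. 1.0 (best)
            evaluated += 1
            if fit > best_fit:
                best_host, best_fit = host, fit
            # Early exit: fitness is good enough AND a minimum number
            # of hosts has been evaluated.
            if best_fit >= good_enough and evaluated >= min_hosts_to_eval:
                break
        if best_host is not None:
            assignments[task] = best_host
    return assignments
```

Soft constraints would fold into the `fitness` score rather than filtering hosts out, which is why they appear alongside fitness evaluation on the slide.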
21. Fenzo agent cluster autoscaling
● Scaling up is relatively easy
● Scaling down requires bin packing
○ By resource footprint, runtime, etc.
[Diagram: tasks packed onto Hosts 1-2 with Hosts 3-4 left idle, vs. tasks spread thinly across all of Hosts 1-4]
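Scale-down depends on tasks being packed onto few hosts so that idle hosts emerge and can be terminated. A bin packing fitness function in that spirit might look like the following sketch (the field names `used_cpu`/`total_cpu` are illustrative, not Fenzo's API):

```python
# Sketch of a CPU bin packing fitness function: the score is the
# fraction of the host's CPUs in use after placing the task, so
# already-busy hosts win and empty hosts drain for scale-down.
def cpu_bin_packing_fitness(task_cpu, host):
    used_after = host["used_cpu"] + task_cpu
    if used_after > host["total_cpu"]:
        return 0.0  # task does not fit: worst possible fitness
    return used_after / host["total_cpu"]

def pick_host(task_cpu, hosts):
    # Choose the host that ends up fullest (ties keep the first seen).
    return max(hosts, key=lambda h: cpu_bin_packing_fitness(task_cpu, h))
```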
24. Capacity guarantees
Guarantee capacity for timely job starts
(Mesos support for quotas, etc. is evolving)
Agreed upon: generally, optimize throughput for batch jobs and start latency for service jobs
25. Capacity guarantees
Guarantee capacity for timely job starts
(Mesos support for quotas, etc. is evolving)
Agreed upon: generally, optimize throughput for batch jobs and start latency for service jobs
But some service style jobs may be less important.
Categorize by expected behavior instead: Critical versus Flex (flexible scheduling requirements)
30. Capacity guarantees: hybrid view
Head of line blocking: what if a ‘Critical’ task isn’t satisfied? Or it isn’t ready?
[Diagram: dynamic scheduling across Critical and Flex tiers]
31. Capacity guarantees: hybrid view
Head of line blocking is addressed with automatic advance reservation for task T2.
[Diagram: on HostA’s timeline, dynamic scheduling runs T1 while capacity is reserved in advance for T2, across Critical and Flex tiers]
32. Capacity guarantees: hybrid view
Automatic advance reservation for task T2, but the capacity held for T2 sits idle until it starts: underutilization.
[Diagram: HostA’s timeline with T1 running and reserved capacity idle ahead of T2]
33. Capacity guarantees: hybrid view
Automatic advance reservation for task T2; back filling task T3 into the otherwise idle reserved capacity improves utilization.
[Diagram: HostA’s timeline with T1 running, T3 back filled, and T2’s reservation still honored]
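The reservation-plus-back-filling idea can be sketched as a simple timeline check (all names are illustrative; this is not code from Fenzo or Titus): a queued task is back filled only if its estimated runtime ends before the reservation starts.

```python
# Sketch of back filling against an advance reservation: a candidate
# task may use the idle window only if its estimated runtime completes
# before the reserved task's start time, so the reservation is honored.
def can_backfill(now, task_runtime, reservation_start):
    return now + task_runtime <= reservation_start

def backfill(now, queue, reservation_start):
    # Pick queued (name, est_runtime) tasks that fit before the reservation.
    return [name for name, runtime in queue
            if can_backfill(now, runtime, reservation_start)]
```

In practice this depends on runtime estimates; a task that overruns its estimate would still collide with the reservation, which is one reason preemption (next slides) remains necessary.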
35. Capacity guarantees: “utilization”
What if ‘Critical’ is under utilizing its guaranteed capacity?
Let Flex use it, but preempt Flex tasks when Critical needs the capacity back.
“Fairness” across Critical and Flex via composable functions
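One way to read “fairness via composable functions” is that several scoring functions are combined, e.g. by weighted average, into a single fairness score per tier; the sketch below is an assumption about the shape of such composition, not the actual implementation.

```python
# Illustrative composition of fairness functions: each function maps a
# tier's usage snapshot to a score in 0..1, and a weighted average
# combines them into one score used to decide which tier launches next.
def compose(*weighted_fns):
    total = sum(w for w, _ in weighted_fns)
    def fairness(usage):
        return sum(w * fn(usage) for w, fn in weighted_fns) / total
    return fairness
```

For example, composing “share of guarantee used” with “queue pressure” lets operators tune how aggressively Flex borrows Critical’s idle capacity without hard-coding one policy.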
38. Container executor
Augment the missing pieces for multi-tenant operation:
● IP per container
● Security: Security Groups, IAM roles
● Isolation for networking b/w, disk I/O
39. Plumbing VPC Networking into Docker
[Diagram: a Titus agent EC2 VM with three ENIs - eni0 (eth0, SG=Titus Agent), eni1 (eth1, SecGrp=X), and eni2 (eth2, SG=Y) - holding IP 1 - IP 3. Task 0 needs no IP and shares docker0 behind IPTables NAT; Tasks 1-3 each run an app in a pod root network namespace with a veth<id> pair, a VPC IP, and their security group (X or Y). Linux policy based routing steers container traffic to the right ENI, and an EC2 metadata proxy serves 169.254.169.254 to the containers.]