Apache Hadoop YARN – Multi-Tenancy, Capacity Scheduler & Preemption - StampedeCon 2015
1. Page 1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Multi-Tenancy & The Capacity Scheduler
Apache YARN
Joseph Niemiec
Senior Solutions Architect
jniemiec@hortonworks.com
2. Page 2 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Quick Bio
• Hadoop User for ~4 years
• Co-Author for Apache Hadoop YARN
• Originally used Hadoop for location based services
• Destination Prediction
• Traffic Analysis
• Effects of weather at client locations on call center call types
• Pending Patent in Automotive/Telematics domain
• Defensive Paper on M2M Validation
• Started on analytics to be better at an MMORPG
• HWX SME for YARN, Tez & MapReduce
3. Page 3 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Agenda
Multi-Tenancy, A Goal, A Definition
YARN Primer
Capacity Scheduler Basics
Workload Management
• Queue Mapping
• Node Labels
• Fair Sharing & Preemption
• Chargeback
Resource Control
• Memory
• CPU & CGroups
• Future Resources
Quick Preemption Demo – If we have time
4. Page 4 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Multi-Tenancy
A Goal, A Definition
5. Page 5 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Business Objectives of Multi-Tenancy
• Elimination of data silos
• Collate and share data across LoBs
• Lower cluster TCO through:
• Blending workloads
• Higher cluster utilization
• Economies of scale
• Enable applications to:
• Exploit 3rd party data sources
• Share LoB data with
– External customers
– Other LoBs
– Supply chain partners
Fall 2013: largely silo'd deployments with single-workload clusters
Spring 2015: 65% of clusters host multiple workloads
6. Page 6 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
YARN
Yet Another Resource Negotiator
7. Page 7 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Transition from Hadoop 1 to Hadoop 2
HADOOP 1.0: Single Use System (Batch Apps)
HDFS (redundant, reliable storage) + MapReduce (cluster resource management & data processing)
HADOOP 2.0: Multi Purpose Platform (Batch, Interactive, Online, Streaming)
HDFS2 (redundant, reliable storage) + YARN (cluster resource management) + MapReduce and others (data processing)
YARN-1
8. Page 8 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Concepts
Application
Application is a job submitted to the framework
Example – a MapReduce job
Container
Basic unit of allocation
Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.)
container_0 = 2GB, 1CPU
container_1 = 1GB, 6 CPU
Replaces the fixed map/reduce slots
9. Page 9 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
YARN: What Is It Good For?
Compute for Data Processing
Metal detectors for your haystacks
Compute for Embarrassingly Parallel Problems
Problems with tiny datasets and/or subtasks that don't depend on one another
e.g. exhaustive search, trade simulations, climate models, genetic algorithms
Beyond MapReduce
Enables multi-workload compute applications on a single shared infrastructure
Stream processing, NoSQL, search, in-memory, graphs, etc.
Anything you can start from the CLI!
10. Page 10 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
YARN Architecture - Walkthrough
[Diagram: a ResourceManager (containing the Scheduler) coordinates many NodeManagers. Client2 submits an application; ApplicationMasters AM1 and AM2 run in containers on NodeManagers and manage their own containers (1.1–1.3 and 2.1–2.4) spread across the cluster]
11. Page 11 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Capacity Scheduler
The Basics
12. Page 12 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
YARN Capacity Scheduler
Capacity Sharing (FUNCTION)
• Elasticity over queues
• Job submission Access Control Lists
Capacity Enforcement (FUNCTION)
• Max capacity per queue
• User limits within queue
Administration (FUNCTION)
• Management/admin Access Control Lists
• capacity-scheduler.xml
13. Page 13 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Hierarchical Queues
ResourceManager (Scheduler)
[Diagram: a hierarchical queue tree under root with queues Adhoc 10%, DW 60% and Mrkting 30%, and nested child queues Dev 10%, Reserved 20%, Prod 70%; Prod 80%, Dev 20%; and P0 70%, P1 30%]
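As a rough sketch (queue names and percentages are illustrative, not taken from this deck), a hierarchy like the one above is declared in capacity-scheduler.xml by listing each parent's child queues and giving every queue a capacity; sibling capacities under one parent must add up to 100:

yarn.scheduler.capacity.root.queues = adhoc,dw,mrkting
yarn.scheduler.capacity.root.adhoc.capacity = 10
yarn.scheduler.capacity.root.dw.capacity = 60
yarn.scheduler.capacity.root.mrkting.capacity = 30
yarn.scheduler.capacity.root.dw.queues = prod,reserved,dev
yarn.scheduler.capacity.root.dw.prod.capacity = 70
yarn.scheduler.capacity.root.dw.reserved.capacity = 20
yarn.scheduler.capacity.root.dw.dev.capacity = 10

(Each name = value pair above is a <property> entry in the actual XML file.)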
14. Page 14 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Multi-Tenancy with Capacity Scheduler
Queues
Economics as queue-capacity
§ Hierarchical Queues
SLAs
§ Preemption
Resource Isolation
§ Cgroups
Administration
§ Queue ACLs
§ Run-time re-configuration for queues
§ Charge-back
[Diagram: the Capacity Scheduler's hierarchical queues under the ResourceManager, as on the previous slide (root with Adhoc 10%, DW 70%, Mrkting 20% and their child queues)]
15. Page 15 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Set Limits on Capacity
Minimum Capacity
Guaranteed minimum resources for the queue; used to enforce SLAs across Business Units
Maximum Capacity
Hard limit on the maximum % of cluster resources; the queue gets elasticity above its minimum when capacity is not being used by other queues
Minimum User Limit
Enforces sharing amongst users in a Business Unit; controls user sharing for a given queue
User Limit Factor
Maximum share of the queue's capacity that one user can take up
Application Limit
Maximum # of applications submitted to one queue
16. Page 16 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
CS: Example Queue Configuration
Finance: 10 users | Ad-hoc BI Query jobs only | High Priority, Stricter SLAs
Data Warehouse: 2 users | Batch ETL and Report Generation jobs | Production SLAs
Marketing: 4 users | Ad-hoc Data Science (Pig+Mahout) | Loose SLAs
yarn.scheduler.capacity.root.finance
Capacity: Min 0.30 | Max 0.40 | User Limit 0.20 | ACL: 'Finance' group
yarn.scheduler.capacity.root.datawarehouse
Capacity: Min 0.50 | Max 0.60 | User Limit 1.0 | ACL: 'DataWarehouse' group
yarn.scheduler.capacity.root.marketing
Capacity: Min 0.20 | Max 0.20 | User Limit 1.0 | ACL: 'Marketing' group
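For reference, a minimal sketch of what the Finance queue above could look like in capacity-scheduler.xml (name = value shorthand for the real <property> entries; values are illustrative and assume the three queues are direct children of root):

yarn.scheduler.capacity.root.queues = finance,datawarehouse,marketing
yarn.scheduler.capacity.root.finance.capacity = 30
yarn.scheduler.capacity.root.finance.maximum-capacity = 40
yarn.scheduler.capacity.root.finance.minimum-user-limit-percent = 20
yarn.scheduler.capacity.root.finance.user-limit-factor = 1
yarn.scheduler.capacity.root.finance.maximum-applications = 1000
yarn.scheduler.capacity.root.finance.acl_submit_applications =  Finance

The ACL value uses the 'users groups' format, so a leading space followed by a group name grants submit rights to that group only. Queue changes can be applied at run time with yarn rmadmin -refreshQueues.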
17. Page 17 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Workload Management
Queue Mapping, Labels, Fair Sharing, Preemption, Chargeback
18. Page 18 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Default User/Group to Queue Mapping
yarn.scheduler.capacity.root.marketing
Capacity: Min 0.30 | Max 0.50 | ACL: 'Web' group
yarn.scheduler.capacity.root.ops
Capacity: Min 0.30 | Max 0.50 | ACL: 'SupplyChain' group
yarn.scheduler.capacity.root.default
Capacity: Min 0.40
User/Group → CS Queue mapping:
U: Joe → Marketing
G: Web → Marketing
G: SupplyChain → Ops
…
[Diagram: an MR app submitted by user Joe is routed to the Marketing queue by the mapping]
YARN-2411
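A sketch of the YARN-2411 mapping property in capacity-scheduler.xml, using the users and groups from this example (name = value shorthand for the XML <property> entries):

yarn.scheduler.capacity.queue-mappings = u:joe:marketing,g:web:marketing,g:supplychain:ops
yarn.scheduler.capacity.queue-mappings-override.enable = false

Mappings are evaluated left to right; u: entries map a specific user, g: entries map a group, and anything unmatched falls through to the default queue. With override disabled, a queue explicitly named at submission time still wins.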
19. Page 19 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Node Labels in YARN
Enable configuration of node partitions
Why
Need a mechanism to enforce node-level isolation
Account for resource contention amongst non-YARN managed resources
Account for hardware or software constraints
Two options:
Non-exclusive (Soft) Node Labels
Exclusive (Hard) Node Labels
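A minimal setup sketch for centrally managed node labels (node, label and queue names here are hypothetical, and the exact CLI syntax varies a bit by Hadoop version):

# yarn-site.xml
yarn.node-labels.enabled = true
yarn.node-labels.fs-store.root-dir = hdfs:///yarn/node-labels

# create a label and attach it to nodes
yarn rmadmin -addToClusterNodeLabels "storm(exclusive=true)"
yarn rmadmin -replaceLabelsOnNode "node1.example.com=storm"

# capacity-scheduler.xml: allow a queue to use the partition
yarn.scheduler.capacity.root.storm.accessible-node-labels = storm
yarn.scheduler.capacity.root.storm.accessible-node-labels.storm.capacity = 100

Setting exclusive=false instead gives the non-exclusive (soft) behaviour described on the following slides.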
20. Page 20 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Exclusive Node Labels enable Isolated Partitions
[Diagram: a set of nodes is labeled 'S' (Storm) to configure a partition; with exclusive labels, the Storm app is scheduled only onto the labeled nodes and app B is kept off them, enforcing isolation]
YARN-796
21. Page 21 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Non-Exclusive Node Labels
[Diagram: the same nodes are labeled 'S' as a non-exclusive (soft) partition for Spark; app B may also be scheduled onto the labeled nodes when they have free capacity]
YARN-3214
22. Page 22 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Fair Sharing: Pluggable Queue Policies
Choose scheduling policy per leaf queue
FIFO
Application container requests are accommodated on a first-come, first-served basis
Multi-fair weight
Application container requests are accommodated according to:
• Order of least resources used, so multiple applications make progress
• (Optional) Size-based weight: an adjustment to help large applications keep making progress
YARN-3318, YARN-3319
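In recent Hadoop releases the per-leaf-queue policy is chosen with the ordering-policy property; a sketch in name = value shorthand for capacity-scheduler.xml (the queue name is illustrative):

yarn.scheduler.capacity.root.marketing.ordering-policy = fair
yarn.scheduler.capacity.root.marketing.ordering-policy.fair.enable-size-based-weight = true

Setting the value back to fifo restores the default first-come, first-served ordering.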
23. Page 23 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Fair Sharing
[Diagram: Job 1 (user etl) runs in the Marketing queue with 100 containers requested and 20 running/finished; Job 2 (user etl) is then submitted to the same queue with 20 containers requested and 0 running, while Job 1 still has 20 containers running]
24. Page 24 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Fair Sharing
[Diagram: with fair sharing, the Marketing queue's capacity is split between the jobs. Job 1 drops to 10 running containers (100 requested, 30 and then 40 running/finished) while Job 2 also gets 10 running containers (20 requested, 10 and then 20 running/finished)]
25. Page 25 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Preemption
Across Queues: Supported
Within Queue, across Users: Roadmap
Within Queue, within User (Fair Sharing only): Not Supported
Across Queues with Node Labels: Supported
YARN-569
26. Page 26 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Preemption: Overview
Preempt based on lowest priority
• From the queue that is most oversubscribed
• The last scheduled app in a FIFO queue goes first
• Does not account for user limits within a queue
Warn and request from the application first
• Requests that the application un-reserve a container
• After a wait period, the container is forcibly killed
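Preemption is driven by a scheduler monitor on the ResourceManager; a typical enabling sketch for yarn-site.xml (interval and timeout values are examples only):

yarn.resourcemanager.scheduler.monitor.enable = true
yarn.resourcemanager.scheduler.monitor.policies = org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy
yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval = 3000
yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill = 15000
yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round = 0.1

max_wait_before_kill is the warning period mentioned above before a container is forcibly killed.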
27. Page 27 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Preemption in action (1)
[Diagram: queues Product (Min 20% / Max 30%), Marketing (Min 45% / Max 75%) and Finance (Min 35% / Max 35%); Job 1 is submitted]
28. Page 28 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Preemption in action (2)
[Diagram: same queues; Job 1 is running and Job 2 is submitted]
29. Page 29 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Preemption in action (3)
[Diagram: same queues; Job 1 and Job 2 are running (together using 65% of the cluster) when Job 3 is submitted]
30. Page 30 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Preemption in action (4)
[Diagram: same queues; Job 2 is preempted so that Job 3 can claim its queue's guaranteed capacity; Job 1 and Job 3 keep running]
31. Page 31 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Preemption in action (5)
[Diagram: same queues; Job 1 finishes, and Job 2 and Job 3 continue to run]
32. Page 32 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Chargeback: App-Level Aggregate
Captures the aggregate resources used by an application
Resources
Exposes memory-seconds and CPU (vcore) seconds
Reports the reserved/allocated amounts, not what was actually utilized
Exposed via
REST API, CLI, Web UI
Ambari Chargeback View coming soon!
YARN-415
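A quick way to pull these aggregates on a reasonably recent Hadoop version (the application id and ResourceManager host are placeholders):

# CLI: the status output includes an 'Aggregate Resource Allocation' line in MB-seconds and vcore-seconds
yarn application -status application_1431352132_0001

# REST: each app exposes memorySeconds and vcoreSeconds fields
curl http://<rm-host>:8088/ws/v1/cluster/apps/application_1431352132_0001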
33. Page 33 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Resource Control
34. Page 34 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Memory Scheduling
What
The default method of scheduling today
Applications request containers with X MB/GB of memory
The YARN Capacity Scheduler schedules containers based on node memory availability
Why
Most tasks are not CPU bound
Memory is typically an abundant resource on newer clusters (256 GB per node is common)
Had to start with something ;)
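The memory knobs live in yarn-site.xml; a sketch with example sizes:

yarn.nodemanager.resource.memory-mb = 98304      # memory YARN may hand out on each node
yarn.scheduler.minimum-allocation-mb = 1024      # smallest container the RM will grant
yarn.scheduler.maximum-allocation-mb = 8192      # largest single container request

Requests are typically normalized up to a multiple of the minimum allocation.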
35. Page 35 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
CPU Scheduling
What
The admin tells YARN how much CPU capacity is available in the cluster
Applications specify the CPU capacity needed for each container
The YARN Capacity Scheduler schedules applications taking CPU availability into account
Why
Applications (for example Storm, HBase, machine learning) need predictable access to CPU as a resource
CPU, rather than memory, has become the bottleneck in certain clusters (e.g. 128 GB RAM, 6 CPUs)
YARN-2
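A sketch of the CPU settings (vcore counts are illustrative); note the Capacity Scheduler only takes CPU into account when the DominantResourceCalculator is configured:

# yarn-site.xml
yarn.nodemanager.resource.cpu-vcores = 12
yarn.scheduler.minimum-allocation-vcores = 1
yarn.scheduler.maximum-allocation-vcores = 4

# capacity-scheduler.xml
yarn.scheduler.capacity.resource-calculator = org.apache.hadoop.yarn.util.resource.DominantResourceCalculator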
36. Page 36 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
CGroup Isolation for CPU
What
The admin enables CGroups-based CPU isolation for all YARN application workloads
Why
Applications need guaranteed access to CPU resources
To ensure SLAs, the CPU allocation given to an application container must be enforced
Effects
A container's CPU usage is bounded by the vCores it requested, with hard or soft enforcement
The NodeManager caps the CPU allowed for ALL containers on the node (as a % of total host CPU)
YARN-3
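An enabling sketch for yarn-site.xml, assuming cgroups are already mounted on the NodeManager hosts (values are examples):

yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
yarn.nodemanager.linux-container-executor.resources-handler.class = org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler
yarn.nodemanager.linux-container-executor.cgroups.hierarchy = /yarn
yarn.nodemanager.resource.percentage-physical-cpu-limit = 80      # cap for ALL containers on the node
yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage = true      # hard (true) vs soft (false) enforcement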
37. Page 37 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
[Coming Soon] Disk Resources
What
Isolation only: enforce equal sharing of disk, or dedication of spindles
Disk Isolation: throttles local disk IOPS at runtime… not HDFS reads/writes
Disk Dedication: lets applications request dedicated spindles
How
Linux only – uses CGroups
Use Cgroups resource handler:
org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler
Enable Disk resource
yarn.nodemanager.resource.disk.enabled
YARN-2619
38. Page 38 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Preemption Demo
Demo
1. Multiple Queues without Preemption
2. Multiple Queues with Preemption
Simulated workload
1. Start 2 jobs and use all elasticity of the reports & ops queues
2. Wait ~30 Seconds
3. Start 2 more jobs in adhoc & batch queues
39. Page 39 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Thank You!
Questions?
40. Page 40 © Hortonworks Inc. 2011 – 2015. All Rights Reserved