Autoscaling with Apache Flink
Robert Metzger
Staff Engineer @ decodable, Committer and PMC Chair @ Flink
Why Autoscaling?
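[Figure: workload over time compared with the number of allocated workers; the gap between the two is wasted resources]
Source: https://flink.apache.org/2021/05/06/reactive-mode.html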
Reasons for changing loads
- Seasonality:
  - day / night
  - weekend / weekday
- Product popularity: new feature launches, ad campaigns
- Upstream system outages: load spikes during recovery
Solutions in Flink to Rescale
- Flink 1.2 (2017): Rescalable State
  - Flink can restore from a savepoint with a different parallelism, so no data is lost and all computations stay correct
  - When used for scaling: requires custom tooling to orchestrate operations, plus bookkeeping
- Flink 1.13 (2021): Reactive Mode (beta)
  - Flink automatically adjusts when TaskManagers are added or removed
  - Requires an outside entity to decide on the number of TaskManagers
- Since Flink 1.15 (2022): Reactive Mode is out of beta
Further reading: https://flink.apache.org/features/2017/07/04/flink-rescalable-state.html
How to use Reactive Mode?
- Reactive Mode works with all standalone deployments
  - E.g. Kubernetes, Docker or via the provided deployment scripts
- Set the configuration:
  scheduler-mode=reactive
- Start the JobManager, and add as many TaskManagers as you need
- (Optionally) use a service to determine the number of TaskManagers
  - Kubernetes Horizontal Pod Autoscaler
  - AWS AutoScaling Groups
  - Google Cloud Managed Instance Groups
Reactive Mode: How does it work?
Flink automatically adjusts when TaskManagers are added or removed
[Diagram: one JobManager with two TaskManagers, job parallelism = 2; load is increasing]
[Diagram: load is increasing → two new TaskManagers are added, job parallelism = 4]
- The JobManager adjusts the job parallelism depending on the number of available TaskManagers
- When the number of TaskManagers changes, the Flink job restarts, restoring from the latest checkpoint
- Possible metrics for the scaling decision: CPU load / Kafka lag (recommended) / throughput / latency
- Scaling model similar to Kafka Streams
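As a rough sketch of the first bullet, assuming every TaskManager offers the same number of slots and the job is capped by its max parallelism: the function name, the parameter names and the default of 128 are illustrative assumptions, not Flink APIs.

```python
def reactive_parallelism(task_managers: int, slots_per_tm: int = 1,
                         max_parallelism: int = 128) -> int:
    """Parallelism of a job that uses every available slot, up to its max parallelism.

    Illustrative model only; slots_per_tm and max_parallelism are assumed values.
    """
    if task_managers < 0 or slots_per_tm < 1 or max_parallelism < 1:
        raise ValueError("counts must be non-negative and sizes positive")
    available_slots = task_managers * slots_per_tm
    return min(available_slots, max_parallelism)


# The two diagram states from the slides: 2 TaskManagers, then 4.
for tms in (2, 4):
    print(tms, "TaskManagers -> parallelism", reactive_parallelism(tms))
```

Adding TaskManagers beyond the max parallelism would not raise the parallelism further in this model.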
Reactive Mode example: Kubernetes HPA
- Kubernetes has a built-in component called HorizontalPodAutoscaler
- Automatically adjusts the scale of a deployment based on a metric
[Diagram: a HorizontalPodAutoscaler (min=1, max=15, cpu=80%) targets the Flink TaskManager Deployment and dynamically adjusts its number of TaskManager pods; the Flink JobManager runs as a Job with a single JobManager pod]
Source: https://flink.apache.org/2021/05/06/reactive-mode.html
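To make the scaling decision concrete, here is a minimal sketch of the replica calculation documented for the Kubernetes HorizontalPodAutoscaler, using the min=1, max=15 and cpu=80% values from the slide. It is a simplified model: the real controller also applies a tolerance band and stabilization windows, which are omitted here, and the function name is an assumption.

```python
def desired_replicas(current_replicas: int, current_cpu_percent: int,
                     target_cpu_percent: int = 80,
                     min_replicas: int = 1, max_replicas: int = 15) -> int:
    """Simplified HPA rule: ceil(current_replicas * current / target), clamped."""
    if target_cpu_percent <= 0:
        raise ValueError("target_cpu_percent must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    scaled = -(-(current_replicas * current_cpu_percent) // target_cpu_percent)
    return max(min_replicas, min(max_replicas, scaled))


# Four TaskManager pods at 100% CPU against an 80% target ask for a fifth pod.
print(desired_replicas(current_replicas=4, current_cpu_percent=100))
```

With Reactive Mode, each change in the TaskManager replica count then triggers the job rescale described earlier.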
Reactive Mode and Flink Deployments
→ Reactive Mode only works with “standalone mode”
Passive Deployment
Flink resources managed externally (“Standalone mode”)
→ “a bunch of JVMs”
Deployed on bare metal, Docker, Kubernetes
Pros / Cons:
+ DIY scenarios
+ Fast deployments
- Restart
→ Reactive Scaling (outside entity decides)
Active Deployment
Flink actively manages resources
→ Flink talks to a resource manager
Implementations: Native Kubernetes, YARN
Pros / cons:
+ Automatically restarts failed resources
+ Allocates only required resources
- Requires a lot of K8s permissions
→ Autoscaling (Flink decides)
Autoscaling with Flink? Enter Adaptive Scheduler
- Benefits
  - Flink can make better scaling decisions
    - Example: rescale only right after a checkpoint completed → avoid reprocessing
  - Fewer components required (“batteries included”)
- How?
  - Reactive Mode is based on a new (Flink 1.13) internal workload scheduler, called Adaptive Scheduler.
  - Currently configured to behave “reactively”, can also be changed to automatic
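The "rescale right after a checkpoint" example can be illustrated with simple arithmetic: a rescale restores from the latest checkpoint, so everything processed since that checkpoint is processed again. The sketch below assumes a constant input rate; the function and parameter names are hypothetical.

```python
def records_to_reprocess(rescale_time_s: int, last_checkpoint_s: int,
                         records_per_second: int) -> int:
    """Records replayed after a rescale, assuming a constant input rate."""
    if rescale_time_s < last_checkpoint_s:
        raise ValueError("rescale cannot happen before the last checkpoint")
    return (rescale_time_s - last_checkpoint_s) * records_per_second


# Checkpoint completed at t=600s, input rate of 10,000 records/s (assumed values).
print(records_to_reprocess(601, 600, 10_000))  # rescale 1s after the checkpoint
print(records_to_reprocess(659, 600, 10_000))  # rescale 59s after the checkpoint
```

A scheduler that knows when checkpoints complete can pick the first case; an external component that only adds or removes TaskManagers cannot.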
Internals: Adaptive Scheduler
Source / Further reading: https://cwiki.apache.org/confluence/display/FLINK/FLIP-160%3A+Adaptive+Scheduler
https://cwiki.apache.org/confluence/display/FLINK/FLIP-138%3A+Declarative+Resource+management
[Diagram: the Adaptive Scheduler declares its requirements (“I need 15 slots”) to the SlotManager inside the ResourceManager (active K8s / YARN), which reports what is available (“I have 8 slots”)]
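The exchange in the diagram can be modeled in a few lines: the scheduler declares how many slots it would like, learns how many exist, and runs with what it gets. This is only an illustrative model of the idea described in FLIP-138 and FLIP-160; the class and field names are assumptions, not Flink classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotDeclaration:
    """Hypothetical model of one requirements exchange."""
    required: int   # what the scheduler declares ("I need 15 slots")
    available: int  # what the cluster can offer ("I have 8 slots")

    @property
    def usable(self) -> int:
        # The scheduler adapts and runs with the slots it actually has.
        return min(self.required, self.available)

    @property
    def shortfall(self) -> int:
        # Slots still missing; an active deployment could request these.
        return max(0, self.required - self.available)


declaration = SlotDeclaration(required=15, available=8)
print(declaration.usable, declaration.shortfall)
```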
Adaptive Scheduler for Autoscaling (future)
Source / Further reading: https://cwiki.apache.org/confluence/display/FLINK/FLIP-160%3A+Adaptive+Scheduler
https://cwiki.apache.org/confluence/display/FLINK/FLIP-138%3A+Declarative+Resource+management
[Diagram: a pluggable autoscaler feeds the Adaptive Scheduler, which declares its requirements (“I need x slots”) to the SlotManager inside the ResourceManager (active K8s / YARN); the cluster reports “I have 8 slots”]
Ideas for autoscaler implementations
- REST Interface
  - Set desired parallelism via REST call to JobManager
  - Either for the entire job (and let the JM decide on per-operator parallelism) or per operator
- User Code + provided autoscaling strategies
  - User provides Flink with custom scaling logic with access to metrics
  - Problem: we want to avoid user code on the JobManager
- JobGraph configuration
  - Users configure min, target, max parallelism per operator
Closing remarks
- Autoscaling with Flink is possible today, it’s called “Reactive Mode” :-)
- Getting started guide: https://flink.apache.org/2021/05/06/reactive-mode.html
- Limitations of Adaptive Scheduler / Reactive Mode
  - Only works with Application Mode
  - Task local recovery not yet supported
  - Lack of good UI support (history of rescale events)
Questions?
rmetzger@decodable.co / rmetzger@apache.org
@rmetzger_
2022
Build real-time data apps & services. Fast.
decodable.co

Editor's Notes

  1. The space between actual load and the number of workers is wasted resources. You want your resource allocation to be close to the actual load.
  2. Rescalable state: stop with savepoint, restore. Good when scaling manually and very rarely. Reactive Mode == Kafka Streams deployment model.
  3. How does Reactive Mode work?
  4. “Just add more hardware”
  5. Rescaling is the same operation as a failure: restore from the latest checkpoint. Can be expensive with large state … only rescale rarely!
  6. Example implementation in Kubernetes, the most popular deployment option of Flink at the moment.
  7. Relationship of scaling and deployment modes. Passive deployment: manually launch the Flink components (K8s HA also works here!). Active deployment: Flink takes care of the launch itself (mostly).
  8. Blue line / states: interesting path. Source code of the state diagram:
       hide empty description
       skinparam monochrome false
       skinparam defaultFontSize 15
       [*] -> Created
       Created --> Waiting : Start scheduling
       state "Waiting for resources" as Waiting #lightblue
       state Executing #lightblue
       state Restarting #lightblue
       Waiting --> Waiting : Resources are not stable yet
       Waiting -[#blue,bold]-> Executing : Resources are stable
       Waiting --> Finished : Cancel, suspend or not \nenough resources
       Executing --> Canceling : Cancel
       Executing --> Failing : Unrecoverable fault
       Executing --> Finished : Suspend terminal state
       Executing -[#blue,bold]-> Restarting : Recoverable fault
       Restarting --> Finished : Suspend
       Restarting --> Canceling : Cancel
       Restarting -[#blue,bold]-> Waiting : Cancelation complete
       Canceling --> Finished : Cancelation complete
       Failing --> Finished : Failing complete
       Finished -> [*]
     https://www.planttext.com/?text=RPB1RiCW38RlF8NLOxM-m0wxLEi3h9fsw7PmYTim4OZ0JEtRpoHbB2YdHFYp_zy_zAOZe67aEtGKTJ0Z6--KEcs_OFS2-q38rAd75tPoze66ZRl2CnmP0qFKFNN9of6AB1Hi2d7n0G95duAck06CfLSLOZdlhR20WS1vcSrujWHtuaNBwurqMcsQ6nRmmJWJnQAmUtIQx1F454To7OY_h4BEfsiFd-xFx6ITYeggUddWF6LMd_yRu83cKNwNaTh_K9ZMk62otBBLtR6w-lPdIGvpii0K1kFGmfHkqoxRvqieKRHQ_yhhOYsnibj3rEkQwvWV36W_Z9R4NXsmcdr3bwGQjXnNhjI4awVv2m00
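The state diagram source in the notes can also be read as a plain transition table. The sketch below encodes those transitions in Python; the event names are shortened versions of the diagram's edge labels and the `run` helper is hypothetical, so this is a reading aid rather than Flink's implementation.

```python
# Transition table transcribed from the state diagram source in the notes.
# Event names are shortened edge labels (an assumption made for readability).
TRANSITIONS = {
    "Created": {"start scheduling": "Waiting"},
    "Waiting": {
        "resources not stable": "Waiting",
        "resources stable": "Executing",
        "cancel, suspend or not enough resources": "Finished",
    },
    "Executing": {
        "cancel": "Canceling",
        "unrecoverable fault": "Failing",
        "suspend": "Finished",
        "recoverable fault": "Restarting",
    },
    "Restarting": {
        "suspend": "Finished",
        "cancel": "Canceling",
        "cancelation complete": "Waiting",
    },
    "Canceling": {"cancelation complete": "Finished"},
    "Failing": {"failing complete": "Finished"},
    "Finished": {},
}


def run(events, state="Created"):
    """Follow a sequence of events through the table and return the final state."""
    for event in events:
        try:
            state = TRANSITIONS[state][event]
        except KeyError:
            raise ValueError(f"no transition from {state!r} on {event!r}") from None
    return state


# The highlighted (blue) path: execute, hit a recoverable fault, wait, execute again.
blue_path = ["start scheduling", "resources stable", "recoverable fault",
             "cancelation complete", "resources stable"]
print(run(blue_path))
```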