Practical learnings from running thousands of Flink jobs


Flink Forward San Francisco 2022. Task Managers constantly running out of memory? Flink job keeps restarting from cryptic Akka exceptions? Flink job running but doesn’t seem to be processing any records? We share practical learnings from running thousands of Flink jobs for different use cases and take a look at common challenges we have experienced, such as out-of-memory errors, timeouts and job stability. We will cover memory tuning, S3 and Akka configurations to address common pitfalls, and the approaches that we take to automate health monitoring and management of Flink jobs at scale. By Hong Teoh & Usamah Jassat.


© 2022, Amazon Web Services, Inc. or its affiliates.
Practical learnings from
running thousands of Flink jobs
Hong Teoh, Usamah Jassat
Amazon Kinesis Data Analytics

Outline
• Job Stability
• State backend selection and tuning

Typical Flink cluster setup on Kubernetes
[Diagram: one Jobmanager and three Taskmanagers]

Job instability: Memory
OOMKilled: the container uses more memory than was allocated in the Flink configuration

Memory management (Instance/Container/Java)
• Instance (Kubernetes node): OS, cAdvisor, oom-killer
  • Taskmanager container
    • Java (Flink process)
    • Other processes (e.g. Python, Kinesis producers)

Memory investigation learnings
• Record both Java and cAdvisor memory metrics
• Record both virtual + real memory use
• OOM-Killer can cause “broken” containers
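
As a rough illustration of recording both virtual and real memory use, the sketch below parses the `VmSize` (virtual) and `VmRSS` (resident) fields from the text of a Linux `/proc/<pid>/status` file. The sample text and its numbers are made up for the example, and the function name is hypothetical; a real collector would read the file for the Taskmanager's Java process.

```python
import re


def parse_proc_status(text):
    """Return VmSize (virtual) and VmRSS (resident) in kB from /proc/<pid>/status text."""
    wanted = {"VmSize", "VmRSS"}
    found = {}
    for line in text.splitlines():
        match = re.match(r"^(\w+):\s+(\d+)\s+kB$", line.strip())
        if match and match.group(1) in wanted:
            found[match.group(1)] = int(match.group(2))
    return found


# Illustrative sample shaped like /proc/<pid>/status; the numbers are made up.
SAMPLE_STATUS = "Name:\tjava\nVmSize:\t104857600 kB\nVmRSS:\t  2097152 kB\nThreads:\t120\n"

usage = parse_proc_status(SAMPLE_STATUS)
virtual_gib = usage["VmSize"] / (1024 * 1024)
resident_gib = usage["VmRSS"] / (1024 * 1024)
print(f"virtual={virtual_gib:.1f} GiB resident={resident_gib:.1f} GiB")
```

A large gap between the two values is normal on its own; it is the resident figure that counts against the container limit.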

Flink – Java memory mapping
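Java memory areas: Heap, Metaspace, Direct, Mapped, JNI, Overhead
Flink memory components: Heap, Metaspace, Network, Task Off-heap, Managed, Overhead
Limit for each Java memory area:
• Heap: -Xmx
• Metaspace: -XX:MaxMetaspaceSize
• Direct: -XX:MaxDirectMemorySize
• Mapped: NOT CONTROLLED
• JNI: Flink controls*
• Overhead: NOT CONTROLLED
How exceeding a limit surfaces: Heap OOM, Metaspace OOM, Direct Memory OOM, OOMKilled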
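
To make the mapping concrete, here is a minimal sketch of how Flink's documented Taskmanager memory model translates component sizes into JVM limits: the heap components sum to `-Xmx`, the off-heap and network components sum to `-XX:MaxDirectMemorySize`, and metaspace becomes `-XX:MaxMetaspaceSize`. The framework heap and framework off-heap components are part of Flink's model but do not appear on the slide, the function name is hypothetical, and all sizes are illustrative.

```python
def jvm_limits(components_mib):
    """Group Taskmanager memory components (MiB) by the JVM flag that bounds them."""
    heap = components_mib["framework_heap"] + components_mib["task_heap"]
    direct = (
        components_mib["framework_off_heap"]
        + components_mib["task_off_heap"]
        + components_mib["network"]
    )
    metaspace = components_mib["metaspace"]
    # Managed memory and JVM overhead are budgeted by Flink but have no JVM flag.
    no_jvm_flag = components_mib["managed"] + components_mib["jvm_overhead"]
    return {
        "-Xmx": heap,
        "-XX:MaxDirectMemorySize": direct,
        "-XX:MaxMetaspaceSize": metaspace,
        "no JVM flag (managed + overhead)": no_jvm_flag,
        "total process memory": heap + direct + metaspace + no_jvm_flag,
    }


# Illustrative sizes in MiB, not recommendations.
example = {
    "framework_heap": 128,
    "task_heap": 1536,
    "framework_off_heap": 128,
    "task_off_heap": 0,
    "network": 512,
    "managed": 1536,
    "metaspace": 256,
    "jvm_overhead": 400,
}

limits = jvm_limits(example)
for name, size in limits.items():
    print(f"{name}: {size} MiB")
```

Anything that grows beyond the last two buckets has no JVM exception to raise, which is why it tends to show up as the container being OOMKilled.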

Memory configuration learnings
• jvm-overhead too low
• Investigate native memory use using:
  • Native Memory Tracking:
    -XX:NativeMemoryTracking=detail
    jcmd 1 VM.native_memory
  • jemalloc + jeprof
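
For a sense of why jvm-overhead can end up too low, the sketch below shows how the overhead budget is derived when only total process memory is configured: a fraction of the total, clamped between a minimum and a maximum. The defaults used here (fraction 0.1, minimum 192 MiB, maximum 1 GiB) are the documented defaults of `taskmanager.memory.jvm-overhead.fraction`, `.min` and `.max`; the function name is hypothetical, and the sketch ignores the case where other components are set explicitly.

```python
def derive_jvm_overhead_mib(total_process_mib, fraction=0.1, min_mib=192, max_mib=1024):
    """Clamp fraction * total process memory into [min_mib, max_mib]."""
    return min(max(total_process_mib * fraction, min_mib), max_mib)


for total in (1024, 4096, 16384):
    print(f"{total} MiB process -> {derive_jvm_overhead_mib(total)} MiB overhead")
```

Because of the cap, a 16 GiB Taskmanager gets the same overhead budget as a 10 GiB one unless the maximum is raised, so thread stacks, GC structures and other native allocations have proportionally less room.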

Job instability: “Hung” job / checkpoints

Thread dump: Thread stuck on SocketRead
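
One way to spot this pattern without reading a whole dump by eye is to list the threads whose top stack frame is a socket read. The sketch below does that for text in the usual `jstack` layout; the sample dump, thread names and addresses are invented for illustration, and the helper name is hypothetical.

```python
import re

# Invented sample in the usual jstack layout; names and addresses are made up.
SAMPLE_DUMP = '''"Source: reader -> Sink (1/4)#0" #85 prio=5 os_prio=0 tid=0x00007f1c2c0a1000 nid=0x5f runnable [0x00007f1bf01fe000]
   java.lang.Thread.State: RUNNABLE
\tat java.net.SocketInputStream.socketRead0(Native Method)
\tat java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
\tat java.net.SocketInputStream.read(SocketInputStream.java:171)

"Checkpoint Timer" #90 daemon prio=5 os_prio=0 tid=0x00007f1c2c0b2000 nid=0x64 waiting on condition [0x00007f1bf00fd000]
   java.lang.Thread.State: TIMED_WAITING (parking)
\tat sun.misc.Unsafe.park(Native Method)
\tat java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
'''


def threads_with_top_frame(dump, frame_prefix):
    """Return names of threads whose first stack frame starts with frame_prefix."""
    matches = []
    for block in dump.strip().split("\n\n"):
        lines = block.splitlines()
        name = re.match(r'^"(.+?)"', lines[0])
        frames = [line.strip()[3:] for line in lines if line.strip().startswith("at ")]
        if name and frames and frames[0].startswith(frame_prefix):
            matches.append(name.group(1))
    return matches


stuck = threads_with_top_frame(SAMPLE_DUMP, "java.net.SocketInputStream.socketRead")
print(stuck)
```

A single dump only shows where a thread is at one instant, so it is worth comparing several dumps taken some seconds apart before concluding that a thread is stuck.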

Thread dump: Thread stuck on deadlock
jackson-databind version < 2.12

State backends
• State backends in Flink
• Comparing HashMap and RocksDB
• RocksDB Optimizations

State-backends in Flink

Comparing HashMap and RocksDB
                          | HashMap            | RocksDB
Storage                   | Java Heap          | Native Memory + Disk
Ceiling                   | Java Heap          | File System
Incremental Checkpointing | Yes (experimental) | Yes
R/W Performance           | Higher             | Lower*
GC Impact                 | Yes                | No
In Flight Data Format     | Java Objects       | Serialized
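
The "In Flight Data Format" row explains much of the performance difference. The toy sketch below is not Flink code: it models a heap-style store that keeps live objects and a serialized-style store that encodes on every write and decodes on every read, using `pickle` as a stand-in for Flink's serializers. Both class names are hypothetical.

```python
import pickle


class HeapState:
    """Keeps live objects, like state held as Java objects on the heap."""

    def __init__(self):
        self._values = {}

    def put(self, key, value):
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)


class SerializedState:
    """Stores bytes, so every access pays for serialization or deserialization."""

    def __init__(self):
        self._values = {}

    def put(self, key, value):
        self._values[pickle.dumps(key)] = pickle.dumps(value)

    def get(self, key):
        raw = self._values.get(pickle.dumps(key))
        return None if raw is None else pickle.loads(raw)


heap, serialized = HeapState(), SerializedState()
for state in (heap, serialized):
    state.put("user-1", [1, 2])
    state.get("user-1").append(3)  # mutate the returned object without calling put()

print(heap.get("user-1"))        # [1, 2, 3]
print(serialized.get("user-1"))  # [1, 2]
```

The serialized store hands back a fresh copy on every read, which costs CPU per access but keeps the stored data out of the garbage collector's way.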

RocksDB or HashMap?
• RocksDB for large state
• HashMap for performance

RocksDB Optimization
Write Stalls
• Increase RocksDB memory
• Increase write threads
• Increase write buffer memory*
• Write buffer size + counts
Slow Reads
• Local Disk
• Increase RocksDB memory
• Increase read cache*
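
As one possible reading of the bullets above, the sketch below groups Flink configuration options by the symptom they target and renders them as `key: value` lines. The option names come from Flink's configuration reference; every value is an illustrative placeholder rather than a recommendation, the local directory path is hypothetical, and the pairing of options to bullets is an assumption, not something the slide states.

```python
# Option names are from Flink's configuration reference. All values are
# illustrative placeholders, and the localdir path is hypothetical.
TUNING_BY_SYMPTOM = {
    "write stalls": {
        "taskmanager.memory.managed.fraction": "0.5",               # more RocksDB memory
        "state.backend.rocksdb.thread.num": "4",                    # more background threads
        "state.backend.rocksdb.memory.write-buffer-ratio": "0.6",   # larger write buffer share
        "state.backend.rocksdb.writebuffer.size": "128mb",          # write buffer size
        "state.backend.rocksdb.writebuffer.count": "4",             # write buffer count
    },
    "slow reads": {
        "state.backend.rocksdb.localdir": "/mnt/local-ssd/rocksdb",  # local disk
        "taskmanager.memory.managed.fraction": "0.5",                # more RocksDB memory
        "state.backend.rocksdb.memory.write-buffer-ratio": "0.4",    # leaves more for the read cache
    },
}


def render_flink_conf(symptom):
    """Render the options for one symptom as flink-conf style lines."""
    options = TUNING_BY_SYMPTOM[symptom]
    return "\n".join(f"{key}: {value}" for key, value in options.items())


print(render_flink_conf("write stalls"))
```

When RocksDB memory is managed by Flink, write buffers and the read cache share one budget, so raising the write buffer share for write stalls and lowering it for slow reads pull in opposite directions; increasing the overall budget helps both.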

Thank you!

Q&A

Editor's Notes

In a typical setup of a Flink cluster on Kubernetes, there are Jobmanager (JM) containers and Taskmanager (TM) containers.

An instance hosts 2 Taskmanager containers, each running a Java process, a Python process and Kinesis producers. Each layer measures memory differently:
• The Java process measures heap and non-heap memory using JMX beans. This only measures the memory use of the Java process and excludes Python and the Kinesis producers.
• Containers' memory use is measured using cAdvisor.
• On the instance, the OS bridges the gap between virtual memory and real memory. If the Java process requests 100 GB of memory, that memory is allocated in virtual address space. In reality it is mapped to a “zero page”, meaning it doesn’t actually take up 100 GB of space until it is accessed or written to.

In Kubernetes setups, vm.overcommit_memory is recommended to be set to 1, which means virtual memory is not limited. When all processes actually use the memory, the oom-killer (an OS process) will spin up and terminate a process. It does not care which container the process belongs to, which can leave containers in an unhealthy state.

Takeaways:
• Measure BOTH Java memory and OS memory (cAdvisor)
• Use data to drill down into which bucket of memory is over the limit
• Use cgroups v2 to treat all processes in a container like a single process
Java memory is NOT JUST HEAP:
• Metaspace: for classes
• Direct: uses ByteBuffers directly, which is significantly faster than allocating in heap
• Mapped: useful when reading from a file; maps a file into a ByteBuffer
• Java Native Interface (JNI): used when running non-Java code (e.g. the RocksDB state backend)
• Overhead: used for the JVM itself (GC, thread stacks, symbols)

Talking points:
• Map these to the Flink Taskmanager memory configurations
• Explain the limits for each bucket
• When each limit is exceeded, this is how it surfaces
• Gotchas
