SlideShare uma empresa Scribd logo
1 de 13
Apache Spark
Internals
Sandeep Purohit
Software Consultant
Knoldus Software LLP
Agenda
● Architecture of spark cluster
● Tasks, Stages, Jobs
● DAG
● Execution Workflow
● Demo
Architecture of Spark Cluster
Driver Program
(SparkContext)
Cluster Manager
Executer
Executer
Executer
Executer
Worker node
Worker node
Master node Standalone
Yarn
Mesos
Executer
● Master Node: Master node is the node on which the driver
program will be running i.e. the main() method of the application
will be running.
● Worker Node: Worker node have an executor and cache which
is responsible for running task.
● Executer: Executers are one responsible for running the tasks
and also provide the memory to store RDD's.
● Driver Program: Driver program is responsible for 2
duties:creating tasks, scheduling tasks.
● Cluster manager: Cluster manager is responsible for monitoring
the cluster and providing the resources to the executors.
Tasks, Stages, Jobs
● Tasks: smallest individual unit of execution that
represents a partition in a dataset.
Partition 1
Partition 2
Partition 3
RDD Stage
Task 1
Task 2
Task 3
● Stages: stages is the collection of tasks,
whenever the shuffle will be happen the next
task will be in different stage.
Any transformation which create
shuffleRDD
Stage 1
Stage 2
Any transformation or
action
● Jobs: jobs are the action which is submitted to
the DAGScheduler by the spark driver to run
task using the RDD lineag graph.
RDD DAGScheduler executor
DAG
● Spark Schedular create the DAG of the stages
to send the DAG object to workers to evaluate
the final result.
map
filter
count
repartition
Execution workflow
RDD object
(Create DAG) DAG Schedular
(Split graph into
stages)
Cluster
Manager
worker
DAG Taskset Run
Task
Execution Workflow
Demo
Q&A
Thanks

Mais conteúdo relacionado

Mais procurados

Beneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek LaskowskiBeneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek LaskowskiSpark Summit
 
Apache Spark RDDs
Apache Spark RDDsApache Spark RDDs
Apache Spark RDDsDean Chen
 
Introduction to spark
Introduction to sparkIntroduction to spark
Introduction to sparkDuyhai Doan
 
Spark overview
Spark overviewSpark overview
Spark overviewLisa Hua
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introductioncolorant
 
Spark Study Notes
Spark Study NotesSpark Study Notes
Spark Study NotesRichard Kuo
 
Transformations and actions a visual guide training
Transformations and actions a visual guide trainingTransformations and actions a visual guide training
Transformations and actions a visual guide trainingSpark Summit
 
Intro to apache spark stand ford
Intro to apache spark stand fordIntro to apache spark stand ford
Intro to apache spark stand fordThu Hiền
 
How Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscapeHow Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscapePaco Nathan
 
Apache Spark Introduction - CloudxLab
Apache Spark Introduction - CloudxLabApache Spark Introduction - CloudxLab
Apache Spark Introduction - CloudxLabAbhinav Singh
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationshadooparchbook
 
Tuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkTuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkPatrick Wendell
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekVenkata Naga Ravi
 
DTCC '14 Spark Runtime Internals
DTCC '14 Spark Runtime InternalsDTCC '14 Spark Runtime Internals
DTCC '14 Spark Runtime InternalsCheng Lian
 
Spark and spark streaming internals
Spark and spark streaming internalsSpark and spark streaming internals
Spark and spark streaming internalsSigmoid
 

Mais procurados (20)

Beneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek LaskowskiBeneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek Laskowski
 
Apache Spark RDDs
Apache Spark RDDsApache Spark RDDs
Apache Spark RDDs
 
Introduction to spark
Introduction to sparkIntroduction to spark
Introduction to spark
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
 
Spark overview
Spark overviewSpark overview
Spark overview
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Spark Study Notes
Spark Study NotesSpark Study Notes
Spark Study Notes
 
Apache Spark RDD 101
Apache Spark RDD 101Apache Spark RDD 101
Apache Spark RDD 101
 
Spark on YARN
Spark on YARNSpark on YARN
Spark on YARN
 
Transformations and actions a visual guide training
Transformations and actions a visual guide trainingTransformations and actions a visual guide training
Transformations and actions a visual guide training
 
Intro to apache spark stand ford
Intro to apache spark stand fordIntro to apache spark stand ford
Intro to apache spark stand ford
 
How Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscapeHow Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscape
 
Apache Spark Introduction - CloudxLab
Apache Spark Introduction - CloudxLabApache Spark Introduction - CloudxLab
Apache Spark Introduction - CloudxLab
 
Spark core
Spark coreSpark core
Spark core
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
 
Tuning and Debugging in Apache Spark
Tuning and Debugging in Apache SparkTuning and Debugging in Apache Spark
Tuning and Debugging in Apache Spark
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeek
 
DTCC '14 Spark Runtime Internals
DTCC '14 Spark Runtime InternalsDTCC '14 Spark Runtime Internals
DTCC '14 Spark Runtime Internals
 
Spark and spark streaming internals
Spark and spark streaming internalsSpark and spark streaming internals
Spark and spark streaming internals
 

Destaque

Walk-through: Amazon ECS
Walk-through: Amazon ECSWalk-through: Amazon ECS
Walk-through: Amazon ECSKnoldus Inc.
 
Akka Finite State Machine
Akka Finite State MachineAkka Finite State Machine
Akka Finite State MachineKnoldus Inc.
 
Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architectureSohil Jain
 
Getting Started With AureliaJs
Getting Started With AureliaJsGetting Started With AureliaJs
Getting Started With AureliaJsKnoldus Inc.
 
Introduction to Scala JS
Introduction to Scala JSIntroduction to Scala JS
Introduction to Scala JSKnoldus Inc.
 
Drilling the Async Library
Drilling the Async LibraryDrilling the Async Library
Drilling the Async LibraryKnoldus Inc.
 
String interpolation
String interpolationString interpolation
String interpolationKnoldus Inc.
 
Mailchimp and Mandrill - The ‘Hominidae’ kingdom
Mailchimp and Mandrill - The ‘Hominidae’ kingdomMailchimp and Mandrill - The ‘Hominidae’ kingdom
Mailchimp and Mandrill - The ‘Hominidae’ kingdomKnoldus Inc.
 
Realm Mobile Database - An Introduction
Realm Mobile Database - An IntroductionRealm Mobile Database - An Introduction
Realm Mobile Database - An IntroductionKnoldus Inc.
 
Shapeless- Generic programming for Scala
Shapeless- Generic programming for ScalaShapeless- Generic programming for Scala
Shapeless- Generic programming for ScalaKnoldus Inc.
 
An Introduction to Quill
An Introduction to QuillAn Introduction to Quill
An Introduction to QuillKnoldus Inc.
 
Introduction to Scala Macros
Introduction to Scala MacrosIntroduction to Scala Macros
Introduction to Scala MacrosKnoldus Inc.
 
Introduction to Java 8
Introduction to Java 8Introduction to Java 8
Introduction to Java 8Knoldus Inc.
 
Mandrill Templates
Mandrill TemplatesMandrill Templates
Mandrill TemplatesKnoldus Inc.
 
ANTLR4 and its testing
ANTLR4 and its testingANTLR4 and its testing
ANTLR4 and its testingKnoldus Inc.
 
Introduction to Knockout Js
Introduction to Knockout JsIntroduction to Knockout Js
Introduction to Knockout JsKnoldus Inc.
 

Destaque (20)

Walk-through: Amazon ECS
Walk-through: Amazon ECSWalk-through: Amazon ECS
Walk-through: Amazon ECS
 
BDD with Cucumber
BDD with CucumberBDD with Cucumber
BDD with Cucumber
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
Akka Finite State Machine
Akka Finite State MachineAkka Finite State Machine
Akka Finite State Machine
 
Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architecture
 
Getting Started With AureliaJs
Getting Started With AureliaJsGetting Started With AureliaJs
Getting Started With AureliaJs
 
Introduction to Scala JS
Introduction to Scala JSIntroduction to Scala JS
Introduction to Scala JS
 
Drilling the Async Library
Drilling the Async LibraryDrilling the Async Library
Drilling the Async Library
 
Akka streams
Akka streamsAkka streams
Akka streams
 
String interpolation
String interpolationString interpolation
String interpolation
 
Mailchimp and Mandrill - The ‘Hominidae’ kingdom
Mailchimp and Mandrill - The ‘Hominidae’ kingdomMailchimp and Mandrill - The ‘Hominidae’ kingdom
Mailchimp and Mandrill - The ‘Hominidae’ kingdom
 
Realm Mobile Database - An Introduction
Realm Mobile Database - An IntroductionRealm Mobile Database - An Introduction
Realm Mobile Database - An Introduction
 
Kanban
KanbanKanban
Kanban
 
Shapeless- Generic programming for Scala
Shapeless- Generic programming for ScalaShapeless- Generic programming for Scala
Shapeless- Generic programming for Scala
 
An Introduction to Quill
An Introduction to QuillAn Introduction to Quill
An Introduction to Quill
 
Introduction to Scala Macros
Introduction to Scala MacrosIntroduction to Scala Macros
Introduction to Scala Macros
 
Introduction to Java 8
Introduction to Java 8Introduction to Java 8
Introduction to Java 8
 
Mandrill Templates
Mandrill TemplatesMandrill Templates
Mandrill Templates
 
ANTLR4 and its testing
ANTLR4 and its testingANTLR4 and its testing
ANTLR4 and its testing
 
Introduction to Knockout Js
Introduction to Knockout JsIntroduction to Knockout Js
Introduction to Knockout Js
 

Semelhante a Internals

Apache Spark - Running on a Cluster | Big Data Hadoop Spark Tutorial | CloudxLab
Apache Spark - Running on a Cluster | Big Data Hadoop Spark Tutorial | CloudxLabApache Spark - Running on a Cluster | Big Data Hadoop Spark Tutorial | CloudxLab
Apache Spark - Running on a Cluster | Big Data Hadoop Spark Tutorial | CloudxLabCloudxLab
 
Building Distributed Systems from Scratch - Part 1
Building Distributed Systems from Scratch - Part 1Building Distributed Systems from Scratch - Part 1
Building Distributed Systems from Scratch - Part 1datamantra
 
Scalable Monitoring Using Prometheus with Apache Spark Clusters with Diane F...
 Scalable Monitoring Using Prometheus with Apache Spark Clusters with Diane F... Scalable Monitoring Using Prometheus with Apache Spark Clusters with Diane F...
Scalable Monitoring Using Prometheus with Apache Spark Clusters with Diane F...Databricks
 
Data Engineer's Lunch #80: Apache Spark Resource Managers
Data Engineer's Lunch #80: Apache Spark Resource ManagersData Engineer's Lunch #80: Apache Spark Resource Managers
Data Engineer's Lunch #80: Apache Spark Resource ManagersAnant Corporation
 
SCALABLE MONITORING USING PROMETHEUS WITH APACHE SPARK
SCALABLE MONITORING USING PROMETHEUS WITH APACHE SPARKSCALABLE MONITORING USING PROMETHEUS WITH APACHE SPARK
SCALABLE MONITORING USING PROMETHEUS WITH APACHE SPARKzmhassan
 
Core Services behind Spark Job Execution
Core Services behind Spark Job ExecutionCore Services behind Spark Job Execution
Core Services behind Spark Job Executiondatamantra
 
Apache spark - Installation
Apache spark - InstallationApache spark - Installation
Apache spark - InstallationMartin Zapletal
 
Advanced task management with Celery
Advanced task management with CeleryAdvanced task management with Celery
Advanced task management with CeleryMahendra M
 
An Introduction to Apache Spark
An Introduction to Apache SparkAn Introduction to Apache Spark
An Introduction to Apache SparkDona Mary Philip
 
Apache Mesos: a simple explanation of basics
Apache Mesos: a simple explanation of basicsApache Mesos: a simple explanation of basics
Apache Mesos: a simple explanation of basicsGladson Manuel
 
Spark Driven Big Data Analytics
Spark Driven Big Data AnalyticsSpark Driven Big Data Analytics
Spark Driven Big Data Analyticsinoshg
 
Jab12 - Joomla! architecture revealed
Jab12 - Joomla! architecture revealedJab12 - Joomla! architecture revealed
Jab12 - Joomla! architecture revealedOfer Cohen
 
Building distributed processing system from scratch - Part 2
Building distributed processing system from scratch - Part 2Building distributed processing system from scratch - Part 2
Building distributed processing system from scratch - Part 2datamantra
 
Summer Internship Project - Remote Render
Summer Internship Project - Remote RenderSummer Internship Project - Remote Render
Summer Internship Project - Remote RenderYen-Kuan Wu
 
What is Distributed Computing, Why we use Apache Spark
What is Distributed Computing, Why we use Apache SparkWhat is Distributed Computing, Why we use Apache Spark
What is Distributed Computing, Why we use Apache SparkAndy Petrella
 
Apache Spark e AWS Glue
Apache Spark e AWS GlueApache Spark e AWS Glue
Apache Spark e AWS GlueLaercio Serra
 

Semelhante a Internals (20)

Apache Spark - Running on a Cluster | Big Data Hadoop Spark Tutorial | CloudxLab
Apache Spark - Running on a Cluster | Big Data Hadoop Spark Tutorial | CloudxLabApache Spark - Running on a Cluster | Big Data Hadoop Spark Tutorial | CloudxLab
Apache Spark - Running on a Cluster | Big Data Hadoop Spark Tutorial | CloudxLab
 
Building Distributed Systems from Scratch - Part 1
Building Distributed Systems from Scratch - Part 1Building Distributed Systems from Scratch - Part 1
Building Distributed Systems from Scratch - Part 1
 
Scalable Monitoring Using Prometheus with Apache Spark Clusters with Diane F...
 Scalable Monitoring Using Prometheus with Apache Spark Clusters with Diane F... Scalable Monitoring Using Prometheus with Apache Spark Clusters with Diane F...
Scalable Monitoring Using Prometheus with Apache Spark Clusters with Diane F...
 
Data Engineer's Lunch #80: Apache Spark Resource Managers
Data Engineer's Lunch #80: Apache Spark Resource ManagersData Engineer's Lunch #80: Apache Spark Resource Managers
Data Engineer's Lunch #80: Apache Spark Resource Managers
 
SCALABLE MONITORING USING PROMETHEUS WITH APACHE SPARK
SCALABLE MONITORING USING PROMETHEUS WITH APACHE SPARKSCALABLE MONITORING USING PROMETHEUS WITH APACHE SPARK
SCALABLE MONITORING USING PROMETHEUS WITH APACHE SPARK
 
Core Services behind Spark Job Execution
Core Services behind Spark Job ExecutionCore Services behind Spark Job Execution
Core Services behind Spark Job Execution
 
Apache spark - Installation
Apache spark - InstallationApache spark - Installation
Apache spark - Installation
 
Advanced task management with Celery
Advanced task management with CeleryAdvanced task management with Celery
Advanced task management with Celery
 
An Introduction to Apache Spark
An Introduction to Apache SparkAn Introduction to Apache Spark
An Introduction to Apache Spark
 
Apache Mesos: a simple explanation of basics
Apache Mesos: a simple explanation of basicsApache Mesos: a simple explanation of basics
Apache Mesos: a simple explanation of basics
 
Spark Driven Big Data Analytics
Spark Driven Big Data AnalyticsSpark Driven Big Data Analytics
Spark Driven Big Data Analytics
 
Spark 1.0
Spark 1.0Spark 1.0
Spark 1.0
 
Jab12 - Joomla! architecture revealed
Jab12 - Joomla! architecture revealedJab12 - Joomla! architecture revealed
Jab12 - Joomla! architecture revealed
 
Building distributed processing system from scratch - Part 2
Building distributed processing system from scratch - Part 2Building distributed processing system from scratch - Part 2
Building distributed processing system from scratch - Part 2
 
Spark on yarn
Spark on yarnSpark on yarn
Spark on yarn
 
Summer Internship Project - Remote Render
Summer Internship Project - Remote RenderSummer Internship Project - Remote Render
Summer Internship Project - Remote Render
 
What is Distributed Computing, Why we use Apache Spark
What is Distributed Computing, Why we use Apache SparkWhat is Distributed Computing, Why we use Apache Spark
What is Distributed Computing, Why we use Apache Spark
 
Apache Spark e AWS Glue
Apache Spark e AWS GlueApache Spark e AWS Glue
Apache Spark e AWS Glue
 
Apache Spark Core
Apache Spark CoreApache Spark Core
Apache Spark Core
 
SWT Tech Sharing: Node.js + Redis
SWT Tech Sharing: Node.js + RedisSWT Tech Sharing: Node.js + Redis
SWT Tech Sharing: Node.js + Redis
 

Último

Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsJoseMangaJr1
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
ELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 

Último (20)

Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
ELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptx
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 

Internals

  • 1. Apache Spark Internals Sandeep Purohit Software Consultant Knoldus Software LLP
  • 2. Agenda ● Architecture of spark cluster ● Tasks, Stages, Jobs ● DAG ● Execution Workflow ● Demo
  • 3. Architecture of Spark Cluster Driver Program (SparkContext) Cluster Manager Executer Executer Executer Executer Worker node Worker node Master node Standalone Yarn Mesos Executer
  • 4. ● Master Node: Master node is the node on which the driver program will be running i.e. the main() method of the application will be running. ● Worker Node: Worker node have an executor and cache which is responsible for running task. ● Executer: Executers are one responsible for running the tasks and also provide the memory to store RDD's. ● Driver Program: Driver program is responsible for 2 duties:creating tasks, scheduling tasks. ● Cluster manager: Cluster manager is responsible for monitoring the cluster and providing the resources to the executors.
  • 5. Tasks, Stages, Jobs ● Tasks: smallest individual unit of execution that represents a partition in a dataset. Partition 1 Partition 2 Partition 3 RDD Stage Task 1 Task 2 Task 3
  • 6. ● Stages: stages is the collection of tasks, whenever the shuffle will be happen the next task will be in different stage. Any transformation which create shuffleRDD Stage 1 Stage 2 Any transformation or action
  • 7. ● Jobs: jobs are the action which is submitted to the DAGScheduler by the spark driver to run task using the RDD lineag graph. RDD DAGScheduler executor
  • 8. DAG ● Spark Schedular create the DAG of the stages to send the DAG object to workers to evaluate the final result. map filter count repartition
  • 9. Execution workflow RDD object (Create DAG) DAG Schedular (Split graph into stages) Cluster Manager worker DAG Taskset Run Task
  • 11. Demo
  • 12. Q&A