SlideShare uma empresa Scribd logo
1 de 19
Baixar para ler offline
Big Data Processing With
Spark and Scala
http://www.edureka.co/apache-spark-scala-training
Slide 2Slide 2 http://www.edureka.co/apache-spark-scala-training
What is Big Data?
What is Spark?
Why Spark?
Spark Ecosystem
A note about Scala
Why Scala?
MapReduce vs Spark
Hello Spark!
Objectives of this Session
Slide 3Slide 3 http://www.edureka.co/apache-spark-scala-training
Big Data
 Lots of Data (Terabytes or Petabytes)
 Big data is the term for a collection of data sets
so large and complex that it becomes difficult to
process using on-hand database management
tools or traditional data processing applications
 The challenges include capture, curation,
storage, search, sharing, transfer, analysis, and
visualization
cloud
tools
statistics
No SQL
compression
storage
support
database
analyze
information
terabytes
processing
mobile
Big Data
Slide 4Slide 4 http://www.edureka.co/apache-spark-scala-training
What is Spark?
 Apache Spark is a general-purpose cluster in-memory computing system
 Provides high-level APIs in Java, Scala and Python, and an optimized engine that supports general execution graphs
 Provides various high level tools like Spark SQL for structured data processing, Mlib for Machine Learning and more..
High Level
APIs
High Level
Tools
More…
Slide 5Slide 5 http://www.edureka.co/apache-spark-scala-training
Why Spark?
Cluster Manager
Deployment
via YARN
 The Spark framework can be deployed through
Apache Mesos, Apache Hadoop via Yarn, or
Spark’s own cluster manager.
Slide 6Slide 6 http://www.edureka.co/apache-spark-scala-training
Why Spark?
Polyglot Scala
 Spark framework is polyglot – Can be programmed
in several programming languages (Currently
Scala, Java and Python supported).
Slide 7Slide 7 http://www.edureka.co/apache-spark-scala-training
Why Spark?
A fully Apache Hive compatible data
warehousing system that can run 100x
faster than Hive.
100x faster than for certain applications.
Slide 8Slide 8 http://www.edureka.co/apache-spark-scala-training
Why Spark?
 Provides powerful caching and disk persistence capabilities
 Interactive Data Analysis
 Faster Batch
 Iterative Algorithms
 Real-Time Stream Processing
 Faster Decision-Making
Slide 9Slide 9 http://www.edureka.co/apache-spark-scala-training
Spark Community is Super Active!
Slide 10Slide 10 http://www.edureka.co/apache-spark-scala-training
Spark Ecosystem
Spark Core Engine
Aplha/Pre-alpha
Shark
(SQL)
Spark
Streaming
(Streaming)
MLLib
(Machine
learning)
GraphX
(Graph
Computation)
SparkR
(R on Spark)
BlindDB
(Approximate
SQL)
Slide 11Slide 11 http://www.edureka.co/apache-spark-scala-training
Spark Ecosystem (Contd.)
Used for structured
data. Can run
unmodified hive
queries on existing
Hadoop
deployment.
Spark Core Engine
Aplha/Pre-alpha
Shark
(SQL)
Spark
Streaming
(Streaming)
MLLib
(Machine
learning)
GraphX
(Graph
Computation)
SparkR
(R on Spark)
BlindDB
(Approximate
SQL)
Enables analytical
and interactive
apps for live
streaming data.
An approximate
query engine. To
run over Core
Spark Engine.
Graph Computation
engine.
(Similar to Giraph)
Package for R language
to enable R-users to
leverage Spark power
from R shell.
Machine learning library being built on top of Spark. Provision for support to many
machine learning algorithms with speeds upto 100 times faster than Map-Reduce.
Slide 12Slide 12 http://www.edureka.co/apache-spark-scala-training
A Note on Scala
 Scala is a general-purpose programming language designed
to express common programming patterns in a concise,
elegant, and type-safe way
 Scala supports both Object Oriented Programming and
Functional Programming
 Scala is very much in fabric of present and Future Big Data
frameworks like Scalding, Spark, Akka
» All examples of Spark in class will be
covered in Scala
» Scala would be covered before Spark
coverage as part of course!
Slide 13Slide 13 http://www.edureka.co/apache-spark-scala-training
Why Scala?
 Scala is a pure object-oriented language. Conceptually, every value is an object and every operation is a
method-call. The language supports advanced component architectures through classes and traits
 Scala is also a functional language. Supports functions, immutable data structures and preference for
immutability over mutation
 Seamlessly integrated with Java
 Being used heavily for future Big data and developments frameworks like Spark, Akka, Scalding, Play etc
Slide 14Slide 14 http://www.edureka.co/apache-spark-scala-trainingSlide 14
 If you want to do some Real Time Analytics, where you are expecting result quickly, Hadoop
should not be used directly
 Hadoop works on Batch processing, hence response time is high
Day 1 Day 2 Day 3 Day 4 ......... ………. ………. Day n
Day 1 Day 2 Day 3 Day 4 ......... ………. ………. Day n
Input
Data
Processing
Data
Input
Data
Processing
Data
Input
Data
Processing
Data
Input Data
Processing Data using MR
Time Lag
Real Time Analytics
Slide 15Slide 15 http://www.edureka.co/apache-spark-scala-trainingSlide 15
Real Time Analytics – Accepted Way
Streaming
Data
Storing
Slide 16Slide 16 http://www.edureka.co/apache-spark-scala-trainingSlide 16
14 sec
0.6 sec
MapReduce vs Spark
Slide 17 http://www.edureka.co/apache-spark-scala-training
Spark Demo!
Spark Demo!
Slide 18 http://www.edureka.co/apache-spark-scala-training
Questions?
Big Data Processing with Spark and Scala

Mais conteúdo relacionado

Mais procurados

Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...Edureka!
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...Simplilearn
 
PySpark in practice slides
PySpark in practice slidesPySpark in practice slides
PySpark in practice slidesDat Tran
 
End-to-end Data Pipeline with Apache Spark
End-to-end Data Pipeline with Apache SparkEnd-to-end Data Pipeline with Apache Spark
End-to-end Data Pipeline with Apache SparkDatabricks
 
PySpark dataframe
PySpark dataframePySpark dataframe
PySpark dataframeJaemun Jung
 
Spark overview
Spark overviewSpark overview
Spark overviewLisa Hua
 
Introduction to Spark with Python
Introduction to Spark with PythonIntroduction to Spark with Python
Introduction to Spark with PythonGokhan Atil
 
Spark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupSpark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupDatabricks
 
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...Simplilearn
 
Introducing DataFrames in Spark for Large Scale Data Science
Introducing DataFrames in Spark for Large Scale Data ScienceIntroducing DataFrames in Spark for Large Scale Data Science
Introducing DataFrames in Spark for Large Scale Data ScienceDatabricks
 
Deep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache SparkDeep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache SparkDatabricks
 
Building Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache SparkBuilding Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache SparkDatabricks
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiDatabricks
 
Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architectureSohil Jain
 
PySpark Best Practices
PySpark Best PracticesPySpark Best Practices
PySpark Best PracticesCloudera, Inc.
 
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...Simplilearn
 

Mais procurados (20)

Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
Pyspark Tutorial | Introduction to Apache Spark with Python | PySpark Trainin...
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
 
PySpark in practice slides
PySpark in practice slidesPySpark in practice slides
PySpark in practice slides
 
End-to-end Data Pipeline with Apache Spark
End-to-end Data Pipeline with Apache SparkEnd-to-end Data Pipeline with Apache Spark
End-to-end Data Pipeline with Apache Spark
 
PySpark dataframe
PySpark dataframePySpark dataframe
PySpark dataframe
 
Spark overview
Spark overviewSpark overview
Spark overview
 
Spark architecture
Spark architectureSpark architecture
Spark architecture
 
Introduction to Spark with Python
Introduction to Spark with PythonIntroduction to Spark with Python
Introduction to Spark with Python
 
Apache spark
Apache sparkApache spark
Apache spark
 
Spark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupSpark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark Meetup
 
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
 
Introducing DataFrames in Spark for Large Scale Data Science
Introducing DataFrames in Spark for Large Scale Data ScienceIntroducing DataFrames in Spark for Large Scale Data Science
Introducing DataFrames in Spark for Large Scale Data Science
 
Deep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache SparkDeep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache Spark
 
Spark
SparkSpark
Spark
 
Apache Spark PDF
Apache Spark PDFApache Spark PDF
Apache Spark PDF
 
Building Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache SparkBuilding Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache Spark
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architecture
 
PySpark Best Practices
PySpark Best PracticesPySpark Best Practices
PySpark Best Practices
 
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
 

Destaque

Spark for big data analytics
Spark for big data analyticsSpark for big data analytics
Spark for big data analyticsEdureka!
 
Power of Python with Big Data
Power of Python with Big DataPower of Python with Big Data
Power of Python with Big DataEdureka!
 
R and Visualization: A match made in Heaven
R and Visualization: A match made in HeavenR and Visualization: A match made in Heaven
R and Visualization: A match made in HeavenEdureka!
 
Big Data Analytics for Non-Programmers
Big Data Analytics for Non-ProgrammersBig Data Analytics for Non-Programmers
Big Data Analytics for Non-ProgrammersEdureka!
 
Mastering in data warehousing & BusinessIintelligence
Mastering in data warehousing & BusinessIintelligenceMastering in data warehousing & BusinessIintelligence
Mastering in data warehousing & BusinessIintelligenceEdureka!
 
Is Data Scientist still the sexiest job of 21st century? Find Out!
Is Data Scientist still the sexiest job of 21st century? Find Out!Is Data Scientist still the sexiest job of 21st century? Find Out!
Is Data Scientist still the sexiest job of 21st century? Find Out!Edureka!
 
Top 5 algorithms used in Data Science
Top 5 algorithms used in Data ScienceTop 5 algorithms used in Data Science
Top 5 algorithms used in Data ScienceEdureka!
 
Clare Corthell: Learning Data Science Online
Clare Corthell: Learning Data Science OnlineClare Corthell: Learning Data Science Online
Clare Corthell: Learning Data Science Onlinesfdatascience
 
Health care and big data with hadoop – Beacuse prevention is better than cure
Health care and big data with hadoop – Beacuse prevention is better than cureHealth care and big data with hadoop – Beacuse prevention is better than cure
Health care and big data with hadoop – Beacuse prevention is better than cureEdureka!
 
Python for Big Data Analytics
Python for Big Data AnalyticsPython for Big Data Analytics
Python for Big Data AnalyticsEdureka!
 
Machine Learning In Python | Python Machine Learning Tutorial | Deep Learning...
Machine Learning In Python | Python Machine Learning Tutorial | Deep Learning...Machine Learning In Python | Python Machine Learning Tutorial | Deep Learning...
Machine Learning In Python | Python Machine Learning Tutorial | Deep Learning...Edureka!
 
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Edureka!
 

Destaque (12)

Spark for big data analytics
Spark for big data analyticsSpark for big data analytics
Spark for big data analytics
 
Power of Python with Big Data
Power of Python with Big DataPower of Python with Big Data
Power of Python with Big Data
 
R and Visualization: A match made in Heaven
R and Visualization: A match made in HeavenR and Visualization: A match made in Heaven
R and Visualization: A match made in Heaven
 
Big Data Analytics for Non-Programmers
Big Data Analytics for Non-ProgrammersBig Data Analytics for Non-Programmers
Big Data Analytics for Non-Programmers
 
Mastering in data warehousing & BusinessIintelligence
Mastering in data warehousing & BusinessIintelligenceMastering in data warehousing & BusinessIintelligence
Mastering in data warehousing & BusinessIintelligence
 
Is Data Scientist still the sexiest job of 21st century? Find Out!
Is Data Scientist still the sexiest job of 21st century? Find Out!Is Data Scientist still the sexiest job of 21st century? Find Out!
Is Data Scientist still the sexiest job of 21st century? Find Out!
 
Top 5 algorithms used in Data Science
Top 5 algorithms used in Data ScienceTop 5 algorithms used in Data Science
Top 5 algorithms used in Data Science
 
Clare Corthell: Learning Data Science Online
Clare Corthell: Learning Data Science OnlineClare Corthell: Learning Data Science Online
Clare Corthell: Learning Data Science Online
 
Health care and big data with hadoop – Beacuse prevention is better than cure
Health care and big data with hadoop – Beacuse prevention is better than cureHealth care and big data with hadoop – Beacuse prevention is better than cure
Health care and big data with hadoop – Beacuse prevention is better than cure
 
Python for Big Data Analytics
Python for Big Data AnalyticsPython for Big Data Analytics
Python for Big Data Analytics
 
Machine Learning In Python | Python Machine Learning Tutorial | Deep Learning...
Machine Learning In Python | Python Machine Learning Tutorial | Deep Learning...Machine Learning In Python | Python Machine Learning Tutorial | Deep Learning...
Machine Learning In Python | Python Machine Learning Tutorial | Deep Learning...
 
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
 

Semelhante a Big Data Processing with Spark and Scala

Big data Processing with Apache Spark & Scala
Big data Processing with Apache Spark & ScalaBig data Processing with Apache Spark & Scala
Big data Processing with Apache Spark & ScalaEdureka!
 
Spark SQL | Apache Spark
Spark SQL | Apache SparkSpark SQL | Apache Spark
Spark SQL | Apache SparkEdureka!
 
Big Data Processing With Spark
Big Data Processing With SparkBig Data Processing With Spark
Big Data Processing With SparkEdureka!
 
5 reasons why spark is in demand!
5 reasons why spark is in demand!5 reasons why spark is in demand!
5 reasons why spark is in demand!Edureka!
 
5 things one must know about spark!
5 things one must know about spark!5 things one must know about spark!
5 things one must know about spark!Edureka!
 
Spark For Faster Batch Processing
Spark For Faster Batch ProcessingSpark For Faster Batch Processing
Spark For Faster Batch ProcessingEdureka!
 
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Edureka!
 
Apache Spark & Scala
Apache Spark & ScalaApache Spark & Scala
Apache Spark & ScalaEdureka!
 
5 things one must know about spark!
5 things one must know about spark!5 things one must know about spark!
5 things one must know about spark!Edureka!
 
Scalable Machine Learning with PySpark
Scalable Machine Learning with PySparkScalable Machine Learning with PySpark
Scalable Machine Learning with PySparkLadle Patel
 
Apache Spark beyond Hadoop MapReduce
Apache Spark beyond Hadoop MapReduceApache Spark beyond Hadoop MapReduce
Apache Spark beyond Hadoop MapReduceEdureka!
 
Spark Streaming
Spark StreamingSpark Streaming
Spark StreamingEdureka!
 
5 Reasons why Spark is in demand!
5 Reasons why Spark is in demand!5 Reasons why Spark is in demand!
5 Reasons why Spark is in demand!Edureka!
 
Performance of Spark vs MapReduce
Performance of Spark vs MapReducePerformance of Spark vs MapReduce
Performance of Spark vs MapReduceEdureka!
 
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...Edureka!
 
Learn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive GuideLearn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive GuideWhizlabs
 
Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...
Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...
Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...Edureka!
 

Semelhante a Big Data Processing with Spark and Scala (20)

Big data Processing with Apache Spark & Scala
Big data Processing with Apache Spark & ScalaBig data Processing with Apache Spark & Scala
Big data Processing with Apache Spark & Scala
 
Spark SQL | Apache Spark
Spark SQL | Apache SparkSpark SQL | Apache Spark
Spark SQL | Apache Spark
 
Big Data Processing With Spark
Big Data Processing With SparkBig Data Processing With Spark
Big Data Processing With Spark
 
5 reasons why spark is in demand!
5 reasons why spark is in demand!5 reasons why spark is in demand!
5 reasons why spark is in demand!
 
5 things one must know about spark!
5 things one must know about spark!5 things one must know about spark!
5 things one must know about spark!
 
Spark For Faster Batch Processing
Spark For Faster Batch ProcessingSpark For Faster Batch Processing
Spark For Faster Batch Processing
 
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
 
Apache Spark & Scala
Apache Spark & ScalaApache Spark & Scala
Apache Spark & Scala
 
5 things one must know about spark!
5 things one must know about spark!5 things one must know about spark!
5 things one must know about spark!
 
Scalable Machine Learning with PySpark
Scalable Machine Learning with PySparkScalable Machine Learning with PySpark
Scalable Machine Learning with PySpark
 
Module01
 Module01 Module01
Module01
 
Apache spark
Apache sparkApache spark
Apache spark
 
Apache Spark beyond Hadoop MapReduce
Apache Spark beyond Hadoop MapReduceApache Spark beyond Hadoop MapReduce
Apache Spark beyond Hadoop MapReduce
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 
Spark Streaming
Spark StreamingSpark Streaming
Spark Streaming
 
5 Reasons why Spark is in demand!
5 Reasons why Spark is in demand!5 Reasons why Spark is in demand!
5 Reasons why Spark is in demand!
 
Performance of Spark vs MapReduce
Performance of Spark vs MapReducePerformance of Spark vs MapReduce
Performance of Spark vs MapReduce
 
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
Spark Hadoop Tutorial | Spark Hadoop Example on NBA | Apache Spark Training |...
 
Learn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive GuideLearn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive Guide
 
Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...
Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...
Spark Interview Questions and Answers | Apache Spark Interview Questions | Sp...
 

Mais de Edureka!

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaEdureka!
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaEdureka!
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaEdureka!
 
Tableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaTableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaEdureka!
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaEdureka!
 
Top 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaTop 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaEdureka!
 
Top Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaTop Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaEdureka!
 
Linux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaLinux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaEdureka!
 
How to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaHow to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaEdureka!
 
Importance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaImportance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaEdureka!
 
RPA in 2020 | Edureka
RPA in 2020 | EdurekaRPA in 2020 | Edureka
RPA in 2020 | EdurekaEdureka!
 
Email Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEmail Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEdureka!
 
EA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEdureka!
 
Cognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaCognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaEdureka!
 
AWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaAWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaEdureka!
 
Blue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaBlue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaEdureka!
 
Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Edureka!
 
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaA star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaEdureka!
 
Kubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaKubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaEdureka!
 
Introduction to DevOps | Edureka
Introduction to DevOps | EdurekaIntroduction to DevOps | Edureka
Introduction to DevOps | EdurekaEdureka!
 

Mais de Edureka! (20)

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | Edureka
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | Edureka
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | Edureka
 
Tableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaTableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | Edureka
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | Edureka
 
Top 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaTop 5 PMP Certifications | Edureka
Top 5 PMP Certifications | Edureka
 
Top Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaTop Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | Edureka
 
Linux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaLinux Mint Tutorial | Edureka
Linux Mint Tutorial | Edureka
 
How to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaHow to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| Edureka
 
Importance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaImportance of Digital Marketing | Edureka
Importance of Digital Marketing | Edureka
 
RPA in 2020 | Edureka
RPA in 2020 | EdurekaRPA in 2020 | Edureka
RPA in 2020 | Edureka
 
Email Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEmail Notifications in Jenkins | Edureka
Email Notifications in Jenkins | Edureka
 
EA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | Edureka
 
Cognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaCognitive AI Tutorial | Edureka
Cognitive AI Tutorial | Edureka
 
AWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaAWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | Edureka
 
Blue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaBlue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | Edureka
 
Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka
 
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaA star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
 
Kubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaKubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | Edureka
 
Introduction to DevOps | Edureka
Introduction to DevOps | EdurekaIntroduction to DevOps | Edureka
Introduction to DevOps | Edureka
 

Último

Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 

Último (20)

Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 

Big Data Processing with Spark and Scala

  • 1. Big Data Processing With Spark and Scala http://www.edureka.co/apache-spark-scala-training
  • 2. Slide 2Slide 2 http://www.edureka.co/apache-spark-scala-training What is Big Data? What is Spark? Why Spark? Spark Ecosystem A note about Scala Why Scala? MapReduce vs Spark Hello Spark! Objectives of this Session
  • 3. Slide 3Slide 3 http://www.edureka.co/apache-spark-scala-training Big Data  Lots of Data (Terabytes or Petabytes)  Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications  The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization cloud tools statistics No SQL compression storage support database analyze information terabytes processing mobile Big Data
  • 4. Slide 4Slide 4 http://www.edureka.co/apache-spark-scala-training What is Spark?  Apache Spark is a general-purpose cluster in-memory computing system  Provides high-level APIs in Java, Scala and Python, and an optimized engine that supports general execution graphs  Provides various high level tools like Spark SQL for structured data processing, Mlib for Machine Learning and more.. High Level APIs High Level Tools More…
  • 5. Slide 5Slide 5 http://www.edureka.co/apache-spark-scala-training Why Spark? Cluster Manager Deployment via YARN  The Spark framework can be deployed through Apache Mesos, Apache Hadoop via Yarn, or Spark’s own cluster manager.
  • 6. Slide 6Slide 6 http://www.edureka.co/apache-spark-scala-training Why Spark? Polyglot Scala  Spark framework is polyglot – Can be programmed in several programming languages (Currently Scala, Java and Python supported).
  • 7. Slide 7Slide 7 http://www.edureka.co/apache-spark-scala-training Why Spark? A fully Apache Hive compatible data warehousing system that can run 100x faster than Hive. 100x faster than for certain applications.
  • 8. Slide 8Slide 8 http://www.edureka.co/apache-spark-scala-training Why Spark?  Provides powerful caching and disk persistence capabilities  Interactive Data Analysis  Faster Batch  Iterative Algorithms  Real-Time Stream Processing  Faster Decision-Making
  • 9. Slide 9Slide 9 http://www.edureka.co/apache-spark-scala-training Spark Community is Super Active!
  • 10. Slide 10Slide 10 http://www.edureka.co/apache-spark-scala-training Spark Ecosystem Spark Core Engine Aplha/Pre-alpha Shark (SQL) Spark Streaming (Streaming) MLLib (Machine learning) GraphX (Graph Computation) SparkR (R on Spark) BlindDB (Approximate SQL)
  • 11. Slide 11Slide 11 http://www.edureka.co/apache-spark-scala-training Spark Ecosystem (Contd.) Used for structured data. Can run unmodified hive queries on existing Hadoop deployment. Spark Core Engine Aplha/Pre-alpha Shark (SQL) Spark Streaming (Streaming) MLLib (Machine learning) GraphX (Graph Computation) SparkR (R on Spark) BlindDB (Approximate SQL) Enables analytical and interactive apps for live streaming data. An approximate query engine. To run over Core Spark Engine. Graph Computation engine. (Similar to Giraph) Package for R language to enable R-users to leverage Spark power from R shell. Machine learning library being built on top of Spark. Provision for support to many machine learning algorithms with speeds upto 100 times faster than Map-Reduce.
  • 12. Slide 12Slide 12 http://www.edureka.co/apache-spark-scala-training A Note on Scala  Scala is a general-purpose programming language designed to express common programming patterns in a concise, elegant, and type-safe way  Scala supports both Object Oriented Programming and Functional Programming  Scala is very much in fabric of present and Future Big Data frameworks like Scalding, Spark, Akka » All examples of Spark in class will be covered in Scala » Scala would be covered before Spark coverage as part of course!
  • 13. Slide 13Slide 13 http://www.edureka.co/apache-spark-scala-training Why Scala?  Scala is a pure object-oriented language. Conceptually, every value is an object and every operation is a method-call. The language supports advanced component architectures through classes and traits  Scala is also a functional language. Supports functions, immutable data structures and preference for immutability over mutation  Seamlessly integrated with Java  Being used heavily for future Big data and developments frameworks like Spark, Akka, Scalding, Play etc
  • 14. Slide 14Slide 14 http://www.edureka.co/apache-spark-scala-trainingSlide 14  If you want to do some Real Time Analytics, where you are expecting result quickly, Hadoop should not be used directly  Hadoop works on Batch processing, hence response time is high Day 1 Day 2 Day 3 Day 4 ......... ………. ………. Day n Day 1 Day 2 Day 3 Day 4 ......... ………. ………. Day n Input Data Processing Data Input Data Processing Data Input Data Processing Data Input Data Processing Data using MR Time Lag Real Time Analytics
  • 15. Slide 15Slide 15 http://www.edureka.co/apache-spark-scala-trainingSlide 15 Real Time Analytics – Accepted Way Streaming Data Storing
  • 16. Slide 16Slide 16 http://www.edureka.co/apache-spark-scala-trainingSlide 16 14 sec 0.6 sec MapReduce vs Spark