Hadoop
Development
Series
By Sandeep Patil
4/11/2017
Introduction to Big Data
and Hadoop
What is Big Data?
• A very large amount of data.
• It's a popular term for the exponential growth of data.
• Big data is difficult to collect, store, maintain, analyze, and visualize.
Big Data characteristics
• Volume:
The large amount of data.
• Velocity:
The rate at which data is generated.
• Variety:
The different types of data:
- Structured data, e.g. MySQL tables
- Semi-structured data, e.g. XML, JSON
- Unstructured data, e.g. text, audio, video
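To make the three varieties concrete, here is a small Python sketch. The sample records (`row`, `doc`, `note`) and their field names are invented for illustration and do not come from the slides.

```python
import json

# Structured: fixed columns, as a row from a relational table (e.g. MySQL) would have.
columns = ("id", "name", "amount")
row = (1, "alice", 250.0)
structured = dict(zip(columns, row))

# Semi-structured: self-describing records whose fields may vary (JSON here).
doc = json.loads('{"id": 2, "name": "bob", "tags": ["new", "mobile"]}')

# Unstructured: free text with no schema; any structure has to be extracted.
note = "Customer called about a delayed refund and asked for a callback."
word_count = len(note.split())

print(structured["amount"], doc["tags"], word_count)
```

The structured row can be queried by column name right away, the JSON document has to be parsed first, and the free text only yields numbers after some processing (here, a naive word count).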
Big Data sources
• Social Media
• Banks
• Instruments
• Websites
• Stock Market
Use cases of Big Data
• Recommendation engines
• Analyzing Call Detail Records (CDRs)
• Fraud Detection
• Market Basket Analysis
• Sentiment Analysis
Hadoop Introduction
• An open-source framework that allows distributed processing of large datasets on a cluster of commodity hardware.
• Hadoop is a data management tool and uses scale-out storage.
Defining Hadoop Cluster
• The size of the data is the most important factor when defining a Hadoop cluster.
5 servers with 10 TB storage capacity each
Total storage capacity: 50 TB
Defining Hadoop Cluster
7 servers with 10 TB storage capacity each
Total storage capacity: 70 TB
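The sizing arithmetic on these two slides can be sketched in Python. The slides give only the raw totals; the usable-capacity step is an addition here and assumes HDFS's default replication factor of 3. The function names are hypothetical.

```python
def raw_capacity_tb(servers: int, tb_per_server: float) -> float:
    """Raw cluster capacity: the sum of every server's disk space."""
    return servers * tb_per_server


def usable_capacity_tb(raw_tb: float, replication_factor: int = 3) -> float:
    """Rough usable capacity when every block is stored replication_factor times.

    Assumption (not from the slides): replication factor 3, the HDFS default.
    """
    return raw_tb / replication_factor


for servers in (5, 7):
    raw = raw_capacity_tb(servers, 10)
    print(f"{servers} servers: raw {raw:.0f} TB, usable about {usable_capacity_tb(raw):.1f} TB")
```

Scaling out from 5 to 7 servers adds 20 TB of raw capacity without changing the existing machines, which is the point of scale-out storage.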
Hadoop Components
• Hadoop 1 components
- HDFS (Hadoop Distributed File System)
- MapReduce
• Hadoop 2 components
- HDFS (Hadoop Distributed File System)
- YARN/MRv2
HDFS: storage (reads and writes)
MR/YARN: processing
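As a rough illustration of the processing side, the MapReduce model can be imitated in a single Python process. This is only a sketch of the map, shuffle, and reduce steps using word count; it does not use the Hadoop API, and all function names are made up for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data big cluster", "data node"])
print(counts)
```

In a real cluster the map and reduce calls run on different machines and the input lines come from HDFS blocks; here everything runs locally to show the data flow only.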
Hadoop Daemons
• Hadoop 1 daemons
NameNode
DataNode
Secondary NameNode
JobTracker
TaskTracker
HDFS: NameNode, DataNode
MapReduce: JobTracker, TaskTracker
Hadoop Daemons
• Hadoop 2 daemons
NameNode
DataNode
Secondary NameNode
ResourceManager
NodeManager
HDFS: NameNode, DataNode
YARN: ResourceManager, NodeManager
Hadoop Master-Slave Architecture
Layer     Master            Slave
HDFS      NameNode          DataNode
MR/YARN   ResourceManager   NodeManager
Hadoop Cluster
• Assume that we have a Hadoop cluster with 4 nodes.
Master: NameNode, ResourceManager
Slave: DataNode, NodeManager
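One way to picture the 4-node layout is as a mapping from node to the daemons it runs. The sketch below assumes one master and three slaves, which the slide does not state explicitly, and the host names (`master1`, `slave1`, ...) are hypothetical.

```python
MASTER_DAEMONS = ("NameNode", "ResourceManager")
SLAVE_DAEMONS = ("DataNode", "NodeManager")


def build_cluster(num_nodes, num_masters=1):
    """Return {hostname: daemons} for a small cluster.

    Assumption: the first num_masters nodes are masters, the rest are slaves.
    """
    if num_nodes <= num_masters:
        raise ValueError("a cluster needs at least one slave node")
    layout = {}
    for i in range(num_nodes):
        if i < num_masters:
            layout[f"master{i + 1}"] = MASTER_DAEMONS
        else:
            layout[f"slave{i - num_masters + 1}"] = SLAVE_DAEMONS
    return layout


cluster = build_cluster(4)
for host, daemons in cluster.items():
    print(host, "->", ", ".join(daemons))
```

Each slave runs both a storage daemon (DataNode) and a processing daemon (NodeManager), so computation can be scheduled on the machines that already hold the data.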
Secondary NameNode
• The Secondary NameNode is not a hot backup for the NameNode.
• It just takes an hourly backup of the NameNode metadata.
• It can be used to restart a crashed Hadoop cluster.
• The Secondary NameNode is an important daemon in Hadoop 1; in Hadoop 2 it is less important.
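The hourly backup is a checkpoint: the Secondary NameNode merges the NameNode's saved namespace image (fsimage) with the edit log of changes made since that image was written. The toy model below sketches that merge; the two edit operations and the file paths are simplifications invented for the example, not the real on-disk formats.

```python
def apply_edit(namespace, edit):
    """Apply one logged change to an in-memory set of file paths."""
    op, path = edit
    if op == "create":
        namespace.add(path)
    elif op == "delete":
        namespace.discard(path)
    else:
        raise ValueError(f"unknown operation: {op}")


def checkpoint(fsimage, edit_log):
    """Merge the edit log into the image; return (new image, emptied edit log)."""
    merged = set(fsimage)
    for edit in edit_log:
        apply_edit(merged, edit)
    return frozenset(merged), []


fsimage = frozenset({"/data/a.txt"})
edit_log = [("create", "/data/b.txt"), ("delete", "/data/a.txt")]

new_fsimage, new_edit_log = checkpoint(fsimage, edit_log)
print(sorted(new_fsimage), new_edit_log)
```

Because the checkpoint is only as fresh as the last merge, changes made after it can be lost if the NameNode fails, which is why the Secondary NameNode is not a hot backup.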
Modes of Operation
• Standalone
• Pseudo-Distributed
• Fully Distributed
Next Video
• Comparison between Hadoop 1 and Hadoop 2
Like and Subscribe
sdp117@gmail.com
