Introducing the Hadoop Ecosystem

Context: Performance Gap Trend




                            Introduction to the Hadoop Ecosystem
                                                                   2
Context: Exponential for Decades
 Abundance of
 - computing & storage
 - generated data (an estimated 8 ZB in 2015)
 - things
 More data provides greater value
 Traditional data processing doesn’t scale well
 It’s time for a new approach!




New Hardware Approach
Traditional               Big Data
 Exotic HW                 Commodity HW
  - big central servers     - racks of pizza boxes
  - SAN                     - Ethernet
  - RAID                    - JBOD
 Hardware reliability      Unreliable HW
 Limited scalability       Scales further
 Expensive                 Cost effective



New Software Approach
Traditional               Big Data
 Monolithic                Distributed
  - centralized             - storage & compute nodes
  - RDBMS
 Schema first              Raw data
 Proprietary               Open source
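
One common reading of "schema first" versus "raw data" is that the big data side stores records as they arrive and applies a schema only when the data is read. The sketch below illustrates that idea in plain Python. The sample events, the field names, and the `read_with_schema` helper are all hypothetical and are not taken from the slides.

```python
import json

# Raw lines kept exactly as they arrived; nothing was validated on write.
RAW_EVENTS = [
    '{"user": "alice", "action": "click", "ms": 120}',
    '{"user": "bob", "action": "view"}',
    'not json at all',
]


def read_with_schema(raw_lines, fields):
    """Apply a schema at read time by projecting each record onto `fields`.

    Fields missing from a record come back as None. Lines that are not
    JSON objects are skipped by this reader but stay in the raw store.
    """
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict):
            yield {field: record.get(field) for field in fields}


if __name__ == "__main__":
    for row in read_with_schema(RAW_EVENTS, ["user", "ms"]):
        print(row)
```

A different analysis can read the same raw lines with a different field list, without reloading or migrating the stored data.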




Hadoop
 De facto big data industry standard (batch)
 Vendor adoption
 - IBM, Microsoft, Oracle, EMC, ...
 A collection of projects at Apache
 - HDFS, MapReduce, Hive, Pig, HBase, Flume, Oozie, ...
 Main components
 - HDFS
 - MapReduce
 Cluster
 - set of machines running HDFS and MapReduce

HDFS
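
A minimal sketch of the idea behind HDFS: a file is cut into large blocks, and each block is stored on several machines. It assumes the stock defaults of 128 MB blocks (Hadoop 2.x and later) and three replicas per block. The round-robin placement and the node names are hypothetical simplifications; real HDFS placement is rack-aware and is decided by the NameNode.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # default block size in Hadoop 2.x and later
REPLICATION = 3                 # default HDFS replication factor


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the sizes (in bytes) of the blocks a file is cut into."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_replicas(num_blocks, nodes, replication=REPLICATION):
    """Assign every block to `replication` distinct nodes, round-robin.

    Hypothetical policy for illustration only, not the HDFS algorithm.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + i) % len(nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }


if __name__ == "__main__":
    blocks = split_into_blocks(300 * 1024 * 1024)  # a 300 MB file
    print(place_replicas(len(blocks), ["n1", "n2", "n3", "n4"]))
```

In this example the 300 MB file becomes three blocks, and losing any single node still leaves two copies of every block, which is what makes unreliable commodity hardware workable.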




MapReduce
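
A minimal sketch of the MapReduce programming model, written in plain Python rather than Hadoop's Java API. It runs the classic word count on a single machine to show the three steps: map, shuffle, reduce. The function names are illustrative assumptions, and on a real cluster the framework, not user code, performs the shuffle and distributes the map and reduce tasks across nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # Each line stands in for an input split that could be mapped on its own node.
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data silos"]))
```

Because each map call sees only its own line and each reduce call sees only one key, both phases can run in parallel on separate machines.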




Typical Adoption Pattern
 An idea that’s impractical without Hadoop
 Build Hadoop-based POC
 Move initial application to production
 Add more datasets and users
 - removing data silos in organizations
 - permitting easy experiments on real data
 Snowballs into the institution’s central repository for
 - analysis
 - data processing
 - data service layer

Use Case 1: Truvo




Use Case 2: UZ Brussel




How can you use Hadoop?
 What data are you ignoring?
 - How can you use it?

 How can you combine internal and external data?
 - Business partners
 - Feedback from your customers through social media
 - End your data silos
 - ...




DataCrunchers - Big Data Enablers




