The State of the Apache
  Hadoop Ecosystem

         Doug Cutting
      Cloudera & Apache
Outline
● the ecosystem
    ○   why we need it
    ○   what it is
    ○   why it's strong
    ○   how it can evolve
●   highlights
    ○ current
    ○ next
●   wrap up
Why are we here?

Hardware has improved
  ● exponentially for decades
  ● both storage and compute

We can now store and process much more!
  ○ yet have been slow to leverage


Analyzing more data makes us smarter.
  ○ Norvig's Unreasonable Effectiveness of Data
The Ecosystem is the System
● Hadoop has become the kernel
  ○ of the distributed operating system for Big Data
  ○ a de-facto industry standard


● No one uses the kernel alone

● A collection of projects at Apache
Strengths of Apache
Mandates diversity & transparency
  ○ you control your fate

Insures against vendor lock-in
   ○ can't buy the ASF

Allows competing projects
    ○ survival of the fittest

Ecosystem as loose federation
   ○ lets platform evolve
What's new?
● Apache Hadoop 0.20.205
    ○ append
    ○ security


●   CDH3
    ○ Mahout included
    ○ Avro support across components
What's next?
● Apache Hadoop 0.23
   ○ HDFS
     ■ performance
     ■ scalability (federation)
     ■ availability (HA)
   ○ MR2


● CDH4
   ○ includes Hadoop 0.23
   ○ BigTop-based


● S4, Giraph, Crunch, Blur, ...
Apache BigTop (incubating)
Ecosystem as a project
  ○   integration tests
  ○   compatible versions
  ○   common packaging
  ○   release is a set

Basis for CDH
  ○ like Fedora is for RHEL

Community driven

Includes:
  ●   Hadoop
  ●   HBase
  ●   Zookeeper
  ●   Avro
  ●   Hive
  ●   Pig
  ●   Oozie
  ●   Flume
  ●   Mahout
  ●   ...
Join the community
Hadoop and Big Data are still young.
  Hardware trends will continue.

Hadoop started with just two developers.
  Now it has hundreds.
  You can be the next.
  What do you need?
Thanks!
Questions?


Hadoop World 2011 Keynote: The State of the Apache Hadoop Ecosystem
