Apache Hadoop
Now, Next, and Beyond

Shaun Connolly
VP Corporate Strategy, Hortonworks

April 19, 2012




© Hortonworks Inc. 2012
Big Data: Transactions + Interactions + Observations
[Figure: data sources plotted by data volume (Megabytes, Gigabytes, Terabytes, Petabytes) against increasing data variety and complexity, with region labels ERP, CRM, Web, and BIG DATA.]
Data sources shown: Purchase detail; Purchase record; Payment record; Segmentation; Offer details; Customer Touches; Support Contacts; Web logs; Offer history; A/B testing; Dynamic Pricing; Affiliate Networks; Search Marketing; Behavioral Targeting; Dynamic Funnels; User Generated Content; Mobile Web; Social Interactions & Feeds; Sentiment; User Click Stream; Sensors / RFID / Devices; Spatial & GPS Coordinates; External Demographics; Business Data Feeds; HD Video, Audio, Images; Speech to Text; Product/Service Logs; SMS/MMS

What is Apache Hadoop?


• Collection of Open Source Projects
   – Apache Software Foundation (ASF)
   – Loosely coupled, ship early/often

• Solution for big data
   – Stores petabytes of data reliably
   – Runs highly distributed applications
   – Enables a rational economics model
   – Powers data-driven business

One of the best examples of open source driving innovation and creating a market



Key Hadoop Stack Components
Core Components
• HDFS (Hadoop Distributed File System)
• MapReduce (Distributed Programming Framework)
• HCatalog (Table & Schema Management)
• Pig (Data Flow)
• Hive (SQL-like Access)
• HBase (Columnar NoSQL Store)
• ZooKeeper (Cluster Coordination)

Extended Components
• Ambari & Other Monitoring & Management
• Oozie & Other Workflow Scheduling
• Sqoop & Other Ingest, ETL tools
• Mahout & Other Libraries
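
As a rough illustration of the programming model behind the MapReduce component listed on this slide (not Hadoop's actual Java API), the single-process Python sketch below counts words in three steps: map, shuffle, and reduce. The function names map_phase, shuffle, reduce_phase, and word_count are hypothetical and chosen only for this example. On a real cluster the framework would distribute these steps across machines and read its input from HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence inside one process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

Because each map call looks at a single line and each reduce call at a single key, both steps can in principle run in parallel on separate machines, which is the property the framework relies on.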




Hadoop Now, Next, and Beyond
  The Apache community, including Hortonworks, is investing to improve Hadoop:
  • Make Hadoop an open, extensible, and enterprise-viable platform
  • Enable more applications to run on Apache Hadoop

  “Hadoop.Now” (Hadoop 1.0), HDP 1: Most stable Hadoop ever
  “Hadoop.Next” (Hadoop 0.23), HDP 2: Next-gen HDFS & MapReduce
  “Hadoop.Beyond”: Integrate w/ecosystem




Unifying Classic & Big Data Methods

Classic Method: Structured & Repeatable Analysis
• Business determines what questions to ask
• IT structures the data to answer those questions
• “Capture only what’s needed”
• SQL Performance and Structure

Big Data Method: Multi-structured & Iterative Analysis
• IT delivers a platform for storing, refining, and analyzing all data sources
• Business explores data for questions worth answering
• “Capture in case it’s needed”
• MapReduce Processing Flexibility



Unified Big Data Architecture
Enable Developers, Data Scientists, & Information Workers




      Java, C/C++, Pig, JavaScript, Python, R, SAS, SQL, Excel, BI Tools, Reporting, etc.




                        Capture, Store, Refine, Discover, Analyze, Report, Retain

  Batch
  •   Fast data loading
  •   ELT/ETL and refinement
  •   Image/video analysis
  •   Online retention

  Interactive
  •   Path & pattern analysis
  •   Graph analysis
  •   Text analysis
  •   Iterative discovery

  Active
  •   Operational analysis
  •   Transactional analysis
  •   High volume ad-hoc
  •   Elastic data marts

  Data sources: Audio, Video & Images; Docs & Text; Machine Logs; Coords & Sensors; Social Content; Web & Mobile; CRM; SCM; ERP


Hortonworks Vision


   We believe that by the end of 2015,
   more than half the world's data will
   be processed by Apache Hadoop.


                       Q: How do we achieve that vision?
                       A: Ecosystem enablement around an enterprise-viable
                             open source data platform

•   2-day event (June 13-14, 2012) in San Jose, CA
•   84 breakout sessions
•   Showcasing real-world examples, developments and
    best practices of Apache Hadoop
•   Plus, Geoffrey Moore to keynote and more to be
    announced
•   Register now at: http://www.hadoopsummit.org


More Related Content

What's hot

Tackling big data with hadoop and open source integration
Tackling big data with hadoop and open source integrationTackling big data with hadoop and open source integration
Tackling big data with hadoop and open source integrationDataWorks Summit
 
Impact of in-memory technology and SAP HANA (2012 Update)
Impact of in-memory technology and SAP HANA (2012 Update)Impact of in-memory technology and SAP HANA (2012 Update)
Impact of in-memory technology and SAP HANA (2012 Update)Vitaliy Rudnytskiy
 
Hadoop for shanghai dev meetup
Hadoop for shanghai dev meetupHadoop for shanghai dev meetup
Hadoop for shanghai dev meetupRoby Chen
 
Hadoop as Data Refinery - Steve Loughran
Hadoop as Data Refinery - Steve LoughranHadoop as Data Refinery - Steve Loughran
Hadoop as Data Refinery - Steve LoughranJAX London
 
Hadoop as data refinery
Hadoop as data refineryHadoop as data refinery
Hadoop as data refinerySteve Loughran
 
Big Data launch Singapore Patrick Buddenbaum
Big Data launch Singapore Patrick BuddenbaumBig Data launch Singapore Patrick Buddenbaum
Big Data launch Singapore Patrick BuddenbaumIntelAPAC
 
Exploring Data with Jaspersoft
Exploring Data with JaspersoftExploring Data with Jaspersoft
Exploring Data with JaspersoftMike Boyarski
 
Big Data launch keynote Singapore Patrick Buddenbaum
Big Data launch keynote Singapore Patrick BuddenbaumBig Data launch keynote Singapore Patrick Buddenbaum
Big Data launch keynote Singapore Patrick BuddenbaumIntelAPAC
 
Analytics on Hadoop
Analytics on HadoopAnalytics on Hadoop
Analytics on HadoopEMC
 
Hadoop World 2011: Unlocking the Value of Big Data with Oracle - Jean-Pierre ...
Hadoop World 2011: Unlocking the Value of Big Data with Oracle - Jean-Pierre ...Hadoop World 2011: Unlocking the Value of Big Data with Oracle - Jean-Pierre ...
Hadoop World 2011: Unlocking the Value of Big Data with Oracle - Jean-Pierre ...Cloudera, Inc.
 
Talk IT_ Oracle_김태완_110831
Talk IT_ Oracle_김태완_110831Talk IT_ Oracle_김태완_110831
Talk IT_ Oracle_김태완_110831Cana Ko
 
Introducing Jaspersoft 5
Introducing Jaspersoft 5Introducing Jaspersoft 5
Introducing Jaspersoft 5Mike Boyarski
 
Embedded Analytics in your App Webinar
Embedded Analytics in your App WebinarEmbedded Analytics in your App Webinar
Embedded Analytics in your App WebinarMike Boyarski
 
Evaluating jaspersoft community & commercial editions
Evaluating jaspersoft community & commercial editionsEvaluating jaspersoft community & commercial editions
Evaluating jaspersoft community & commercial editionsMike Boyarski
 
Jaspersoft Dashboards Webinar Feb 2013
Jaspersoft Dashboards Webinar  Feb 2013Jaspersoft Dashboards Webinar  Feb 2013
Jaspersoft Dashboards Webinar Feb 2013Mike Boyarski
 
A unified data modeler in the world of big data
A unified data modeler in the world of big dataA unified data modeler in the world of big data
A unified data modeler in the world of big dataWilliam Luk
 
Sap sap so h 2013
Sap sap so h 2013Sap sap so h 2013
Sap sap so h 2013deepersnet
 
Microsoft SQL Azure - Cloud Based Database Datasheet
Microsoft SQL Azure - Cloud Based Database DatasheetMicrosoft SQL Azure - Cloud Based Database Datasheet
Microsoft SQL Azure - Cloud Based Database DatasheetMicrosoft Private Cloud
 
HugeTable:Application-Oriented Structure Data Storage System
HugeTable:Application-Oriented Structure Data Storage SystemHugeTable:Application-Oriented Structure Data Storage System
HugeTable:Application-Oriented Structure Data Storage Systemqlw5
 
Why Every NoSQL Deployment Should Be Paired with Hadoop Webinar
Why Every NoSQL Deployment Should Be Paired with Hadoop WebinarWhy Every NoSQL Deployment Should Be Paired with Hadoop Webinar
Why Every NoSQL Deployment Should Be Paired with Hadoop WebinarCloudera, Inc.
 

What's hot (20)

Tackling big data with hadoop and open source integration
Tackling big data with hadoop and open source integrationTackling big data with hadoop and open source integration
Tackling big data with hadoop and open source integration
 
Impact of in-memory technology and SAP HANA (2012 Update)
Impact of in-memory technology and SAP HANA (2012 Update)Impact of in-memory technology and SAP HANA (2012 Update)
Impact of in-memory technology and SAP HANA (2012 Update)
 
Hadoop for shanghai dev meetup
Hadoop for shanghai dev meetupHadoop for shanghai dev meetup
Hadoop for shanghai dev meetup
 
Hadoop as Data Refinery - Steve Loughran
Hadoop as Data Refinery - Steve LoughranHadoop as Data Refinery - Steve Loughran
Hadoop as Data Refinery - Steve Loughran
 
Hadoop as data refinery
Hadoop as data refineryHadoop as data refinery
Hadoop as data refinery
 
Big Data launch Singapore Patrick Buddenbaum
Big Data launch Singapore Patrick BuddenbaumBig Data launch Singapore Patrick Buddenbaum
Big Data launch Singapore Patrick Buddenbaum
 
Exploring Data with Jaspersoft
Exploring Data with JaspersoftExploring Data with Jaspersoft
Exploring Data with Jaspersoft
 
Big Data launch keynote Singapore Patrick Buddenbaum
Big Data launch keynote Singapore Patrick BuddenbaumBig Data launch keynote Singapore Patrick Buddenbaum
Big Data launch keynote Singapore Patrick Buddenbaum
 
Analytics on Hadoop
Analytics on HadoopAnalytics on Hadoop
Analytics on Hadoop
 
Hadoop World 2011: Unlocking the Value of Big Data with Oracle - Jean-Pierre ...
Hadoop World 2011: Unlocking the Value of Big Data with Oracle - Jean-Pierre ...Hadoop World 2011: Unlocking the Value of Big Data with Oracle - Jean-Pierre ...
Hadoop World 2011: Unlocking the Value of Big Data with Oracle - Jean-Pierre ...
 
Talk IT_ Oracle_김태완_110831
Talk IT_ Oracle_김태완_110831Talk IT_ Oracle_김태완_110831
Talk IT_ Oracle_김태완_110831
 
Introducing Jaspersoft 5
Introducing Jaspersoft 5Introducing Jaspersoft 5
Introducing Jaspersoft 5
 
Embedded Analytics in your App Webinar
Embedded Analytics in your App WebinarEmbedded Analytics in your App Webinar
Embedded Analytics in your App Webinar
 
Evaluating jaspersoft community & commercial editions
Evaluating jaspersoft community & commercial editionsEvaluating jaspersoft community & commercial editions
Evaluating jaspersoft community & commercial editions
 
Jaspersoft Dashboards Webinar Feb 2013
Jaspersoft Dashboards Webinar  Feb 2013Jaspersoft Dashboards Webinar  Feb 2013
Jaspersoft Dashboards Webinar Feb 2013
 
A unified data modeler in the world of big data
A unified data modeler in the world of big dataA unified data modeler in the world of big data
A unified data modeler in the world of big data
 
Sap sap so h 2013
Sap sap so h 2013Sap sap so h 2013
Sap sap so h 2013
 
Microsoft SQL Azure - Cloud Based Database Datasheet
Microsoft SQL Azure - Cloud Based Database DatasheetMicrosoft SQL Azure - Cloud Based Database Datasheet
Microsoft SQL Azure - Cloud Based Database Datasheet
 
HugeTable:Application-Oriented Structure Data Storage System
HugeTable:Application-Oriented Structure Data Storage SystemHugeTable:Application-Oriented Structure Data Storage System
HugeTable:Application-Oriented Structure Data Storage System
 
Why Every NoSQL Deployment Should Be Paired with Hadoop Webinar
Why Every NoSQL Deployment Should Be Paired with Hadoop WebinarWhy Every NoSQL Deployment Should Be Paired with Hadoop Webinar
Why Every NoSQL Deployment Should Be Paired with Hadoop Webinar
 

Similar to Hadoop - Now, Next and Beyond

Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptxHortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptxHortonworks
 
Introduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for WindowsIntroduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for WindowsHortonworks
 
Hadoop's Role in the Big Data Architecture, OW2con'12, Paris
Hadoop's Role in the Big Data Architecture, OW2con'12, ParisHadoop's Role in the Big Data Architecture, OW2con'12, Paris
Hadoop's Role in the Big Data Architecture, OW2con'12, ParisOW2
 
Apache Hadoop Now Next and Beyond
Apache Hadoop Now Next and BeyondApache Hadoop Now Next and Beyond
Apache Hadoop Now Next and BeyondDataWorks Summit
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsHortonworks
 
Hw09 Data Processing In The Enterprise
Hw09   Data Processing In The EnterpriseHw09   Data Processing In The Enterprise
Hw09 Data Processing In The EnterpriseCloudera, Inc.
 
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...Big Data Spain
 
Keynote from ApacheCon NA 2011
Keynote from ApacheCon NA 2011Keynote from ApacheCon NA 2011
Keynote from ApacheCon NA 2011Hortonworks
 
Why hadoop for data science?
Why hadoop for data science?Why hadoop for data science?
Why hadoop for data science?Hortonworks
 
Apache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingApache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingm_hepburn
 
Hortonworks roadshow
Hortonworks roadshowHortonworks roadshow
Hortonworks roadshowAccenture
 
Webinar | From Zero to Big Data Answers in Less Than an Hour – Live Demo Slides
Webinar | From Zero to Big Data Answers in Less Than an Hour – Live Demo SlidesWebinar | From Zero to Big Data Answers in Less Than an Hour – Live Demo Slides
Webinar | From Zero to Big Data Answers in Less Than an Hour – Live Demo SlidesCloudera, Inc.
 
Introduction to Microsoft HDInsight and BI Tools
Introduction to Microsoft HDInsight and BI ToolsIntroduction to Microsoft HDInsight and BI Tools
Introduction to Microsoft HDInsight and BI ToolsDataWorks Summit
 
The Forrester Wave Enterprise Hadoop Solutions Q1 2012
The Forrester Wave Enterprise Hadoop Solutions Q1 2012The Forrester Wave Enterprise Hadoop Solutions Q1 2012
The Forrester Wave Enterprise Hadoop Solutions Q1 2012m_hepburn
 
Talend Open Studio and Hortonworks Data Platform
Talend Open Studio and Hortonworks Data PlatformTalend Open Studio and Hortonworks Data Platform
Talend Open Studio and Hortonworks Data PlatformHortonworks
 
HDP-1 introduction for HUG France
HDP-1 introduction for HUG FranceHDP-1 introduction for HUG France
HDP-1 introduction for HUG FranceSteve Loughran
 
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011Cloudera, Inc.
 

Similar to Hadoop - Now, Next and Beyond (20)

Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptxHortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
 
Hadoop Trends
Hadoop TrendsHadoop Trends
Hadoop Trends
 
Introduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for WindowsIntroduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for Windows
 
Hadoop's Role in the Big Data Architecture, OW2con'12, Paris
Hadoop's Role in the Big Data Architecture, OW2con'12, ParisHadoop's Role in the Big Data Architecture, OW2con'12, Paris
Hadoop's Role in the Big Data Architecture, OW2con'12, Paris
 
Apache Hadoop Now Next and Beyond
Apache Hadoop Now Next and BeyondApache Hadoop Now Next and Beyond
Apache Hadoop Now Next and Beyond
 
Cloud computing era
Cloud computing eraCloud computing era
Cloud computing era
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data Analytics
 
Zh tw cloud computing era
Zh tw cloud computing eraZh tw cloud computing era
Zh tw cloud computing era
 
Hw09 Data Processing In The Enterprise
Hw09   Data Processing In The EnterpriseHw09   Data Processing In The Enterprise
Hw09 Data Processing In The Enterprise
 
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
 
Keynote from ApacheCon NA 2011
Keynote from ApacheCon NA 2011Keynote from ApacheCon NA 2011
Keynote from ApacheCon NA 2011
 
Why hadoop for data science?
Why hadoop for data science?Why hadoop for data science?
Why hadoop for data science?
 
Apache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingApache hadoop bigdata-in-banking
Apache hadoop bigdata-in-banking
 
Hortonworks roadshow
Hortonworks roadshowHortonworks roadshow
Hortonworks roadshow
 
Webinar | From Zero to Big Data Answers in Less Than an Hour – Live Demo Slides
Webinar | From Zero to Big Data Answers in Less Than an Hour – Live Demo SlidesWebinar | From Zero to Big Data Answers in Less Than an Hour – Live Demo Slides
Webinar | From Zero to Big Data Answers in Less Than an Hour – Live Demo Slides
 
Introduction to Microsoft HDInsight and BI Tools
Introduction to Microsoft HDInsight and BI ToolsIntroduction to Microsoft HDInsight and BI Tools
Introduction to Microsoft HDInsight and BI Tools
 
The Forrester Wave Enterprise Hadoop Solutions Q1 2012
The Forrester Wave Enterprise Hadoop Solutions Q1 2012The Forrester Wave Enterprise Hadoop Solutions Q1 2012
The Forrester Wave Enterprise Hadoop Solutions Q1 2012
 
Talend Open Studio and Hortonworks Data Platform
Talend Open Studio and Hortonworks Data PlatformTalend Open Studio and Hortonworks Data Platform
Talend Open Studio and Hortonworks Data Platform
 
HDP-1 introduction for HUG France
HDP-1 introduction for HUG FranceHDP-1 introduction for HUG France
HDP-1 introduction for HUG France
 
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
 

More from Teradata Aster

Razorfish Multi-Channel Marketing: Better Customer Segmentation and Targeting
Razorfish Multi-Channel Marketing: Better Customer Segmentation and TargetingRazorfish Multi-Channel Marketing: Better Customer Segmentation and Targeting
Razorfish Multi-Channel Marketing: Better Customer Segmentation and TargetingTeradata Aster
 
Big Data Decision-Making
Big Data Decision-MakingBig Data Decision-Making
Big Data Decision-MakingTeradata Aster
 
Using Data to Manage in Today’s Chaotic Environment
Using Data to Manage in Today’s Chaotic EnvironmentUsing Data to Manage in Today’s Chaotic Environment
Using Data to Manage in Today’s Chaotic EnvironmentTeradata Aster
 
Big Analytics 2012 Event Survey Data
Big Analytics 2012 Event Survey DataBig Analytics 2012 Event Survey Data
Big Analytics 2012 Event Survey DataTeradata Aster
 
What Makes A Great Data Scientist?
What Makes A Great Data Scientist?What Makes A Great Data Scientist?
What Makes A Great Data Scientist?Teradata Aster
 
Practical Applications of Visual Analytics
Practical Applications of Visual AnalyticsPractical Applications of Visual Analytics
Practical Applications of Visual AnalyticsTeradata Aster
 
Trust and Influence in the Complex Network of Social Media
Trust and Influence in the Complex Network of Social MediaTrust and Influence in the Complex Network of Social Media
Trust and Influence in the Complex Network of Social MediaTeradata Aster
 
Turning Big Data to Business Advantage
Turning Big Data to Business AdvantageTurning Big Data to Business Advantage
Turning Big Data to Business AdvantageTeradata Aster
 
Big Brands Meet Big Data – The Newest Innovator’s Dilemma
Big Brands Meet Big Data – The Newest Innovator’s DilemmaBig Brands Meet Big Data – The Newest Innovator’s Dilemma
Big Brands Meet Big Data – The Newest Innovator’s DilemmaTeradata Aster
 
Simplifying Big Data Analytics for the Business
Simplifying Big Data Analytics for the BusinessSimplifying Big Data Analytics for the Business
Simplifying Big Data Analytics for the BusinessTeradata Aster
 
Evaluating Big Data Predictive Analytics Platforms
Evaluating Big Data Predictive Analytics PlatformsEvaluating Big Data Predictive Analytics Platforms
Evaluating Big Data Predictive Analytics PlatformsTeradata Aster
 
Keynote: Cross Industry Lessons from Moneyball Analytics
Keynote: Cross Industry Lessons from Moneyball AnalyticsKeynote: Cross Industry Lessons from Moneyball Analytics
Keynote: Cross Industry Lessons from Moneyball AnalyticsTeradata Aster
 
Technology Strategies for Big Data Analytics,
Technology Strategies for Big Data Analytics, Technology Strategies for Big Data Analytics,
Technology Strategies for Big Data Analytics, Teradata Aster
 
From Data Science to Business Value - Analytics Applied
From Data Science to Business Value - Analytics AppliedFrom Data Science to Business Value - Analytics Applied
From Data Science to Business Value - Analytics AppliedTeradata Aster
 
Solving the Education Crisis with Big Data
Solving the Education Crisis with Big DataSolving the Education Crisis with Big Data
Solving the Education Crisis with Big DataTeradata Aster
 
Using SQL-MapReduce for Advanced Analytics
Using SQL-MapReduce for Advanced AnalyticsUsing SQL-MapReduce for Advanced Analytics
Using SQL-MapReduce for Advanced AnalyticsTeradata Aster
 
SAS aster data big data dc presentation public
SAS aster data big data dc presentation publicSAS aster data big data dc presentation public
SAS aster data big data dc presentation publicTeradata Aster
 
Utilizing Aster nCluster to support processing in excess of 100 Billion rows ...
Utilizing Aster nCluster to support processing in excess of 100 Billion rows ...Utilizing Aster nCluster to support processing in excess of 100 Billion rows ...
Utilizing Aster nCluster to support processing in excess of 100 Billion rows ...Teradata Aster
 
20100506 aster data big data summit - microstrategy (shareable)
20100506   aster data big data summit - microstrategy (shareable)20100506   aster data big data summit - microstrategy (shareable)
20100506 aster data big data summit - microstrategy (shareable)Teradata Aster
 

More from Teradata Aster (20)

Razorfish Multi-Channel Marketing: Better Customer Segmentation and Targeting
Razorfish Multi-Channel Marketing: Better Customer Segmentation and TargetingRazorfish Multi-Channel Marketing: Better Customer Segmentation and Targeting
Razorfish Multi-Channel Marketing: Better Customer Segmentation and Targeting
 
Big Data Decision-Making
Big Data Decision-MakingBig Data Decision-Making
Big Data Decision-Making
 
Using Data to Manage in Today’s Chaotic Environment
Using Data to Manage in Today’s Chaotic EnvironmentUsing Data to Manage in Today’s Chaotic Environment
Using Data to Manage in Today’s Chaotic Environment
 
Big Analytics 2012 Event Survey Data
Big Analytics 2012 Event Survey DataBig Analytics 2012 Event Survey Data
Big Analytics 2012 Event Survey Data
 
What Makes A Great Data Scientist?
What Makes A Great Data Scientist?What Makes A Great Data Scientist?
What Makes A Great Data Scientist?
 
Practical Applications of Visual Analytics
Practical Applications of Visual AnalyticsPractical Applications of Visual Analytics
Practical Applications of Visual Analytics
 
Trust and Influence in the Complex Network of Social Media
Trust and Influence in the Complex Network of Social MediaTrust and Influence in the Complex Network of Social Media
Trust and Influence in the Complex Network of Social Media
 
Turning Big Data to Business Advantage
Turning Big Data to Business AdvantageTurning Big Data to Business Advantage
Turning Big Data to Business Advantage
 
Big Brands Meet Big Data – The Newest Innovator’s Dilemma
Big Brands Meet Big Data – The Newest Innovator’s DilemmaBig Brands Meet Big Data – The Newest Innovator’s Dilemma
Big Brands Meet Big Data – The Newest Innovator’s Dilemma
 
Simplifying Big Data Analytics for the Business
Simplifying Big Data Analytics for the BusinessSimplifying Big Data Analytics for the Business
Simplifying Big Data Analytics for the Business
 
Evaluating Big Data Predictive Analytics Platforms
Evaluating Big Data Predictive Analytics PlatformsEvaluating Big Data Predictive Analytics Platforms
Evaluating Big Data Predictive Analytics Platforms
 
Keynote: Cross Industry Lessons from Moneyball Analytics
Keynote: Cross Industry Lessons from Moneyball AnalyticsKeynote: Cross Industry Lessons from Moneyball Analytics
Keynote: Cross Industry Lessons from Moneyball Analytics
 
Technology Strategies for Big Data Analytics,
Technology Strategies for Big Data Analytics, Technology Strategies for Big Data Analytics,
Technology Strategies for Big Data Analytics,
 
From Data Science to Business Value - Analytics Applied
From Data Science to Business Value - Analytics AppliedFrom Data Science to Business Value - Analytics Applied
From Data Science to Business Value - Analytics Applied
 
Solving the Education Crisis with Big Data
Solving the Education Crisis with Big DataSolving the Education Crisis with Big Data
Solving the Education Crisis with Big Data
 
Using SQL-MapReduce for Advanced Analytics
Using SQL-MapReduce for Advanced AnalyticsUsing SQL-MapReduce for Advanced Analytics
Using SQL-MapReduce for Advanced Analytics
 
SAS aster data big data dc presentation public
SAS aster data big data dc presentation publicSAS aster data big data dc presentation public
SAS aster data big data dc presentation public
 
Utilizing Aster nCluster to support processing in excess of 100 Billion rows ...
Utilizing Aster nCluster to support processing in excess of 100 Billion rows ...Utilizing Aster nCluster to support processing in excess of 100 Billion rows ...
Utilizing Aster nCluster to support processing in excess of 100 Billion rows ...
 
comScore
comScorecomScore
comScore
 
20100506 aster data big data summit - microstrategy (shareable)
20100506   aster data big data summit - microstrategy (shareable)20100506   aster data big data summit - microstrategy (shareable)
20100506 aster data big data summit - microstrategy (shareable)
 

Hadoop - Now, Next and Beyond

  • 1. Apache Hadoop: Now, Next, and Beyond. Shaun Connolly, VP Corporate Strategy, Hortonworks. April 19, 2012. © Hortonworks Inc. 2012
  • 2. Big Data: Transactions + Interactions + Observations
      [Figure: nested regions of data sources. Vertical axis: data volume (Megabytes, Gigabytes, Terabytes, Petabytes). Horizontal axis: increasing data variety and complexity.]
      • ERP (Megabytes): Purchase detail, Purchase record, Payment record
      • CRM (Gigabytes): Segmentation, Offer details, Customer Touches, Support Contacts
      • Web (Terabytes): Web logs, Offer history, A/B testing, Dynamic Pricing, Affiliate Networks, Search Marketing, Behavioral Targeting, Dynamic Funnels
      • BIG DATA (Petabytes): User Generated Content, Mobile Web, Sentiment, User Click Stream, Social Interactions & Feeds, Spatial & GPS Coordinates, Sensors / RFID / Devices, External Demographics, Business Data Feeds, HD Video, Audio, Images, Speech to Text, Product/Service Logs, SMS/MMS
  • 3. What is Apache Hadoop?
      • Collection of Open Source Projects
          – Apache Software Foundation (ASF)
          – Loosely coupled, ship early/often
      • Solution for big data
          – Stores petabytes of data reliably
          – Runs highly distributed applications
          – Enables a rational economics model
          – Powers data-driven business
      Callout: One of the best examples of open source driving innovation and creating a market
  • 4. Key Hadoop Stack Components
      • Core Components
          – HDFS (Hadoop Distributed File System)
          – MapReduce (Distributed Programming Framework)
          – Pig (Data Flow)
          – Hive (SQL-like Access)
          – HBase (Columnar NoSQL Store)
          – HCatalog (Table & Schema Management)
          – Zookeeper (Cluster Coordination)
      • Extended Components
          – Ambari & Other Monitoring & Management
          – Oozie & Other Workflow Scheduling
          – Sqoop & Other Ingest, ETL tools
          – Mahout & Other Libraries
  • 5. Hadoop Now, Next, and Beyond
      The Apache community, including Hortonworks, is investing to improve Hadoop:
      • Make Hadoop an open, extensible, and enterprise-viable platform
      • Enable more applications to run on Apache Hadoop
      Roadmap:
          – “Hadoop.Now” (Hadoop 1.0), HDP 1: Most stable Hadoop ever
          – “Hadoop.Next” (Hadoop 0.23), HDP 2: Next-gen HDFS & MapReduce
          – “Hadoop.Beyond”: Integrate w/ecosystem
  • 6. Unifying Classic & Big Data Methods
      • Classic Method: Structured & Repeatable Analysis
          – Business determines what questions to ask
          – IT structures the data to answer those questions
          – SQL: Performance and Structure
          – “Capture only what’s needed”
      • Big Data Method: Multi-structured & Iterative Analysis
          – IT delivers a platform for storing, refining, and analyzing all data sources
          – Business explores data for questions worth answering
          – MapReduce: Processing Flexibility
          – “Capture in case it’s needed”
  • 7. Unified Big Data Architecture
      Enable Developers, Data Scientists, & Information Workers: Java, C/C++, Pig, JavaScript, Python, R, SAS, SQL, Excel, BI Tools, Reporting, etc.
      Capture, Store, Refine, Discover, Analyze, Report, Retain
      • Batch: Fast data loading, ELT/ETL and refinement, Image/video analysis, Online retention
      • Interactive: Path & pattern analysis, Graph analysis, Text analysis, Iterative discovery
      • Active: Operational analysis, Transactional analysis, High volume ad-hoc, Elastic data marts
      Data sources: Audio, Video & Images; Docs & Text; Machine Logs; Coords & Sensors; Social Content; Web & Mobile; CRM; SCM; ERP
  • 8. Hortonworks Vision
      We believe that by the end of 2015, more than half the world's data will be processed by Apache Hadoop.
      Q: How to achieve that vision?
      A: Ecosystem enablement around an enterprise-viable open source data platform
  • 9. 2-day event (June 13-14, 2012) in San Jose, CA
      • 84 breakout sessions
      • Showcasing real-world examples, developments and best practices of Apache Hadoop
      • Plus, Geoffrey Moore to keynote and more to be announced
      • Register now at: http://www.hadoopsummit.org