BIG DATA = BIG DECISIONS




       Bob Zurek | SVP Products | Epsilon | www.epsilon.com
BIG DATA APPROACHING
Consider the following:
•   New model for data
•   Accessible over TCP/IP and variety of languages
•   Initially difficult to understand
•   Capable of processing thousands of ops/sec
•   Very different from old model
•   Threatening as much was invested in old model
•   Changing course seems ridiculous




                                                      Source: Eben Hewitt
What are we talking about?
IBM IMS




    “IMS is IBM's premier transaction and hierarchical database
    management system, virtually unsurpassed in database and
    transaction processing availability and speed” – IBM 2013

    “Mission-critical processing that requires unparalleled
    performance is best served by a hierarchical model. Analytics
    and business intelligence are best served by a relational
    model. Most Fortune 100 companies use both.”


                                                              Source: IBM
Data evolution



                 A New Model Is Invented

                 A Disruptive Model

                 A Threatening Model

                 A Competitive Model

                                Source: Eben Hewitt
The relational model & SQL




            A HUGE industry success
So now what?
We have a problem
innovation           complexity

       confusion
                     a new model
disruption
             fierce competition

   Sound familiar?
Big data – a growing torrent
•   $600 to buy a disk drive that can store all of the world’s music
•   5 billion mobile phones in use in 2010
•   30 billion pieces of content shared on Facebook every month
•   40% projected growth in global data generated per year vs. 5% growth in global IT spending
•   235 terabytes of data collected by the U.S. Library of Congress by April 2011
•   15 out of 17 sectors in the United States have more data stored per company than the U.S. Library of Congress

                                                               Source: McKinsey
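
The gap between 40% annual data growth and 5% annual IT spending growth compounds quickly. A minimal sketch of the arithmetic, assuming both rates hold constant; the five-year horizon is an assumption chosen for illustration, not a McKinsey figure:

```python
def compound(rate: float, years: int) -> float:
    """Return the growth multiple after compounding `rate` for `years` years."""
    return (1.0 + rate) ** years

YEARS = 5  # assumed horizon, for illustration only

data_multiple = compound(0.40, YEARS)   # global data generated per year
spend_multiple = compound(0.05, YEARS)  # global IT spending

# prints: data grows 5.38x, IT spending grows 1.28x
print(f"data grows {data_multiple:.2f}x, IT spending grows {spend_multiple:.2f}x")
```

Under these assumptions, data volume grows more than fivefold while budgets grow by about a quarter, which is the pressure behind the rest of the deck.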
Industry buzz



      What is
      big data,
      exactly?
Big data confusion?

                               What do business executives
                                  think “big data” is?

     A greater scope of information                                 18%
     New kinds of data and analysis                           16%
     Real-time information                                  15%
     Data influx from new technologies                13%
     Non-traditional forms of media                   13%
     Large volumes of data                      10%
     The latest buzzword                   8%
     Social media data                7%


                                                                Source: IBM
Big data is…


               Large pools of data
               that can be captured,
               communicated,
               aggregated, stored,
               and analyzed




                            Source: McKinsey
Another way of looking at it




                               Source: TDWI
Is it time to look
for an alternative?
It’s not that simple,
                is it?
How are we solving (historically)?
•   Vertical scaling = throw hardware at it
•   Optimize the application = SQL, indexes, access
•   Employ caching layers = Memcached, Coherence
•   Denormalization = reduce joins
•   Sharding/Shared Nothing = split the data up
•   Innovation = columnar
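
Of the approaches above, sharding is the easiest to show in a few lines. This is a minimal sketch, assuming hash-based placement over in-memory dictionaries; `NUM_SHARDS`, `put`, and `get` are hypothetical names, and real systems add replication and rebalancing on top:

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size

# Each dict stands in for one independent database node (shared nothing).
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard index using a hash that is stable across runs."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def put(key: str, value) -> None:
    """Write the value to the one shard that owns this key."""
    shards[shard_for(key)][key] = value

def get(key: str):
    """Read from the owning shard only; no other shard is consulted."""
    return shards[shard_for(key)].get(key)

for user_id in ("alice", "bob", "carol", "dave"):
    put(user_id, {"name": user_id})

print(get("alice"))
```

Changing `NUM_SHARDS` remaps most keys with this scheme, which is one reason production systems tend to use consistent hashing or range-based placement instead.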
What’s driving
change and
innovation?
Big data innovation incubated
         A search engine project at Yahoo
               Doug Cutting = Nutch
               Google = GFS and MapReduce (GMR)
eBay erected a Hadoop cluster
spanning 530 servers –
now five times the size!




                                         “Hadoop is an amazing
                                         technology stack. We now
                                         depend on it to run eBay.”
                                                                          Bob Page,
                                                    Vice President of Analytics, eBay

                 Source: http://www.wired.com/wiredenterprise/2011/10/how-yahoo-spawned-hadoop/
It can get complex
and confusing


      “It replaced our need
       for ETL”

      “It is great for batch
       processing in parallel”

      “A beautiful platform
       for all of problems”
What it’s not good for

• High volume transactional data
• Structured data with low latency


“Note that Hadoop is not an Extract-Transform-
Load (ETL) tool. It is a platform that supports
running ETL processes in parallel. The data
integration vendors do not compete with
Hadoop; rather, Hadoop is another channel
for use of their data transformation modules.”
                        Teradata/Cloudera Presentation
What it’s really good for

• Index building
• Pattern recognition
• Sentiment analysis
• Machine generated data
• Log processing
• Web scale = Google, Twitter,
  YouTube
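
Log processing shows why these workloads fit the batch model: each line can be mapped independently, and only the grouped results need to be combined. The sketch below imitates the map, shuffle, and reduce phases in a single Python process; it does not use the Hadoop API, and the log format and sample lines are made up for illustration:

```python
from collections import defaultdict

# Hypothetical access-log lines in the form "<ip> <status> <path>".
LOG_LINES = [
    "10.0.0.1 200 /index.html",
    "10.0.0.2 404 /missing",
    "10.0.0.1 200 /about.html",
    "10.0.0.3 500 /checkout",
    "10.0.0.2 404 /old-page",
]

def map_phase(line):
    """Emit one (status_code, 1) pair per log line."""
    _ip, status, _path = line.split()
    yield status, 1

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Sum the counts for one status code."""
    return key, sum(values)

mapped = (pair for line in LOG_LINES for pair in map_phase(line))
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

# prints: {'200': 2, '404': 2, '500': 1}
print(counts)
```

In a real cluster the map calls run on the nodes that hold the log blocks and the shuffle moves data over the network; the logic per phase stays this small.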
Use Cases
•   Fraud Detection = Spot fraud anomalies
•   Mobile Data = Process mobile data
•   Online Travel Reservations = Travel booking
•   IT Security = Analyze machine generated data
•   Image Processing = Detecting patterns in satellite imagery
•   E-Commerce = Large marketplaces
•   HealthCare = Semantic analysis for relevance
•   Energy Discovery = Sort and process seismic data
•   Energy Savings = Suggest ways customers save money
•   Infrastructure Management = Collecting device logs
Source: Teradata/Cloudera
Many shades of grey and
lots of great innovations
Relational is still in play
Some innovations worth a look
      Dynamically Scaling OLTP = “No Need To Shard”
The NoSQL generation



  • Document Storage Model
  • Allows MTV to store hierarchical data
  • Flexible schema to model structure/data by brand
  • Needed to have ability to query nested content
  • No need for a shared disk storage

  • Apache Accumulo
  • Released by NSA to open source
  • Based on Google Bigtable
  • Built on top of Hadoop
  • Fine-grained access control
  • Cell level security
  • Server side programming
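
Cell level security, listed above for Apache Accumulo, means each stored value carries its own access requirement rather than inheriting one from the table. The following is a simplified sketch of that idea and not Accumulo's API: here a cell is visible only if the reader holds every label in `required_labels`, whereas Accumulo evaluates boolean visibility expressions. The `Cell` class and the sample data are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    row: str
    column: str
    value: str
    required_labels: frozenset  # labels a reader must hold to see this cell

def visible_cells(cells, reader_labels):
    """Return only the cells whose required labels the reader holds."""
    held = frozenset(reader_labels)
    return [c for c in cells if c.required_labels <= held]

cells = [
    Cell("patient1", "name", "J. Doe", frozenset()),
    Cell("patient1", "diagnosis", "flu", frozenset({"medical"})),
    Cell("patient1", "ssn", "000-00-0000", frozenset({"medical", "admin"})),
]

# A reader holding only "medical" sees the name and diagnosis, not the SSN.
nurse_view = visible_cells(cells, {"medical"})
print([c.column for c in nurse_view])
```

Filtering on the server side, per cell, is what lets readers with different clearances share one table.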
Why NoSQL?




       •   Schemaless model = Easy to add fields
       •   Document oriented = JSON format (think objects)
       •   Built from the ground up to be distributed
       •   Auto sharding
       •   Distributed querying capabilities
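
The first two points can be illustrated with plain JSON-style documents. This is a minimal sketch using Python dictionaries in place of a document store; the collection, field names, and `find` helper are hypothetical and do not correspond to any particular product's query API:

```python
import json

# Two documents in the same hypothetical collection. The second adds a
# nested "episodes" field; no schema change or migration is required.
collection = [
    {"_id": 1, "title": "Show A", "brand": "BrandX"},
    {"_id": 2, "title": "Show B", "brand": "BrandY",
     "episodes": [{"number": 1, "name": "Pilot"}]},
]

def find(docs, **criteria):
    """Return documents whose top-level fields match every criterion."""
    return [d for d in docs
            if all(d.get(k) == v for k, v in criteria.items())]

def with_field(docs, field):
    """Return documents that contain the given top-level field."""
    return [d for d in docs if field in d]

print(json.dumps(find(collection, brand="BrandY"), indent=2))
```

Documents that lack a field are simply skipped by queries on it, which is the practical meaning of "schemaless" here.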
NoSQL Use Case



                 1. Click/Event into Hadoop

                 2. Data Analyzed via Map Reduce jobs;
                    generates 100M profiles based on
                    campaigns running

                 3. Selected profiles loaded into Couch

                 4. Ad targeting logic query Couch with
                    sub-second latency to optimize
                    decision and real-time ad placement




                                   Source: Couchbase
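
The four steps above can be sketched end to end in a few lines. This is an illustration under stated assumptions, not the pipeline from the Couchbase case: a list stands in for Hadoop storage, a plain function for the MapReduce job, and a dictionary for the key-value store. The event data, the `MIN_CLICKS` selection rule, and all function names are hypothetical.

```python
from collections import Counter, defaultdict

# Step 1: click events land in bulk storage as (user_id, campaign) pairs.
events = [
    ("u1", "shoes"), ("u1", "shoes"), ("u1", "travel"),
    ("u2", "travel"), ("u3", "shoes"),
]

# Step 2: a batch job reduces the events to one profile per user.
def build_profiles(evts):
    per_user = defaultdict(Counter)
    for user_id, campaign in evts:
        per_user[user_id][campaign] += 1
    return {
        user: {"top_campaign": c.most_common(1)[0][0],
               "clicks": sum(c.values())}
        for user, c in per_user.items()
    }

profiles = build_profiles(events)

# Step 3: only selected profiles are loaded into the low-latency store.
MIN_CLICKS = 2  # assumed selection rule
kv_store = {u: p for u, p in profiles.items() if p["clicks"] >= MIN_CLICKS}

# Step 4: the ad-serving path does a single key lookup per request.
def choose_ad(user_id, default="generic"):
    profile = kv_store.get(user_id)
    return profile["top_campaign"] if profile else default

print(choose_ad("u1"), choose_ad("u2"))
```

The split matters because the heavy aggregation runs offline in batch, while the request path touches only a precomputed record by key.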
Hadoop Augmentation
•   Side-by-Side will be commonplace
•   ETL solutions support Hadoop
•   Relational Databases
     • Provide ETL interfaces to Hadoop
     • Execute map/reduce jobs inside DBMS
•   NoSQL supports ETL
Example Hybrid DBMS Systems
Oracle Endeca Server
  •   Hybrid Search/Analytic Database
  •   Supports structured, unstructured, semi-structured
  •   No schema required. Records stacked.
  •   Columnar
Trends
     •   SQL On Hadoop – Hadapt, Cloudera Impala, EMC
     •   Unified Support of Structured, Unstructured, Semi-structured
     •   Embedding Search
     •   Expanded ETL/ELT Support
     •   Big Data In Motion Takes Hold
     •   Added Data Mining and Analytic Functions In NoSQL
     •   Embedding R Language = gain in popularity
     •   Data Scientists instrumental in business success
Bob Zurek | bzurek@epsilon.com

What is DBT - The Ultimate Data Build Tool.pdf
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 

Big Data = Big Decisions

  • 1. BIG DATA = BIG DECISIONS Bob Zurek | SVP Products | Epsilon | www.epsilon.com
  • 3. Consider the following: • New model for data • Accessible over TCP/IP and variety of languages • Initially difficult to understand • Capable of processing thousands of ops/sec • Very different from old model • Threatening as much was invested in old model • Changing course seems ridiculous Source: Eben Hewitt
  • 4. What are we talking about?
  • 5. IBM IMS “IMS is IBM's premier transaction and hierarchical database management system, virtually unsurpassed in database and transaction processing availability and speed” – IBM 2013 “Mission-critical processing that requires unparalleled performance is best served by a hierarchical model. Analytics and business intelligence are best served by a relational model. Most Fortune 100 companies use both.” Source: IBM
  • 6. Data evolution A New Model Is Invented A Disruptive Model A Threatening Model A Competitive Model Source: Eben Hewitt
  • 7. The relational model & SQL A HUGE industry success
  • 9. We have a problem
  • 10. innovation complexity confusion a new model disruption fierce competition Sound familiar?
  • 11. Big data – a growing torrent • $600 to buy a disk drive that can store all of the world’s music • 5 billion mobile phones in use in 2010 • 30 billion pieces of content shared on Facebook every month • 40% projected growth in global data generated per year vs. 5% growth in global IT spending • 235 terabytes of data collected by the U.S. Library of Congress by April 2011 • 15 out of 17 sectors in the United States have more data stored per company than the U.S. Library of Congress Source: McKinsey
  • 13. Industry buzz What is big data, exactly?
  • 14. Big data confusion? What do business executives think “big data” is? • A greater scope of information: 18% • New kinds of data and analysis: 16% • Real-time information: 15% • Data influx from new technologies: 13% • Non-traditional forms of media: 13% • Large volumes of data: 10% • The latest buzzword: 8% • Social media data: 7% Source: IBM
  • 15. Big data is… Large pools of data that can be captured, communicated, aggregated, stored, and analyzed Source: McKinsey
  • 16. Another way of looking at it Source: TDWI
  • 17. Is it time to look for an alternative?
  • 18. It’s not that simple, is it?
  • 19. How have we been solving it (historically)? • Vertical scaling = throw hardware at it • Optimize the application = SQL, indexes, access • Employ caching layers = Memcached, Coherence • Denormalization = reduce joins • Sharding/shared nothing = split the data up • Innovation = columnar
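
The "split the data up" idea behind sharding can be sketched in a few lines. This is a minimal illustration, not any particular product's implementation: the `shard_for` helper, the `ShardedStore` class, and the customer IDs are all hypothetical, and each "shard" is just an in-memory dict standing in for a separate database server.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() because the built-in
    is randomized per process for strings.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Hypothetical key-value store that spreads keys over N shards."""

    def __init__(self, num_shards: int) -> None:
        # Each dict stands in for an independent, shared-nothing node.
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        # Only the one shard that owns the key is consulted.
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=4)
for customer_id in ("c-1001", "c-1002", "c-1003"):
    store.put(customer_id, {"id": customer_id})

print(store.get("c-1002"))
```

A simple modulo scheme like this forces most keys to move when the number of shards changes, which is one reason manual sharding has historically been painful to operate.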
  • 23. Big data innovation incubated • A search engine project at Yahoo • Doug Cutting = Nutch • Google = GFS and GMR (Google MapReduce)
  • 24. eBay erected a Hadoop cluster spanning 530 servers – now five times the size! “Hadoop is an amazing technology stack. We now depend on it to run eBay.” Bob Page, Vice President of Analytics, eBay Source: http://www.wired.com/wiredenterprise/2011/10/how-yahoo-spawned-hadoop/
  • 25. It can get complex and confusing • “It replaced our need for ETL” • “It is great for batch processing in parallel” • “A beautiful platform for all of our problems”
  • 26. What it’s not good for • High volume transactional data • Structured data with low latency “Note that Hadoop is not an Extract-Transform-Load (ETL) tool. It is a platform that supports running ETL processes in parallel. The data integration vendors do not compete with Hadoop; rather, Hadoop is another channel for use of their data transformation modules.” Source: Teradata/Cloudera presentation
  • 27. What it’s really good for • Index building • Pattern recognition • Sentiment analysis • Machine-generated data • Log processing • Web scale = Google, Twitter, YouTube
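
Log processing fits the map/reduce style because each line can be handled independently and the partial results merged afterwards. The sketch below imitates the three phases (map, shuffle, reduce) in a single Python process; it does not use Hadoop's API, and the log lines and function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical web server log lines: date, method, path, status code.
LOG_LINES = [
    "2013-01-05 GET /home 200",
    "2013-01-05 GET /cart 500",
    "2013-01-06 GET /home 200",
    "2013-01-06 POST /cart 200",
]


def map_phase(line):
    """Emit one (status_code, 1) pair per log line."""
    status = line.split()[-1]
    yield status, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


status_counts = run_job(LOG_LINES)
print(status_counts)
```

On a real cluster the map and reduce calls would run in parallel on different machines; the batch nature of the job is also why the previous slide lists low-latency workloads as a poor fit.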
  • 28. Use Cases • Fraud Detection: Spot fraud anomalies • Mobile Data: Process mobile data • Online Travel Reservations: Travel booking • IT Security: Analyze machine-generated data • Image Processing: Detecting patterns in satellite imagery • E-Commerce: Large marketplaces • HealthCare: Semantic analysis for relevance • Energy Discovery: Sort and process seismic data • Energy Savings: Suggest ways customers save money • Infrastructure Management: Collecting device logs
  • 31. Many shades of grey and lots of great innovations
  • 32. Relational is still in play Some innovations worth a look Dynamically Scaling OLTP = “No Need To Shard”
  • 33. The NoSQL generation Document storage model: • Allows MTV to store hierarchical data • Flexible schema to model structure/data by brand • Needed to have ability to query nested content • No need for a shared disk storage Apache Accumulo: • Released by NSA to open source • Based on Google Bigtable • Built on top of Hadoop • Fine-grained access control • Cell-level security • Server-side programming
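
Cell-level security means every individual value carries its own access label, and a scan returns only the cells the caller is authorized to see. The sketch below is a deliberately simplified model of that idea, not Accumulo's API: Accumulo visibility labels are boolean expressions, whereas here a cell simply lists labels that must all be held. The `Cell` class, `visible_cells` function, and sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    row: str
    column: str
    value: str
    # Simplification: the reader must hold ALL of these labels.
    required_labels: frozenset


def visible_cells(cells, user_labels):
    """Return only the cells whose required labels the user holds."""
    user_labels = frozenset(user_labels)
    return [c for c in cells if c.required_labels <= user_labels]


CELLS = [
    Cell("patient-1", "name", "A. Smith", frozenset()),
    Cell("patient-1", "diagnosis", "flu", frozenset({"medical"})),
    Cell("patient-1", "billing", "$120", frozenset({"finance", "audit"})),
]

# A user with only the "medical" label sees the name and the diagnosis,
# but not the billing cell in the same row.
print([c.column for c in visible_cells(CELLS, {"medical"})])
```

The point of the design is that access is decided per cell rather than per table or per row, so one row can safely mix data of different sensitivity.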
  • 34. Why NoSQL? • Schemaless model = Easy to add fields • Document oriented = JSON format (think objects) • Built from the ground up to be distributed • Auto sharding • Distributed querying capabilities
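
"Easy to add fields" and "think objects" can be made concrete with plain JSON-style documents. In this sketch two documents in the same collection have different shapes, and a small helper queries a nested field. It uses only Python dicts and the standard `json` module, not a real document database; the brands, titles, and the `get_path` and `find` helpers are hypothetical.

```python
import json

# Two documents in one "collection"; the second adds a "tags" field and
# omits "length_sec", and no schema change is needed for either.
documents = [
    {"brand": "BrandA", "title": "Show 1",
     "video": {"format": "hd", "length_sec": 1320}},
    {"brand": "BrandB", "title": "Show 2",
     "video": {"format": "sd"}, "tags": ["music"]},
]


def get_path(doc, path, default=None):
    """Follow a dotted path such as 'video.format' into nested dicts."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


def find(docs, path, expected):
    """Return the documents whose value at `path` equals `expected`."""
    return [d for d in docs if get_path(d, path) == expected]


hd_titles = [d["title"] for d in find(documents, "video.format", "hd")]
print(hd_titles)

# Documents serialize to JSON and back without any mapping layer.
round_trip = json.loads(json.dumps(documents))
```

A missing field is simply absent rather than NULL, so application code has to supply defaults, which is the trade-off that comes with the flexible schema.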
  • 35. NoSQL Use Case 1. Click/event data flows into Hadoop 2. Data is analyzed via MapReduce jobs, generating 100M profiles based on the campaigns running 3. Selected profiles are loaded into Couch 4. Ad targeting logic queries Couch with sub-second latency to optimize decisions and real-time ad placement Source: Couchbase
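
The four steps of this use case split into a slow batch side and a fast serving side. The sketch below walks through that split with toy data: a batch function builds profiles from click events, a selection step loads some of them into a key-value store, and the ad decision is a single key lookup. A Python dict stands in for the NoSQL store, and every name, threshold, and event here is an assumption made for illustration, not part of the Couchbase example.

```python
from collections import Counter, defaultdict

# Step 1: hypothetical click events as (user_id, campaign_category).
EVENTS = [
    ("u1", "sports"), ("u1", "sports"), ("u1", "travel"),
    ("u2", "music"), ("u2", "music"), ("u3", "travel"),
]


def build_profiles(events):
    """Step 2 (batch): summarize each user's clicks into a profile."""
    counts = defaultdict(Counter)
    for user_id, category in events:
        counts[user_id][category] += 1
    return {
        user_id: {
            "top_category": c.most_common(1)[0][0],
            "clicks": sum(c.values()),
        }
        for user_id, c in counts.items()
    }


def load_selected(profiles, min_clicks):
    """Step 3: load only sufficiently active profiles into the store."""
    return {u: p for u, p in profiles.items() if p["clicks"] >= min_clicks}


def choose_ad(store, user_id, default_ad="generic"):
    """Step 4 (real time): one key lookup decides the ad category."""
    profile = store.get(user_id)
    return profile["top_category"] if profile else default_ad


store = load_selected(build_profiles(EVENTS), min_clicks=2)
print(choose_ad(store, "u1"), choose_ad(store, "u3"))
```

Keeping the expensive aggregation in the batch step is what lets the serving path stay a single lookup, which is how the sub-second latency in step 4 becomes achievable.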
  • 36. Hadoop Augmentation • Side-by-side will be commonplace • ETL solutions support Hadoop • Relational databases: provide ETL interfaces to Hadoop; execute map/reduce jobs inside the DBMS • NoSQL supports ETL
  • 37. Example Hybrid DBMS Systems Oracle Endeca Server • Hybrid Search/Analytic Database • Supports structured, unstructured, semi-structured • No schema required. Records stacked. • Columnar
  • 38. Trends • SQL on Hadoop – Hadapt, Cloudera Impala, EMC • Unified support of structured, unstructured, semi-structured • Embedding search • Expanded ETL/ELT support • Big data in motion takes hold • Added data mining and analytic functions in NoSQL • Embedding R language = gain in popularity • Data scientists instrumental in business success
  • 40. Bob Zurek | bzurek@epsilon.com