Big Data Analytics


IBM Power Event – Hindsgavl Slot
May 2, 2012




Flemming Bagger, Nordic Sales Leader for Big Data Analytics and Data Warehousing
Søren Ravn, Consulting IT Specialist for Big Data




                                                                                   © 2012 IBM Corporation
Why is 2012 the YEAR of Big Data?

     “Big Data: The next frontier for innovation,
     competition and productivity”
     McKinsey Global Institute

            2012 will be the year of 'big data'
            BBC Nov 30 2011
            Big Data will be the CIO Issue of 2012
            IDC Prediction 2012 report

    Searches for "big data" on Gartner's website
    increased 981% between March 2011 and
    October 2011


                   “most enterprise data warehouse (EDW) and BI
                   teams currently lack a clear understanding of big
                   data technologies… They are increasingly asking
                   the question, "How can we use big data to
                   deliver new insights?"
                   Gartner 2012



Insights from the IBM Global CEO Study 2010



Vast majority of CEOs experience the New Economic Environment
as distinctly different
The New Economic Environment

                                        Full Sample                 Nordics
                                   Limited  Some  Large       Limited  Some  Large
  More volatile                      13%    18%    69%          13%    19%    68%
    Deeper/faster cycles, more risk
  More uncertain                     14%    21%    65%           8%    13%    79%
    Less predictable
  More complex                       18%    22%    60%          28%    31%    41%
    Multi-faceted, interconnected
  Structurally different             26%    21%    53%          34%    29%    37%
    Sustained change

  Limited = Not at all/to a limited extent; Some = To some extent; Large = To a large/very large extent

“Last year’s experience was a wake-up call, like looking into the dark with no
light at the end of the tunnel.”
CEO, Industrial Products, The Netherlands

Source: Q7 To what extent is the new economic environment different? Volatile n=1514; Uncertain n=1521; Complex n=1522; Structurally different n=1523; Nordics n=83

IBM Institute for Business Value


Which underprepared areas are the most critical for CMOs



Marketing Priority Matrix

[Figure: scatter plot of the 13 factors listed below.
 Horizontal axis: Factors impacting marketing (percent of CMOs selecting as “Top five factors”), 0 to 60.
 Vertical axis: Underpreparedness (percent of CMOs reporting underpreparedness), 40 to 70.
 Legend entry: Mean.]

   1      Data explosion
   2      Social media
   3      Growth of channel and device choices
   4      Shifting consumer demographics
   5      Financial constraints
   6      Decreasing brand loyalty
   7      Growth market opportunities
   8      ROI accountability
   9      Customer collaboration and influence
   10     Privacy considerations
   11     Global outsourcing
   12     Regulatory considerations
   13     Corporate transparency

Source: Q7 Which of the following market factors will have the most impact on your marketing organization over the next 3 to 5 years? n1=1733; Q8 How prepared are you to
        manage the impact of the top 5 market factors that will have the most impact on your marketing organization over the next 3 to 5 years?
        n2=149 to 1141 (n2 = number of respondents who selected the factor as important in Q7)

Information is at the Center of a New Wave of Opportunity…

   44x      as much Data and Content Over Coming Decade
              2009: 800,000 petabytes
              2020: 35 zettabytes
   80%      Of world’s data is unstructured

… And Organizations Need Deeper Insights

   1 in 3   Business leaders frequently make decisions based on
            information they don’t trust, or don’t have
   1 in 2   Business leaders say they don’t have access to the
            information they need to do their jobs
   83%      of CIOs cited “Business intelligence and analytics” as
            part of their visionary plans to enhance competitiveness
   60%      of CEOs need to do a better job capturing and
            understanding information rapidly in order to make
            swift business decisions

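The "44x" figure can be checked against the two volumes quoted on the slide. The short calculation below is only a sanity check; it assumes decimal (SI) units, which the slide does not state.

```python
# Sanity check of the slide's "44x" growth figure.
# Assumption: decimal (SI) units, i.e. 1 ZB = 10**21 bytes and 1 PB = 10**15 bytes.
PETABYTES_PER_ZETTABYTE = 10**6

data_2009_pb = 800_000                        # 2009 volume quoted on the slide
data_2020_pb = 35 * PETABYTES_PER_ZETTABYTE   # 2020 projection quoted on the slide

growth_factor = data_2020_pb / data_2009_pb
print(f"Growth factor 2009 -> 2020: {growth_factor:.2f}x")
```

Under that assumption the ratio comes out just under 44, which is consistent with the rounded figure on the slide.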
The Big Data Conundrum

 The percentage of available data an enterprise can analyze is
 decreasing in proportion to the data available to that enterprise

 Quite simply, this means that as enterprises, we are getting
 “more naive” about our business over time

 [Figure: Data AVAILABLE to an organization versus Data an organization can PROCESS]

What should a Big Data platform do?
                      Analyze a Variety of Information
                      Novel analytics on a broad set of mixed
                      information that could not be analyzed before


     The 3 Vs
                       Analyze Information in Motion
                       Streaming data analysis
                       Large volume data bursts & ad-hoc analysis



                      Analyze Extreme Volumes of Information
                      Cost-efficiently process and analyze petabytes of information
                      Manage & analyze high volumes of structured, relational data



                       Discover & Experiment
                       Ad-hoc analytics, data discovery &
                       experimentation



                       Manage & Plan
                       Enforce data structure, integrity and control to
                       ensure consistency for repeatable queries
IBM Big Data Strategy: Move the Analytics Closer to the Data


     Netezza is for High Economic Value data
     that requires deep, extensive and frequent
     analysis with results delivered in minutes

     Streams is for Low Latency, Real Time
     Analysis of high velocity data with results
     delivered sub-second after which the data
     is discarded or stored elsewhere

     BigInsights is for Discovery and
     Exploration on data of uncertain economic
     value to identify patterns and correlations
     which can be proceduralised… it can also
     be used as a lower cost per terabyte store
     of data that is used or accessed in a non-
     time critical manner


Why Didn’t We Use All of the Big Data Before?




One customer... Two data worlds

   Structured, Repeatable, Linear
   (Monthly sales reports, Profitability analysis, Customer surveys)

     Customer
        • Segment
        • Social Network
        • Demographics
           • Sex, Age Group, etc
        • Tenure
        • Rate plan
        • Credit Rating, ARPU Group

     Product/Service
        • Subscriptions
        • Rate Plans
        • Media Type
        • Category/Classification
        • Price

     Network
        • Availability
        • Throughput/Speed
        • Latency
        • Location
        • Facilities

     Transactions
        • Voice, SMS, MMS
        • Data & Web Sessions
        • Click Streams
        • Purchases
        • Downloads
        • Signaling, Authentication
        • Probe/DPI

     Interface
        • Discovery
        • Navigation
        • Recommendations

     Device
        • Class
        • Manufacturer
        • Model
        • OS
        • Media Capability
        • Keyboard Type

     [Figure labels, center] Starts, Stops; Success Rates; Errors; Throughput; Setup Time;
     Connection Time; Usage; Recency; Frequency; Monetary; Latency

     [Figure labels, right-hand side] Virtual Worlds; Collaboration; Social Networking;
     Content; Communities; Blogs/Micro-blogs

Complementary Approaches for Different Use Cases
   Traditional Approach: Structured, analytical, logical
      Platform:         Data Warehouse
      Sources:          Transaction Data, Internal App Data, Mainframe Data,
                        OLTP System Data, ERP data
      Characteristics:  Structured, Repeatable, Linear
      Examples:         Monthly sales reports, Profitability analysis,
                        Customer surveys

   New Approach: Creative, holistic thought, intuition
      Platform:         Hadoop, Streams
      Sources:          Web Logs, Social Data, Text Data: emails,
                        Sensor data: images, RFID
      Characteristics:  Unstructured, Exploratory, Iterative
      Examples:         Brand sentiment, Product strategy,
                        Maximum asset utilization

   Enterprise Integration sits between the Traditional Sources and the New Sources

InfoSphere Streams: Analyze all your data, all the time, just in time


      What if you could get IMMEDIATE insight?
      What if you could analyze MORE kinds of data?
      What if you could do it with exceptional price/performance?

      [Figure labels]
        Inputs:   Traditional Data, Sensor Events, Signals
                  More context
        Outputs:  Analytic Results
                  Alerts / Actions
                  Billing/Transaction Systems
                  Customer Real-time Offers
                  Threat Prevention Systems
                  Enterprise Storage and Warehousing

Traditional Computing versus Stream Computing

   Traditional Computing
     Historical fact finding - Find and analyze information stored on disk
     Batch paradigm, pull model
     Query-driven: submits queries to static data
     Relies on Databases, Data Warehouses
     Databases find the needle in the haystack
     [Figure: Query, Data, Results]

   Stream Computing
     Real time analysis of data-in-motion - analyses data before you store it
     A stream of structured or unstructured data
     Analytic operations on streaming data in real-time
     Streams finds the needle as it’s blowing by
     [Figure: Data, Query, Results]

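The contrast between the pull model and data-in-motion can be sketched in a few lines of plain Python. This is an illustration only and does not use InfoSphere Streams or SPL; the record shape, the threshold and the `sensor_feed` stand-in are all hypothetical.

```python
from typing import Iterable, Iterator

Reading = dict  # hypothetical record shape: {"sensor": str, "value": float}

def batch_query(stored: list[Reading], threshold: float) -> list[Reading]:
    """Traditional model: data is already at rest and a query is pulled over it."""
    return [r for r in stored if r["value"] > threshold]

def stream_query(source: Iterable[Reading], threshold: float) -> Iterator[Reading]:
    """Stream model: the query is standing and each record is tested as it arrives."""
    for reading in source:
        if reading["value"] > threshold:
            yield reading  # result is emitted immediately; nothing is stored first

def sensor_feed() -> Iterator[Reading]:
    """Stand-in for a live feed (made-up values)."""
    for i, value in enumerate([12.0, 98.5, 7.3, 101.2]):
        yield {"sensor": f"s{i}", "value": value}

# Batch: store everything, then query.
stored = list(sensor_feed())
batch_hits = batch_query(stored, threshold=90.0)

# Stream: query each record in flight.
stream_hits = list(stream_query(sensor_feed(), threshold=90.0))
```

Both paths find the same "needles"; the difference is that the streaming version never needs the full haystack in storage before it can answer.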
InfoSphere Streams for superior real time analytic processing
     Streams Processing Language (SPL) built for Streaming applications:
        Reusable operators
        Rapid application development
        Continuous “pipeline” processing

     Compile groups of operators into single processes:
        Efficient use of cores
        Distributed execution
        Very fast data exchange
        Can be automatic or tuned
        Scaled with push of a button

     Use the data that gives you a competitive advantage:
        Can handle virtually any data type
        Use data that is too expensive and time sensitive for traditional approaches

     Easy to extend:
        Built in adaptors
        Users add capability with familiar C++ and Java

     Easy to manage:
        Automatic placement
        Extend applications incrementally without downtime
        Multi-user / multiple applications

     Flexible and high performance transport:
        Very low latency
        High data rates

     Dynamic analysis:
        Programmatically change topology at runtime
        Create new subscriptions
        Create new port properties

InfoSphere BigInsights – A Full Hadoop Stack
      User Interface:   Integrated Install, Management Console,
                        Development Tooling (ODS), Analytics Visualization

      Application:      Pig, Hive, Jaql, MapReduce, AdaptiveMR, Oozie,
                        Zookeeper, Avro

      Analytics:        ML Analytics, Text Analytics, Lucene

      Storage:          HBase, HDFS, GPFS-SNC

      Data Sources/     Streams, Data Stage, Flume, DB2 LUW, DB2 z,
      Connectors:       Informix, Netezza, Teradata, Oracle, R

What is Hadoop?
      Apache Hadoop – free, open source framework
      for data-intensive applications
       – Inspired by Google technologies (MapReduce, GFS)
       – Originally built to address scalability problems of Web
         search and analytics
       – Extensively used by Yahoo!

      Enables applications to work with thousands of
      nodes and petabytes of data in a highly parallel,
      cost effective manner
       – CPU + disks of a commodity box = Hadoop node
       – Boxes can be combined into clusters
       – New nodes can be added without changing
          • Data formats
          • How data is loaded
          • How jobs are written
      MapReduce framework (processing)
       – How Hadoop understands and assigns work to the
         nodes (machines)

      Hadoop Distributed File System = HDFS (storage)
       – Where Hadoop stores data
       – A file system that spans all the nodes in a Hadoop
         cluster
       – It links together the file systems on many local nodes
         to make them into one big file system
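
The MapReduce idea described above can be sketched on a single machine. The example below is plain Python, not Hadoop code: the two "splits" stand in for blocks of a file held on different nodes, and all function names are illustrative.

```python
from collections import defaultdict

def map_phase(line):
    """Map: turn one input line into (key, value) pairs."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all values that share a key (the framework does this in Hadoop)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

# Each inner list stands in for a block of the file stored on a different node.
splits = [["big data big insights"], ["data in motion"]]

mapped = [pair for split in splits for line in split for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Because the map step looks at one line at a time and the reduce step at one key at a time, more splits can be spread over more nodes without rewriting the job, which is the property the slide refers to.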
Machine Learning Analytics

      SystemML
      – Machine learning engine invented by IBM Research for native use on
        BigInsights

      Directly implementing ML algorithms on MapReduce is difficult
      – Natural mathematical operators need to be re-expressed in terms of
        key-value pairs, map and reduce functions.
      – Data characteristics dictate the optimal MapReduce implementation, so
        user bears responsibility for efficient hand-coding

      Sample Uses
      – Finding non-obvious data correlations over Internet Scale data
        collections
         • E.g. Topic Modeling, Recommender Systems, Ranking, …
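
To illustrate why a natural mathematical operator has to be re-expressed in key-value terms, here is a sketch of a sparse matrix-vector product y = A·x written as map and reduce steps. It is plain Python on made-up data, not SystemML or Hadoop code, and it assumes the vector x is small enough to be available to every mapper.

```python
from collections import defaultdict

# Sparse matrix A as (row, col, value) triples and vector x as {col: value}.
# Both are made-up illustration data.
A = [(0, 0, 2.0), (0, 2, 1.0), (1, 1, 3.0)]
x = {0: 1.0, 1: 2.0, 2: 4.0}

def map_entry(row, col, value):
    """Map: each non-zero a_ij emits the partial product keyed by its row."""
    yield row, value * x[col]

def reduce_row(row, partials):
    """Reduce: summing the partial products of one row gives y_i."""
    return row, sum(partials)

# Group the mapper output by key, as the framework's shuffle would.
grouped = defaultdict(list)
for entry in A:
    for key, partial in map_entry(*entry):
        grouped[key].append(partial)

y = dict(reduce_row(row, partials) for row, partials in grouped.items())
print(y)
```

Even this one-line formula needs an explicit choice of key and a grouping step, and a different data layout (for example a vector too large to share) would call for a different decomposition.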




Statistical and Predictive Analysis
      Framework for machine learning (ML) implementations on Big Data
       – Large, sparse data sets, e.g. 5B non-zero values
       – Runs on large BigInsights clusters with 1000s of nodes

      Productivity
       – Build and enhance predictive models directly on Big Data
       – High-level language – Declarative Machine Learning Language (DML)
           • E.g. 1500 lines of Java code boil down to 15 lines of DML code
       – Parallel SPSS data mining algorithms implementable in DML

      Optimization
       – Compile algorithms into optimized parallel code
       – For different clusters
       – For different data characteristics
       – E.g. 1 hr. execution (hand-coded) down to 10 mins

      [Figure: Execution Time (sec), 0 to 4500, versus # non zeros (million), 0 to 2000,
       for Java Map-Reduce, SystemML and Single node R]

Customer Use Case: Log Analytics (storing computer logs &
transaction data)
  Business Problem: The size and volume of log data generated by computer systems
  constrains the ability of many enterprises to create and maintain effective platforms for
  compliance and analysis.

      IBM Solution:
      – Ingests all system logging at
        low latency (under 15 minutes) and
        re-assembles the transactions into
        a whole, providing exact details on
        system component response times
        and trending.
      – This solution can store more than a
        year’s worth of data.
      – An analytics layer can be delivered
        through a web front-end, and
        standard browser-based tooling for
        ad-hoc analytics.
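
The re-assembly step can be pictured as grouping interleaved log events by transaction and pairing each component's start and end. The sketch below is a simplified illustration, not the IBM solution: the event layout (transaction id, component, phase, timestamp in milliseconds) is an assumption made for the example.

```python
from collections import defaultdict

# Hypothetical, interleaved log events: (transaction id, component, phase, time in ms).
log_events = [
    ("txn-1", "web", "start", 0),
    ("txn-2", "web", "start", 5),
    ("txn-1", "db", "start", 10),
    ("txn-1", "db", "end", 40),
    ("txn-1", "web", "end", 55),
    ("txn-2", "web", "end", 25),
]

def reassemble(events):
    """Return {transaction: {component: response time in ms}}."""
    starts = {}
    transactions = defaultdict(dict)
    for txn_id, component, phase, timestamp in events:
        if phase == "start":
            starts[(txn_id, component)] = timestamp
        else:
            begin = starts.pop((txn_id, component), None)
            if begin is not None:  # ignore an "end" with no matching "start"
                transactions[txn_id][component] = timestamp - begin
    return dict(transactions)

response_times = reassemble(log_events)
print(response_times)
```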

Log Analysis is a Big Data Problem

      Volume
      – Large number of devices
      – Logs generated at hardware, firmware, OS and middleware levels
      – Aggregation over time for predictive analysis generates vast amounts of log
        data

      Velocity
      – Online analysis needed to explore the data to discover meaningful correlations

      Variety
      – Log formats lack a unified structure
            • Variation across device types, firmware and middleware versions
      – Log data needs to be supplemented with additional data
            • Performance and Availability/Fault data
            • Reference data




Log Analysis - why
     IBM and its customers have huge amounts of log data
        System logs
        Application logs
     We know there is valuable information hidden in these logs
         Anomaly detection: What kind of alerts should I add to my
         automated monitoring system?
         Root cause analysis: What sequence of minor problems caused this
         major problem?
         Resource planning: Where do I need to add redundancy? When
         should a particular machine be replaced?
         Marketing: How can I turn more of the visitors to my site into
         customers?
     But getting that information out requires
         Extraction, transformation and complex statistical analysis at scale


Insight into your logs

     Roles:   Data Analyst, Programmer  |  Analytics Developer  |  End User
     Flow:    Import Logs -> Transform -> Analyze -> Ad-Hoc Exploration -> Reports, Dashboards & Alerts

     Import
      – Log files, performance data, fault data, reference data (network
        topology, device dictionaries) from various source systems into HDFS
     Transform
      – Identify record boundaries, extract information from text, identify
        patterns
      – Find cross-log relationships and integration across diverse data
        sources
      – Build indexes
     Analyze
      – Sessionization: identify which records are part of the same sessions
      – Identify subsequences containing a fault or performance issue
      – Observe correlations
      – Predictive operators
     Visualize
      – Ad-hoc exploration with BigSheets
      – Institutionalizing the knowledge gleaned from ad-hoc exploration
        (network operating center dashboards, reports, alerts)
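
The sessionization and fault-subsequence steps named on this slide can be illustrated with a minimal Python sketch. This is not BigInsights or JAQL code: the record layout (device key, timestamp, message), the 30-minute inactivity gap, and the "FAULT" marker are all assumptions made for illustration, since the slide does not specify them.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical inactivity gap; the slide does not state one.
SESSION_GAP = timedelta(minutes=30)


def sessionize(records, gap=SESSION_GAP):
    """Group (key, timestamp, message) records into sessions per key.

    A new session starts whenever the time since the previous record
    for the same key exceeds `gap`.
    """
    by_key = defaultdict(list)
    for key, ts, msg in records:
        by_key[key].append((ts, msg))

    sessions = []
    for key, events in by_key.items():
        events.sort(key=lambda e: e[0])  # order each key's records by time
        current = [events[0]]
        for prev, event in zip(events, events[1:]):
            if event[0] - prev[0] > gap:
                sessions.append((key, current))
                current = []
            current.append(event)
        sessions.append((key, current))
    return sessions


def sessions_with_fault(sessions, marker="FAULT"):
    """Keep only sessions containing at least one record with the marker."""
    return [(k, evs) for k, evs in sessions if any(marker in m for _, m in evs)]


# Made-up sample records, for illustration only.
t0 = datetime(2012, 5, 2, 9, 0)
records = [
    ("router-1", t0, "link up"),
    ("router-1", t0 + timedelta(minutes=5), "FAULT: packet loss"),
    ("router-1", t0 + timedelta(hours=2), "link up"),
    ("router-2", t0 + timedelta(minutes=1), "link up"),
]
sessions = sessionize(records)
faulty = sessions_with_fault(sessions)
```

At the scale the deck describes, the same grouping would presumably run as a distributed job over data in HDFS rather than in memory; the sketch only shows the logic of splitting on an inactivity gap and then filtering sessions.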
Optimizing capital investments based on double-digit Petabyte analysis

   Business Challenge
     Wind turbines are expensive, have a service life of ~25 years
     Existing process for turbine placements requires weeks of analysis, uses
     subset of available data and does not yield optimal results
   Project objectives
     Leverage large volume of weather data to optimize placement of turbines
     (2+ PB today; ~20 PB by 2015)
     Reduce modeling time from weeks to hours
     Analyze data from turbines to optimize ongoing operations
   The benefits
     Clear fulfillment of Vestas business needs through IBM technology and
     expertise
     Reliability, security, scalability, and integration needs fulfilled
     Standard enterprise software support
     Single-vendor solution for software, hardware, storage, support

   Solution Components:
     IBM InfoSphere BigInsights Enterprise Edition:
        GPFS-based file system capable of running Hadoop and non-Hadoop apps
        Powerful, extensible query support (JAQL)
        Read-optimized column storage
     IBM xSeries hardware
The Big Data Challenge
 7/25/2008               Google passes 1 trillion URLs

 $187/second             Cost of last eBay outage ($16,156,800/Day)

 789.4 PB                Current size of YouTube

 2/4/2011                IPv4 address space is exhausted, 4.3 billion
                         addresses have been allocated

 (3.4x10^38)             Size of IPv6 address space (2^128 addresses)

 100 million gigabytes   Size of Google’s index

 144 million             Number of Tweets per day

 1.7 trillion            Items at Facebook - 90 PB of data

 4.3 Billion             Mobile devices
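
Two of the figures on this slide follow from simple arithmetic, sketched below in Python. The function and variable names are illustrative only; the inputs are the slide's $187/second outage cost and the standard address widths of IPv4 (32 bits) and IPv6 (128 bits).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def outage_cost_per_day(cost_per_second: int) -> int:
    """Scale a per-second outage cost to a full day."""
    return cost_per_second * SECONDS_PER_DAY


# Address-space sizes follow from the address widths.
ipv4_addresses = 2 ** 32
ipv6_addresses = 2 ** 128

print(outage_cost_per_day(187))   # 16156800
print(f"{ipv4_addresses:,}")      # 4,294,967,296
print(f"{ipv6_addresses:.2e}")    # 3.40e+38
```

The daily outage figure matches the slide, and 2^32 is the roughly 4.3 billion IPv4 addresses it cites.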

The Big Data Challenge
The Biggest Big Data challenge of our future
     –   Humans are limited
     –   Sensors are unbounded
     –   “Sensorization” of everything means everything is a sensor

     The problem
     – Don’t know the future value of a dot today
     – Cannot connect dots we don’t have




Current approaches might not be enough in the future




          Understand current state and desired state …

THINK

      ibm.com/bigdata

Big Data, IBM Power Event

  • 1. Big Data Analytics IBM Power Event – Hindsgavl Slot May 2, 2012 Flemming Bagger, Nordic Sales Leader for Big Data Analytics and Data Warehousing Søren Ravn, Consulting IT Specialist for Big Data © 2012 IBM Corporation
  • 2. Why is 2012 the YEAR of Big Data? “Big Data: The next frontier for innovation, competition and productivity” McKinsey Global Institute 2012 will be the year of 'big data' BBC Nov 30 2011 Big Data will be the CIO Issue of 2012 IDC Prediction 2012 report Searches for "big data" on Gartner's website have increased 981% between March 2011 - October 2011 “most enterprise data warehouse (EDW) and BI teams currently lack a clear understanding of big data technologies… They are increasingly asking the question, "How can we use big data to deliver new insights?" Gartner 2012 2 © 2011 IBM Corporation
  • 3. Insights from the IBM Global CEO Study 2010 Vast majority of CEOs experience the New Economic Environment as distinctly different The New Economic Environment Full Sample Nordics 13% 18% 69% 13% 19% 68% More volatile Deeper/faster cycles, more risk 14% 21% 65% 8% 13% 79% More uncertain Less predictable 18% 22% 60% 28% 31% 41% More complex Multi-faceted, interconnected 26% 21% 53% 34% 29% 37% Structurally different Sustained change “Last year’s experience was a wake-up call, like looking into the dark with no light at the end of the tunnel.” CEO, Industrial Products, The Netherlands Not at all/to a limited extent To some extent To a large/very large extent Source: Q7 To what extent is the new economic environment different? Volatile n=1514; Uncertain n=1521; Complex n=1522 ; Structurally different n=1523; Nordics n=83 © 2010 IBM Corporation
  • 4. IBM Institute for Business Value Which underprepared areas are the most critical for CMOs Marketing Priority Matrix 1 Data explosion Underpreparedness Percent of CMOs reporting 2 Social media underpreparedness 1 3 Growth of channel and device choices 70 2 4 Shifting consumer demographics 3 5 Financial constraints 4 6 Decreasing brand loyalty 60 5 7 Growth market opportunities 6 10 7 9 8 ROI accountability 8 11 9 Customer collaboration and influence 50 12 10 Privacy considerations 13 11 Global outsourcing Factors impacting marketing 12 Regulatory considerations 40 Percent of CMOs selecting as “Top five factors” 13 Corporate transparency 0 20 40 60 Mean Source: Q7 Which of the following market factors will have the most impact on your marketing organization over the next 3 to 5 years? n1=1733; Q8 How prepared are you to manage the impact of the top 5 market factors that will have the most impact on your marketing organization over the next 3 to 5 years? 4 n2=149 to 1141 (n2 = number of respondents who selected the factor as important in Q7) © 2011 IBM Corporation
  • 5. Information is at the Center of a New Wave of Opportunity… And Organizations Need Deeper Insights. 44x as much Data and Content Over Coming Decade: 800,000 petabytes in 2009, 35 zettabytes in 2020. 80% of world’s data is unstructured. 1 in 3 business leaders frequently make decisions based on information they don’t trust, or don’t have. 1 in 2 business leaders say they don’t have access to the information they need to do their jobs. 83% of CIOs cited “Business intelligence and analytics” as part of their visionary plans to enhance competitiveness. 60% of CEOs need to do a better job capturing and understanding information rapidly in order to make swift business decisions. © 2012 IBM Corporation
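The "44x" figure on slide 5 can be checked against the two volumes quoted there. The sketch below assumes decimal (SI) units, so that 1 zettabyte equals 1,000,000 petabytes; the variable names are illustrative and do not come from the deck.

```python
# Check the "44x" growth figure from the two data volumes on the slide.
# Assumption: decimal (SI) units, so 1 zettabyte = 1,000,000 petabytes.
PETABYTES_PER_ZETTABYTE = 1_000_000

volume_2009_pb = 800_000                       # 800,000 petabytes in 2009
volume_2020_pb = 35 * PETABYTES_PER_ZETTABYTE  # 35 zettabytes projected for 2020

growth_factor = volume_2020_pb / volume_2009_pb
print(f"Growth 2009 -> 2020: {growth_factor:.2f}x")  # 43.75x, which rounds to 44x
```

Under that unit assumption the ratio is 43.75, consistent with the rounded "44x" on the slide.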
  • 6. The Big Data Conundrum. The percentage of available data an enterprise can analyze is decreasing proportionately to the data available to that enterprise. Quite simply, this means as enterprises, we are getting “more naive” about our business over time. [Chart: Data AVAILABLE to an organization vs. Data an organization can PROCESS]
  • 7. What should a Big Data platform do? The 3 Vs: Analyze a Variety of Information: novel analytics on a broad set of mixed information that could not be analyzed before. Analyze Information in Motion: streaming data analysis; large volume data bursts & ad-hoc analysis. Analyze Extreme Volumes of Information: cost-efficiently process and analyze petabytes of information; manage & analyze high volumes of structured, relational data. Discover & Experiment: ad-hoc analytics, data discovery & experimentation. Manage & Plan: enforce data structure, integrity and control to ensure consistency for repeatable queries.
  • 8. IBM Big Data Strategy: Move the Analytics Closer to the Data. Netezza is for High Economic Value data that requires deep, extensive and frequent analysis, with results delivered in minutes. Streams is for Low Latency, Real Time Analysis of high velocity data, with results delivered sub-second, after which the data is discarded or stored elsewhere. BigInsights is for Discovery and Exploration on data of uncertain economic value, to identify patterns and correlations which can be proceduralised… it can also be used as a lower cost per terabyte store of data that is used or accessed in a non-time-critical manner.
  • 9. Why Didn’t We Use All of the Big Data Before?
  • 10. One customer... Two data worlds. Structured, Repeatable, Linear: Monthly sales reports; Profitability analysis; Customer surveys. Customer: Segment; Social Network; Demographics (Sex, Age Group, etc); Tenure; Rate plan; Credit Rating, ARPU Group. Product/Service: Subscriptions; Rate Plans; Media Type; Category/Classification; Price. Network: Availability; Throughput/Speed; Latency; Location; Facilities. Interface: Discovery; Navigation; Recommendations. Transactions: Voice, SMS, MMS; Data & Web Sessions; Click Streams; Purchases; Downloads; Signaling, Authentication; Probe/DPI. Device: Class; Manufacturer; Model; OS; Media Capability; Keyboard Type. Other labels on the diagram: Virtual Worlds; Starts, Stops; Collaboration; Success Rates; Errors; Social Networking; Throughput; Setup Time; Connection Time; Content Usage; Communities; Recency; Frequency; Monetary; Latency; Blogs/Micro-blogs.
  • 11. Complementary Approaches for Different Use Cases. Traditional Approach: structured, analytical, logical. Platform: Data Warehouse. Traditional Sources: Transaction Data; Internal App Data; Mainframe Data; OLTP System Data; ERP data. Characteristics: Structured, Repeatable, Linear. Examples: Monthly sales reports; Profitability analysis; Customer surveys. New Approach: creative, holistic thought, intuition. Platform: Hadoop, Streams. New Sources: Web Logs; Social Data; Text Data: emails; Sensor data: images; RFID. Characteristics: Unstructured, Exploratory, Iterative. Examples: Brand sentiment; Product strategy; Maximum asset utilization. Enterprise Integration connects the two approaches.
  • 12. IBM Big Data Strategy: Move the Analytics Closer to the Data. Netezza is for High Economic Value data that requires deep, extensive and frequent analysis, with results delivered in minutes. Streams is for Low Latency, Real Time Analysis of high velocity data, with results delivered sub-second, after which the data is discarded or stored elsewhere. BigInsights is for Discovery and Exploration on data of uncertain economic value, to identify patterns and correlations which can be proceduralised… it can also be used as a lower cost per terabyte store of data that is used or accessed in a non-time-critical manner.
  • 13. InfoSphere Streams: Analyze all your data, all the time, just in time. What if you could get IMMEDIATE insight? What if you could analyze MORE kinds of data? What if you could do it with exceptional price/performance? [Diagram labels: Traditional Data, Sensor Events, Signals; More context; Analytic Results; Alerts / Actions; Billing/Transaction Systems; Customer Real-time Offers; Threat Prevention Systems; Enterprise Storage and Warehousing]
  • 14. Traditional Computing: Historical fact finding – Find and analyze information stored on disk – Batch paradigm, pull model – Query-driven: submits queries to static data – Relies on Databases, Data Warehouses – Databases find the needle in the haystack. Stream Computing: Real time analysis of data-in-motion – Analyses data before you store it – A stream of structured or unstructured data – Analytic operations on streaming data in real-time – Streams finds the needle as it’s blowing by. [Diagrams: Query, Data, Results]
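The contrast on slide 14 between the pull model (store first, then query) and the push model (analyze before storing) can be sketched in plain Python. This is only an illustration under stated assumptions: it does not use InfoSphere Streams, and the threshold, the feed, and the function names are hypothetical.

```python
from typing import Iterable, Iterator, List

THRESHOLD = 100.0  # hypothetical alert threshold, not from the deck


def batch_query(stored: List[float]) -> List[float]:
    """Traditional, pull model: the data is already stored; a query scans it."""
    return [value for value in stored if value > THRESHOLD]


def stream_alerts(readings: Iterable[float]) -> Iterator[float]:
    """Streaming, push model: each reading is analyzed as it arrives."""
    for value in readings:
        if value > THRESHOLD:
            yield value  # the result is emitted immediately; nothing is stored here


def sensor_feed() -> Iterator[float]:
    """Stand-in for a live feed (hypothetical data)."""
    yield from [42.0, 180.5, 99.9, 250.0]


# Traditional: land the data, then run the query against the stored copy.
stored_data = list(sensor_feed())
batch_result = batch_query(stored_data)

# Streaming: the same analytic runs on data in motion, before any storage step.
stream_result = list(stream_alerts(sensor_feed()))

print(batch_result, stream_result)
```

Both paths find the same two readings; the difference is when the answer becomes available and whether the full data set has to be stored first.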
  • 15. InfoSphere Streams for superior real time analytic processing. Streams Processing Language (SPL) built for Streaming applications: Reusable operators; Rapid application development; Continuous “pipeline” processing. Compile groups of operators into single processes: Efficient use of cores; Distributed execution; Very fast data exchange; Can be automatic or tuned; Scaled with push of a button. Use the data that gives you a competitive advantage: Can handle virtually any data type; Use data that is too expensive and time sensitive for traditional approaches. Easy to extend: Built in adaptors; Users add capability with familiar C++ and Java. Dynamic analysis: Programmatically change topology at runtime; Create new subscriptions; Create new port properties. Easy to manage: Automatic placement; Extend applications incrementally without downtime; Multi-user / multiple applications. Flexible and high performance transport: Very low latency; High data rates.
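The idea of reusable operators chained into a continuous pipeline can be illustrated with Python generators. This is not SPL and makes no claim about how Streams compiles or places operators; the operator names and the input values are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Iterator


def filter_op(predicate: Callable[[float], bool], source: Iterable[float]) -> Iterator[float]:
    """Reusable operator: pass through only the items that satisfy the predicate."""
    return (item for item in source if predicate(item))


def map_op(func: Callable[[float], float], source: Iterable[float]) -> Iterator[float]:
    """Reusable operator: transform each item as it flows past."""
    return (func(item) for item in source)


def sliding_mean(size: int, source: Iterable[float]) -> Iterator[float]:
    """Reusable operator: emit the mean of the last `size` items once the window is full."""
    window: Deque[float] = deque(maxlen=size)
    for item in source:
        window.append(item)
        if len(window) == size:
            yield sum(window) / size


# Hypothetical readings; negative values stand in for invalid measurements.
readings = [10.0, -1.0, 20.0, 30.0, -5.0, 40.0]

# Operators are composed into one pipeline; items flow through one at a time.
valid = filter_op(lambda x: x >= 0, readings)
scaled = map_op(lambda x: x * 2, valid)
smoothed = list(sliding_mean(2, scaled))

print(smoothed)  # [30.0, 50.0, 70.0]
```

Because each operator only consumes an iterable and produces an iterator, the same building blocks can be rearranged or reused in other pipelines without modification.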
  • 16. IBM Big Data Strategy: Move the Analytics Closer to the Data. Netezza is for High Economic Value data that requires deep, extensive and frequent analysis, with results delivered in minutes. Streams is for Low Latency, Real Time Analysis of high velocity data, with results delivered sub-second, after which the data is discarded or stored elsewhere. BigInsights is for Discovery and Exploration on data of uncertain economic value, to identify patterns and correlations which can be proceduralised… it can also be used as a lower cost per terabyte store of data that is used or accessed in a non-time-critical manner.
  • 17. InfoSphere BigInsights – A Full Hadoop Stack User Interface Integrated Management Development Analytics Install Console Tooling (ODS) Visualization Application Analytics Pig Hive Jaql Avro Zookeeper ML Analytics MapReduce AdaptiveMR Text Analytics Oozie Lucene Storage HBase HDFS GPFS-SNC Data Sources/ Streams DB2 LUW Netezza R Connectors Data Stage DB2 z Teradata Flume Informix Oracle 17 © 2012 IBM Corporation
  • 18. What is Hadoop? Apache Hadoop: free, open source framework for data-intensive applications – Inspired by Google technologies (MapReduce, GFS) – Originally built to address scalability problems of Web search and analytics – Extensively used by Yahoo! Enables applications to work with thousands of nodes and petabytes of data in a highly parallel, cost effective manner – CPU + disks of a commodity box = Hadoop node – Boxes can be combined into clusters – New nodes can be added without changing • Data formats • How data is loaded • How jobs are written. Processing: MapReduce framework – How Hadoop understands and assigns work to the nodes (machines). Storage: Hadoop Distributed File System = HDFS – Where Hadoop stores data – A file system that spans all the nodes in a Hadoop cluster – It links together the file systems on many local nodes to make them into one big file system.
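The MapReduce model named on slide 18 can be sketched as a single-process word count in plain Python. This is only an illustration of the map, shuffle, and reduce steps; it does not use the Hadoop APIs, and on a real cluster the framework would distribute these steps across nodes. Function names and input lines are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key (done by the framework on a cluster)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


# Hypothetical input; each line could live on a different node.
lines = ["big data big insights", "data in motion"]

mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())

print(counts)
```

Since each map call sees only one line and each reduce call sees only one key, the work can be split across machines without changing how the job is written.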
  • 19. Machine Learning Analytics. SystemML: a machine learning engine invented by IBM Research for native use on BigInsights. Directly implementing ML algorithms on MapReduce is difficult – Natural mathematical operators need to be re-expressed in terms of key-value pairs, map and reduce functions – Data characteristics dictate the optimal MapReduce implementation, so the user bears responsibility for efficient hand-coding. Sample Uses – Finding non-obvious data correlations over Internet-scale data collections • E.g. Topic Modeling, Recommender Systems, Ranking, …
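To make the point about re-expressing mathematical operators concrete, the sketch below writes a sparse matrix-vector product y = A·x as key-value pairs with a map step and a reduce step. It is plain Python for illustration only, not SystemML or DML, and the matrix, vector, and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Sparse matrix A as (row, column, value) triples and a dense vector x.
# Both are hypothetical example data.
matrix: List[Tuple[int, int, float]] = [(0, 0, 2.0), (0, 2, 1.0), (1, 1, 5.0)]
vector: Dict[int, float] = {0: 1.0, 1: 2.0, 2: 4.0}


def map_entry(row: int, col: int, value: float) -> Tuple[int, float]:
    """Map: each non-zero a_ij emits the pair (i, a_ij * x_j)."""
    return row, value * vector[col]


def reduce_row(row: int, partials: List[float]) -> Tuple[int, float]:
    """Reduce: summing the partial products for row i gives y_i."""
    return row, sum(partials)


# Shuffle step: group the partial products by row index.
grouped: Dict[int, List[float]] = defaultdict(list)
for row, col, value in matrix:
    key, partial = map_entry(row, col, value)
    grouped[key].append(partial)

result = dict(reduce_row(row, partials) for row, partials in grouped.items())
print(result)  # {0: 6.0, 1: 10.0}
```

What is a single expression in mathematical notation becomes three hand-written steps here, which is the kind of effort the slide says a higher-level engine is meant to remove.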
  • 20. Statistical and Predictive Analysis. Framework for machine learning (ML) implementations on Big Data – Large, sparse data sets, e.g. 5B non-zero values – Runs on large BigInsights clusters with 1000s of nodes. Productivity – Build and enhance predictive models directly on Big Data – High-level language: Declarative Machine Learning Language (DML) • E.g. 1500 lines of Java code boil down to 15 lines of DML code – Parallel SPSS data mining algorithms implementable in DML. Optimization – Compile algorithms into optimized parallel code – For different clusters – For different data characteristics – E.g. 1 hr. execution (hand-coded) down to 10 mins. [Chart: Execution Time (sec), 0 to 4500, vs. # non zeros (million), 0 to 2000; series: Java Map-Reduce, SystemML, Single node R]
  • 21. Customer Use Case: Log Analytics (storing computer logs & transaction data). Business Problem: The size and volume of log data generated by computer systems constrains the ability of many enterprises to create and maintain effective platforms for compliance and analysis. IBM Solution: – Ingests all system logging at low latency (under 15 minutes) and re-assembles the transactions into a whole, providing exact details on system component response times and trending. – This solution can store more than a year’s worth of data. – An analytics layer can be delivered through a web front-end, with standard browser-based tooling for ad-hoc analytics.
  • 22. Log Analysis is a Big Data Problem. Volume – Large number of devices – Logs generated at hardware, firmware, OS and middleware – Aggregation over time for predictive analysis generates vast amounts of log data. Velocity – Online analysis needed to explore the data to discover meaningful correlations. Variety – Log formats lack a unified structure • Variation across device types, firmware and middleware versions – Log data needs to be supplemented with additional data • Performance and Availability/Fault data • Reference data
  • 23. Log Analysis - why? IBM and its customers have huge amounts of log data: System logs, Application logs. We know there is valuable information hidden in these logs. Anomaly detection: What kind of alerts should I add to my automated monitoring system? Root cause analysis: What sequence of minor problems caused this major problem? Resource planning: Where do I need to add redundancy? When should a particular machine be replaced? Marketing: How can I turn more of the visitors to my site into customers? But getting that information out requires extraction, transformation and complex statistical analysis at scale.
  • 24. Insight into your logs. [Diagram labels: Programmer, Developer, Data Analyst, Analytics End User; Logs, Import, Transform, Analyze, Ad-Hoc Exploration, Reports, Dashboards & Alerts] Import – Log files, performance data, fault data, reference data (network topology, device dictionaries) from various source systems into HDFS. Transform – Identify record boundaries, extract information from text, identify patterns – Find cross log relationships and integration across diverse data sources – Build indexes. Analyze – Sessionization: identify which records are part of the same sessions – Identify subsequences containing fault or performance issue – Observe correlations – Predictive operators. Visualize – Ad-Hoc exploration with BigSheets – Institutionalizing the knowledge gleaned from Ad-Hoc exploration (Network operating center dashboards, reports, alerts).
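Sessionization, listed under Analyze on slide 24, can be sketched as grouping records from the same source whenever consecutive timestamps are close together. The deck does not say how sessions are defined, so the 30-minute inactivity gap, the record layout, and the names below are all assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Assumption: a gap of more than 30 minutes starts a new session.
SESSION_GAP_SECONDS = 30 * 60


def sessionize(records: List[Tuple[str, int]]) -> Dict[str, List[List[int]]]:
    """Group (source, timestamp) records into per-source sessions.

    Records are sorted by source and time; a record joins the current
    session if it falls within the gap of the previous record, and
    otherwise opens a new session.
    """
    sessions: Dict[str, List[List[int]]] = {}
    for source, timestamp in sorted(records):
        source_sessions = sessions.setdefault(source, [])
        if source_sessions and timestamp - source_sessions[-1][-1] <= SESSION_GAP_SECONDS:
            source_sessions[-1].append(timestamp)
        else:
            source_sessions.append([timestamp])
    return sessions


# Hypothetical log records: (source id, timestamp in seconds).
log_records = [
    ("device-a", 0),
    ("device-b", 100),
    ("device-a", 600),
    ("device-a", 4000),
    ("device-b", 200),
]

sessions = sessionize(log_records)
print(sessions)
```

Here `device-a` ends up with two sessions because the jump from 600 to 4000 seconds exceeds the assumed gap, while `device-b` has a single session.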
  • 25. Optimizing capital investments based on double-digit Petabyte analysis. Business Challenge: Wind turbines are expensive, have a service life of ~25 years. Existing process for turbine placements requires weeks of analysis, uses subset of available data and does not yield optimal results. Project objectives: Leverage large volume of weather data to optimize placement of turbines (2+ PB today; ~20 PB by 2015). Reduce modeling time from weeks to hours. Analyze data from turbines to optimize ongoing operations. Solution Components: IBM InfoSphere BigInsights Enterprise Edition: GPFS-based file system capable of running Hadoop and non-Hadoop apps; powerful, extensible query support (JAQL); read-optimized column storage. IBM xSeries hardware. The benefits: Clear fulfillment of Vestas business needs through IBM technology and expertise. Reliability, security, scalability, and integration needs fulfilled. Standard enterprise software support. Single-vendor solution for software, hardware, storage, support.
  • 26. The Big Data Challenge. 7/25/2008: Google passes 1 trillion URLs. $187/second: cost of last eBay outage ($16,156,800/day). 789.4 PB: current size of YouTube. 2/4/2011: IPv4 address space is exhausted, 4.3 billion addresses have been allocated. 3.4 × 10^38: size of IPv6 address space. 100 million gigabytes: size of Google’s index. 144 million: number of Tweets per day. 1.7 trillion: items at Facebook, 90 PB of data. 4.3 billion: mobile devices.
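Several figures on slide 26 follow from simple arithmetic and can be checked directly. The sketch below assumes the standard 32-bit IPv4 and 128-bit IPv6 address widths; variable names are illustrative.

```python
# Outage cost: the per-second figure scaled to a full day.
SECONDS_PER_DAY = 24 * 60 * 60              # 86,400 seconds
outage_cost_per_day = 187 * SECONDS_PER_DAY
print(f"${outage_cost_per_day:,}/day")      # $16,156,800/day

# Address spaces: IPv4 addresses are 32 bits wide, IPv6 addresses 128 bits.
ipv4_addresses = 2 ** 32
ipv6_addresses = 2 ** 128
print(f"IPv4: {ipv4_addresses:,}")          # 4,294,967,296 (about 4.3 billion)
print(f"IPv6: {ipv6_addresses:.1e}")        # 3.4e+38
```

The daily outage cost matches the slide exactly, and 2^128 is roughly 3.4 × 10^38, which can also be written as 340 × 10^36.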
  • 27. The Big Data Challenge. The Biggest Big Data challenge of our future – Humans are limited – Sensors are unbounded – “Sensorization” of everything means – Everything is a sensor. The problem – Don’t know the future value of a dot today – Cannot connect dots we don’t have.
  • 28. Current approaches might not be enough in the future. Understand current state and desired state …
  • 29. THINK ibm.com/bigdata