Architect’s Guide to Designing Integrated
Multi-Product HA-DR-BC Solutions
John Sing, Executive Strategy, IBM       Session E10




John Sing

• 31 years of experience with IBM in high-end servers, storage, and software
    – 2009 - Present: IBM Executive Strategy Consultant: IT Strategy and Planning, Enterprise Large Scale Storage, Internet Scale Workloads and Data Center Design, Big Data Analytics, HA/DR/BC
    – 2002-2008: IBM IT Data Center Strategy, Large Scale Systems, Business Continuity, HA/DR/BC, IBM Storage
    – 1998-2001: IBM Storage Subsystems Group - Enterprise Storage Server Marketing Manager, Planner for ESS Copy Services (FlashCopy, PPRC, XRC, Metro Mirror, Global Mirror)
    – 1994-1998: IBM Hong Kong, IBM China Marketing Specialist for High-End Storage
    – 1989-1994: IBM USA Systems Center Specialist for High-End S/390 processors
    – 1982-1989: IBM USA Marketing Specialist for S/370, S/390 customers (including VSE and VSE/ESA)

• singj@us.ibm.com

• IBM colleagues may access my webpage:
    – http://snjgsa.ibm.com/~singj/

• You may follow my daily IT research blog:
    – http://www.delicious.com/atsf_arizona
Agenda
    •   Understand today’s challenges and best
        practices
         –   for IT High Availability and IT Business Continuity


    •   What has changed? What is the same?

    •   Strategies for:
         –   Requirements, design, implementation


    •   Step by step approach
         –   Essential role of automation
         –   Accommodating petabyte scale
         –   Exploiting Cloud



(Chart: 2012 Cloud deployment options)
Agenda


1. Solving Today’s HA-DR-BC Challenges
2. Guiding HA-DR-BC Principles to mitigate chaos
3. Traditional Workloads vs. Internet Scale Workloads
4. Master Vision and Best Practices Methodology




Recovering today's real-time massive streaming workflows is challenging

(Chart in public domain: IEEE Massive File Storage presentation, author: Bill Kramer, NCSA: http://storageconference.org/2010/Presentations/MSST/1.Kramer.pdf)
Today’s Data and Data Recovery Conundrum:




Many options, including many non-traditional alternatives for user deployments, workload hosting, and recovery models (inter-disciplinary)

• Traditional alternatives:
    – Other platforms
    – Other vendors

• Non-traditional alternatives:
    – The Cloud, the Developing World

(Illustrative Cloud examples only; no endorsement is implied or expressed)
Finally, we have this 'little' problem regarding Mobile proliferation

(Image: Clayton Christensen, Harvard Business School)

• From an IT standpoint, we are clearly seeing "consumerization of IT"

• Key is to recognize and exploit the hyper-pace reality of BYOD's associated data

• Not just the technology

• Also the recovery model ("cloud"), the business model, and the required ecosystem

http://en.wikipedia.org/wiki/Disruptive_innovation
So how do we affordably architect HA / BC / DR in 2012?




What has remained the same?


(Continued good Guiding Principles that mitigate
              HA/DR/BC chaos)




(Diagram: Storage Efficiency, Service Management, Data Protection)
The Business Process is still the Recoverable Unit

(Diagram: business processes A through G at the Business layer, mapped onto Applications 1, 2, and 3 at the Application layer (MQseries, Analytics report, SQL, WebSphere, management reports, decision point, DB2), which in turn run on the Infrastructure layer)

1. An error occurs on a storage device that correspondingly corrupts a database
2. The error impacts the ability of two or more applications to share critical data
3. The loss of both applications affects two distinctly different business processes

IT Business Continuity must recover at the business process level
Cloud does not change the business process; it is still the recovery unit

(Diagram: the same business-process-to-application-to-infrastructure mapping, with a Cloud provider supplying part of the application layer)

1. Data input to the cloud
2. Cloud provider outage
3. The loss of Cloud output affects two distinctly different business processes

Cloud is simply another deployment option, but it doesn't change the fundamental HA/BC approach
When can Cloud recovery provide extremely fast time to project completion?

• Where entire business process recoverable units can be out-sourced to a Cloud provider
    – Production example: out-sourcing production, or backup/restore, or an integrated, standalone application to a provider
    – Cloud application-as-a-service (AaaS) example: Salesforce.com, etc.

(Diagram: the same business-process-to-application-to-infrastructure mapping as the previous slides)
The trick to leveraging Cloud is:



Understanding that Cloud is simply another
  (albeit powerful) deployment choice



                    Good news:

Fundamental principles for HA/DR/BC haven’t changed

 It’s only the deployment options that have changed


Still true: synergistic overlap of valid data protection techniques

(Diagram: three overlapping circles under "IT Data Protection")

1. High Availability: fault-tolerant, failure-resistant, streamlined infrastructure with an affordable cost foundation
2. Continuous Operations: non-disruptive backups and system maintenance coupled with continuous availability of applications
3. Disaster Recovery: protection against unplanned outages such as disasters through reliable, predictable recovery

• Protection of critical Business data
• Operations continue after a disaster
• Recovery is predictable and reliable
• Costs are predictable and manageable
Four Stages of Data Center Efficiency (pre-requisites for HA/BC/DR)

(Chart: IBM, April 2012)
http://www-935.ibm.com/services/us/igs/smarterdatacenter.html
http://public.dhe.ibm.com/common/ssi/ecm/en/rlw03007usen/RLW03007USEN.PDF
Still true: Timeline of an IT Recovery (telecom bandwidth is still the major delimiter for any fast recovery)

(Diagram: timeline running from the outage through full recovery)

• Recovery Point Objective (RPO): how much data must be recreated, measured backward from the point of the outage
• After the outage: assess, establish management control, then execute hardware, operating system, and data integrity recovery (operations staff, network staff, applications staff, telecom network, physical facilities)
• Recovery Time Objective (RTO) of hardware data integrity: the point at which infrastructure and data are back
• Application transaction integrity recovery follows; the RTO of transaction integrity is reached when transactions are consistent again ("Now we're done!")
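A minimal sketch (not from the deck) of the arithmetic behind the timeline above: RPO is driven by how far the recoverable copy lags production, while RTO accumulates the serialized recovery steps. All function names and durations below are hypothetical examples.

def estimate_rpo_minutes(replication_lag_min: float, last_consistent_copy_age_min: float) -> float:
    """RPO ~ how far behind production the recoverable copy is when the outage hits."""
    return replication_lag_min + last_consistent_copy_age_min

def estimate_rto_minutes(step_durations_min: dict) -> float:
    """RTO ~ sum of the serialized steps in the recovery timeline."""
    return sum(step_durations_min.values())

rpo = estimate_rpo_minutes(replication_lag_min=5, last_consistent_copy_age_min=0)
rto = estimate_rto_minutes({
    "assess_and_declare_disaster": 30,
    "establish_management_control": 15,
    "hw_os_data_integrity_recovery": 120,
    "application_transaction_recovery": 60,
})
print(f"Estimated RPO: {rpo} min, estimated RTO: {rto} min")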
Still true: value of Automation for real-time failover

(Diagram: the same recovery timeline, compressed when each stage, from assessment through hardware, data, and transaction recovery, is automated end to end)

Value of automation:
• Reliability
• Repeatability
• Scalability
• Frequent Testing
Still true: Organize High Availability, Business Continuity Technologies
Balancing recovery time objective with cost / value

(Chart: cost / value versus Recovery Time Objective, from 15 minutes to days; guidelines only. The higher tiers recover from a disk image, the lower tiers from a tape copy.)

• BC Tier 7 – Add server or storage replication with end-to-end automated server recovery
• BC Tier 6 – Add real-time continuous data replication, server or storage
• BC Tier 5 – Add application/database integration to Backup/Restore
• BC Tier 4 – Add Point-in-Time replication to Backup/Restore
• BC Tier 3 – VTL, data de-duplication, remote vault
• BC Tier 2 – Tape libraries + automation
• BC Tier 1 – Restore from tape
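A small illustrative sketch (not from the deck) of how a tier chart like this can be used in planning: encode the tiers as data and pick the least costly tier whose typical recovery time still meets the target RTO. The RTO figures below are hypothetical examples, not values taken from the chart.

BC_TIERS = [
    # (tier, technique, assumed typical RTO upper bound in hours)
    (7, "Server/storage replication + end-to-end automated recovery", 0.25),
    (6, "Real-time continuous data replication (server or storage)", 4),
    (5, "Application/database integration with Backup/Restore", 8),
    (4, "Point-in-Time replication added to Backup/Restore", 12),
    (3, "VTL, data de-duplication, remote vault", 16),
    (2, "Tape libraries + automation", 24),
    (1, "Restore from tape", 72),
]

def pick_tier(target_rto_hours: float):
    """Return the lowest (least costly) tier whose typical RTO still meets the target."""
    candidates = [t for t in BC_TIERS if t[2] <= target_rto_hours]
    return min(candidates, key=lambda t: t[0], default=BC_TIERS[0])

print(pick_tier(target_rto_hours=6))   # -> the Tier 6 entry in this toy model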
Still true: Replication Technology Drives RPO

For example, moving from the longest to the shortest Recovery Point (weeks / days / hours / minutes / seconds):

• Tape backup
• Periodic replication
• Asynchronous replication
• Synchronous replication / HA
Still true: Recovery Automation Drives Recovery Time

For example, moving from the longest to the shortest Recovery Time (weeks / days / hours / minutes / seconds):

• Manual tape restore
• Storage automation
• End-to-end automated clustering

Recovery Time includes:
    – Fault detection
    – Recovering data
    – Bringing applications back online
    – Network access
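A minimal sketch (not a product implementation) of what "end-to-end automated clustering" covers: one orchestration path that handles all four components of Recovery Time listed above. The infrastructure calls are stubbed placeholders; a real solution would invoke monitoring, replication, cluster, and network/DNS tooling here.

def primary_site_healthy() -> bool:
    return False  # stub: pretend the health probe has just detected a failure

def promote_replica_storage() -> None:
    print("storage: replica promoted to writable")         # recovering data

def start_application_cluster(site: str) -> None:
    print(f"cluster: applications restarted at {site}")     # applications back online

def redirect_network_to(site: str) -> None:
    print(f"network: traffic redirected to {site}")         # network access

def fail_over() -> None:
    if not primary_site_healthy():                           # fault detection
        promote_replica_storage()
        start_application_cluster("dr-site")
        redirect_network_to("dr-site")
        print("automated failover complete")

fail_over()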
Still true: "ideal world" construct for IT High Availability and Business Continuity

Business processes drive strategies and they are integral to the Continuity of Business Operations. A company cannot be resilient without having strategies for alternate workspace, staff members, call centers and communications channels.

(Diagram: a resilience methodology flowing from Business Prioritization, through Integration into IT, to Manage, wrapped in Awareness, Regular Validation, Change Management, Quarterly Management Briefings and overall Resilience Program Management)

• Risk assessment: risks, vulnerabilities, threats, and the impact of outages on business areas (business impact analysis)
• RTO/RPO program assessment: maturity model, measured ROI, roadmap for the program
• Current capability review
• Program design and strategy design: crisis team, business resumption, disaster recovery, high availability; High Availability design, High Availability servers, storage and data replication, database and software design
• Implement, then estimate recovery time and evaluate the program
• Covering: 1. People, 2. Processes, 3. Plans, 4. Strategies, 5. Networks, 6. Platforms, 7. Facilities

Source: IBM STG, IBM Global Services
The 2012 Bottom line (IT Business Continuity Planning Steps)

For today's real world environment..........

Need a faster way than even this simplified 2007 version:

1. Collect information for this "ideal" process (i.e. how to streamline prioritization)
2. Vulnerability, risk assessment, scope
3. Define BC targets based on scope
4. Solution option design and evaluation
5. Recommend solutions and products
6. Recommend strategy and roadmap

2012 key #1: need a basic Data Strategy
2012 key #2: Workload type

(Diagram: the same resilience methodology wheel as the previous slide)
Streamlined BC Actions - 2005 version

Input -> Action -> Output

• Business processes, key performance indicators, IT inventory -> 1. Collect info for prioritization -> Scope, resource, business impact; component effect on business processes
• List of vulnerabilities -> 2. Vulnerability / risk assessment -> Defined vulnerabilities
• Existing BC capability, KPIs, targets, and success rate -> 3. Define desired HA/BC targets based on scope -> Defined BC baseline targets, architecture, decision and success criteria
• Technologies and solution options -> 4. Solution design and evaluation -> Business process segments and solutions
• Generic solutions that meet criteria -> 5. Recommend solutions and products -> Recommended IBM solutions and benefits
• Budget, major project milestones, resource availability, business process priority -> 6. Recommend strategy and roadmap -> Baseline Business Continuity strategy, roadmap, benefits, challenges, financial implications and justification
Streamlined BC Actions - 2012 version

Same input -> action -> output flow as the 2005 version, with two additions:

• Do a basic HA/DR Data Strategy as part of 1. Collect info for prioritization and 2. Vulnerability / risk assessment
• Exploit the Workload Type in 5. Recommend solutions and products
How do we get there in 2012?

Bottom line #1: have a basic Data Strategy

Bottom line #2: Exploit Workload type

(Diagram: Storage Efficiency, Service Management, Data Protection)
i.e. #1: It's all about the Data

Now, what do I mean by that?
What is a basic Data Strategy? Specify data usage over its lifespan

(Chart: Frequency of Access and Use declining over Time, across three phases: applications create data -> information and data management -> information archive / retain / delete)
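A minimal sketch (not from the deck) of one way to make "data usage over its lifespan" concrete: an aging policy that moves data to cheaper tiers as its frequency of access drops and eventually deletes it. The thresholds and tier names are hypothetical examples.

from dataclasses import dataclass

@dataclass
class LifecycleRule:
    hot_days: int        # keep on primary (frequently accessed) storage
    warm_days: int       # then on lower-cost / nearline storage
    retain_days: int     # then archive until retention expires

def placement(age_days: int, rule: LifecycleRule) -> str:
    """Decide where a data object should live, given its age."""
    if age_days <= rule.hot_days:
        return "primary"
    if age_days <= rule.warm_days:
        return "nearline"
    if age_days <= rule.retain_days:
        return "archive"
    return "delete"

erp_rule = LifecycleRule(hot_days=30, warm_days=365, retain_days=7 * 365)
print(placement(200, erp_rule))   # -> "nearline"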
Data strategy = collecting information, prioritizing, vulnerability/risk, scope

Business processes drive strategies and they are integral to the Continuity of Business Operations. A company cannot be resilient without having strategies for alternate workspace, staff members, call centers and communications channels.

(Diagram: the same resilience methodology wheel as before, with the Data Strategy positioned in the risk assessment / business impact analysis stage)

Source: IBM STG, IBM Global Services
Data Strategy Defined

Data Strategy: relationship to Business and IT Strategies

(Diagram, left: Business Strategy (business scope, distinct competencies, business governance; organization, infrastructure, process; skills, tools, processes) alongside IT Strategy (technology scope, system competencies, IT governance; IT infrastructure and processes; skills))

(Diagram, right: a stack in which Business Strategies drive the IT Strategy, which drives the Data Strategy, which shapes the Enterprise IT Architecture and the IT Infrastructure of people, process, structure, data, and technology)
Data Strategy Defined

The role of the basic "Data Strategy" for HA / BC purposes

• Define major data types "good enough"
    – i.e. by major application, by business line...
    – An ongoing journey

• For each data type (you have to know your data, and have a basic strategy for it):
    – Usage
    – Performance and measurement
    – Security
    – Availability
    – Criticality
    – Organizational role
    – Who manages it
    – What standards apply to this data
        • What type of storage it is deployed on
        • What database
        • What virtualization

• Be pragmatic
    – Create a basic, "good enough" data strategy for HA/BC purposes

• Acquire tools that help you know your data

(Diagram: the Business Strategies -> IT Strategy -> Data Strategy -> Enterprise IT Architecture -> IT Infrastructure stack)
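A minimal sketch (not from the deck) of what capturing those per-data-type attributes can look like in practice. Every value below is a hypothetical example; the point is simply that a "good enough" Data Strategy is a small, explicit record per major data type that the HA/BC design can read from.

data_strategy = {
    "erp_orders": {                      # data type, named by application / business line
        "usage": "transactional, read/write, 24x7",
        "performance": "sub-second response, measured monthly",
        "security": "contains PII, encrypted at rest",
        "availability": "99.9%",
        "criticality": "tier 1 (revenue-impacting)",
        "organizational_role": "order-to-cash business process",
        "managed_by": "ERP DBA team",
        "standards": {
            "storage": "replicated enterprise block storage",
            "database": "DB2",
            "virtualization": "storage virtualization / clustered servers",
        },
        "rpo_minutes": 5,                # these targets feed directly into the HA/BC design
        "rto_hours": 1,
    },
}
print(data_strategy["erp_orders"]["criticality"])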
Here's the major difference for 2012:
There are two major types of workloads:

• HA, Business Continuity, Disaster Recovery characteristics
    – Traditional IT: HA/DR/BC can be done "agnostic / after the fact" using replication
    – Internet Scale Workloads: HA/DR/BC must be "designed into the software stack from the beginning"

• Data Strategy
    – Traditional IT: use traditional tools and concepts to understand / know the data; storage/server virtualization and pooling
    – Internet Scale Workloads: proven open source toolset to implement failure tolerance and redundancy in the application stack

• Automation
    – Traditional IT: end-to-end automation of server / storage virtualization
    – Internet Scale Workloads: end-to-end automation of the application software stack, providing failure tolerance

• Commonality
    – Both: apply the master vision and lessons learned from internet scale data centers
Choices for high availability and replication architectures

(Diagram: a production site with a geographic load balancer, site load balancer, web server clusters, application / DB server clusters, server clusters, and disk; and one or more other sites with the same stack plus local backup, point-in-time images, and tape backup. Replication between sites can occur at four levels: workload balancer, application or database replication, server replication, and storage replication.)
Comparing IT BC architectural methods

(Diagram: the same two-site architecture, annotated to show which replication layers are file system / DB / application aware and which are agnostic)

• Application / database / file system replication, workload balancer (file system, DB, application aware)
    – Typically requires the least bandwidth
    – May be required if the scale of storage is very large (i.e. internet scale)
    – Span of consistency is that application, database, or file system only
    – Well understood by database, application, and file system administrators
    – Can be a more complex implementation; must be implemented for each application

• Replication - Server (traditional IT)
    – Well understood by operating system administrators
    – Storage and application independent; uses server cycles
    – Span of recovery limited to that server platform

• Replication - Storage (traditional IT) (file system, DB, application agnostic)
    – Can provide common recovery across multiple application stacks and multiple server platforms
    – Usually requires more bandwidth
    – Requires a storage replication skill set
Principles for Internet Scale Workloads




Internet Scale Workload Characteristics - 1

• Embarrassingly parallel Internet workload (i.e. very low inter-process communication)
    – Immense data sets, but relatively independent records being processed
        • Example: billions of web pages, billions of log / cookie / click entries
    – Web requests from different users are essentially independent of each other
        • Creating natural units of data partitioning and concurrency
        • Lends itself well to cluster-level scheduling / load-balancing
    – Independence = peak server performance is not important
    – What's important is the aggregate throughput of 100,000s of servers

• Workload churn
    – Well-defined, stable high-level APIs (i.e. simple URLs)
    – Software release cycles on the order of every couple of weeks
        • Means Google's entire core of search services was rewritten in 2 years
    – Great for rapid innovation
        • Expect significant software re-writes to fix problems on an ongoing basis
    – New products emerge hyper-frequently
        • Often with workload-altering characteristics; example = YouTube
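A tiny illustration (not from the deck) of the "embarrassingly parallel" point above: independent records can be partitioned across workers that never talk to each other until a final aggregation step. The log format and partitioning scheme are made up for the example.

from collections import Counter
from multiprocessing import Pool

def count_clicks(partition):
    """Map step: each worker counts clicks per page within its own partition."""
    return Counter(line.split()[0] for line in partition)

def merge(partial_counts):
    """Reduce step: aggregate the independent per-partition answers."""
    total = Counter()
    for c in partial_counts:
        total.update(c)
    return total

if __name__ == "__main__":
    log = ["/home user1", "/cart user2", "/home user3", "/checkout user2"]
    partitions = [log[0::2], log[1::2]]          # natural units of data partitioning
    with Pool(2) as pool:                        # aggregate throughput, not per-server speed
        print(merge(pool.map(count_clicks, partitions)))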
Internet Scale Workload Characteristics - 2

• Platform homogeneity
    – A single company owns, has the technical capability for, and runs the entire platform end-to-end, including an ecosystem
    – Most Web applications are more homogeneous than traditional IT
    – With an immense number of independent worldwide users

• Fault-free operation via application middleware
    – Some type of failure occurs every few hours, including software bugs
    – All hidden from users by fault-tolerant middleware
    – Means the hardware and software don't have to be perfect
    – (1% - 2% of all Internet requests fail*; users can't tell the difference between the Internet being down and your system being down; hence 99% is good enough)

• Immense scale
    – Workload can't be held within 1 server, or within a maximum-size, tightly-clustered, memory-shared SMP
    – Requires clusters of 1000s, 10000s of servers with corresponding PBs of storage, network, power, cooling, software
    – Scale of compute power also makes possible apps such as Google Maps, Google Translate, Amazon Web Services EC2, Facebook, etc.

*The Data Center as a Computer: Introduction to Warehouse Scale Computing, p.81, Barroso, Holzle
http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
IT architecture at internet scale
                     •   Internet scale architectures’ fundamental assumptions:
                          –   Distributed aggregation of data
                          –   High Availability / failure-tolerance functionality lives in software on the server
                          –   Time to Market is everything
                                • Breakage is “OK” if the user can be insulated from it
                          –   Affordability is everything
                          –   Use open source software wherever possible
                          –   Expect that something, somewhere in the infrastructure will always be broken
                          –   Infrastructure is designed top-to-bottom to address this

                     •   Driving criteria: cost, plus extreme scale, parallelism, performance, real time, and time to market
                          –   All other criteria are driven off of these



                                                                 38
For Internet Scale workloads, an Open Source-based, internet-scale software stack

Example shown is the 2003-2008 Google version:



    1. Google File System Architecture – GFS II

    2. Google Database - Bigtable

    3. Google Computation - MapReduce

    4. Google Scheduling - GWQ


                              Reliability, redundancy all in
  The OS or HW doesn’t do       the “application stack”
   any of the redundancy



                              39
Internet-scale HA/DR/BC for Internet Scale Workloads

[Diagram: input from the Internet flows through the IT infrastructure to your customers; each red block is an inexpensive server with plenty of power for its portion of the workflow]




                                                          40
Warehouse Scale Computer programmer productivity framework example

          •   Hadoop – overall name of the software stack
          •   HDFS – Hadoop Distributed File System
          •   MapReduce – software compute framework (Map = queries, Reduce = aggregates answers; see the sketch below)
          •   Hive – Hadoop-based data warehouse
          •   Pig – Hadoop-based data-flow language
          •   HBase – non-relational database for fast lookups
          •   Flume – populates Hadoop with data
          •   Oozie – workflow processing system
          •   Whirr – libraries to spin up Hadoop on Amazon EC2, Rackspace, etc.
          •   Avro – data serialization
          •   Mahout – data mining
          •   Sqoop – connectivity to non-Hadoop data stores
          •   BigTop – packaging / interop of all Hadoop components




               http://wikibon.org/wiki/v/Big_Data:_Hadoop%2C_Business_Analytics_and_Beyond
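To make the Map/Reduce split concrete, a minimal word-count sketch in plain Python follows (an illustrative assumption only; real Hadoop jobs use the Java MapReduce API or higher-level tools such as Hive or Pig):

    # Minimal word-count sketch of the MapReduce pattern (illustrative; not the
    # Hadoop API itself).
    from collections import defaultdict
    from itertools import chain

    def map_phase(document):
        # Map = queries: emit (key, value) pairs from one independent input split.
        return [(word.lower(), 1) for word in document.split()]

    def reduce_phase(pairs):
        # Reduce = aggregates answers: combine all values that share a key.
        totals = defaultdict(int)
        for key, value in pairs:
            totals[key] += value
        return dict(totals)

    splits = ["high availability matters", "availability and recovery matters"]
    print(reduce_phase(chain.from_iterable(map_phase(s) for s in splits)))
    # -> {'high': 1, 'availability': 2, 'matters': 2, 'and': 1, 'recovery': 1}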



                                                                                41
Summary - two major types of approaches, depending
on workload type:
                                  Traditional IT                            Internet Scale Workloads

HA, Business Continuity,         HA/DR/BC can be done “agnostically,        HA/DR/BC must be designed into the
Disaster Recovery                after the fact” using replication          software stack from the beginning
Characteristics

Data Strategy                    Use traditional tools/concepts to          Proven open source toolset to
                                 understand / know the data;                implement failure tolerance and
                                 storage/server virtualization              redundancy in the application stack
                                 and pooling

Automation                       End-to-end automation of server /          End-to-end automation of the
                                 storage virtualization                     application software stack,
                                                                            providing failure tolerance

Commonality                      Apply master vision and lessons            Apply master vision and lessons
                                 learned from internet scale data           learned from internet scale data
                                 centers                                    centers



                                                                   42
Principles for Architecting IT HA / DR / Business Continuity




                                              43
Key strategy: segment data into logical storage pools by appropriate Data Protection
 characteristics (animated chart)
  Mission
  Critical
                              •   Continuous Availability (CA) – E2E automation enhances RDR
                                      –       RTO = near-continuous, RPO = as small as possible (Tier 7)
                                      –       Priority = uptime, with high value justification

                                  •   Rapid Data Recovery (RDR) – enhance backup/restore
                                          –    For data that requires it
                                          –    RTO = minutes to approximately 2 to 6 hours
                                          –    BC Tiers 6, 4
                                          –    Balanced priorities = Uptime and cost/value




                                          •     Backup/Restore (B/R) – assure efficient foundation
                                                   –    Standardize base backup/restore foundation
                                                   –    Provide universal 12- to 24-hour (approximate) recovery capability
                                                   –    Address requirements for archival, compliance, green energy
                                                   –    Priority = cost



Lower cost, enabled by virtualization — know and categorize your data; this provides the foundation for affordable data protection (a sketch follows below).
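As a concrete, hypothetical illustration of segmenting data by protection characteristics, the sketch below maps each storage pool’s RTO/RPO to the tier bands described above; the pool names and thresholds are assumptions, not a product feature:

    # Minimal sketch (assumptions: hypothetical pools and guideline thresholds).
    # Each pool's RTO/RPO picks the protection technique and its cost band.
    from dataclasses import dataclass

    @dataclass
    class Pool:
        name: str
        rto_hours: float   # recovery time objective
        rpo_hours: float   # recovery point objective

    def protection_band(pool):
        if pool.rto_hours <= 0.25 and pool.rpo_hours <= 0.25:
            return "Continuous Availability (Tier 7): automated failover"
        if pool.rto_hours <= 6:
            return "Rapid Data Recovery (Tiers 4-6): replication / PiT copies"
        return "Backup/Restore (Tiers 1-3): tape, VTL, de-dup foundation"

    for p in (Pool("orders-db", 0.1, 0.05), Pool("reports", 4, 2), Pool("archive", 24, 24)):
        print(f"{p.name}: {protection_band(p)}")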


                                                                                                 44
Virtualization is fundamental to addressing today’s IT diversity




    Virtualization




                                                  45
Consolidated virtualized systems become the Recoverable Units for IT Business Continuity                                      (Virtualization)

  Virtualized IT infrastructure Business Processes




  Virtualized systems become the resource pools that enable recoverability




                                                                                 46
High Availability, Business Continuity Step by Step virtualization journey
             Balancing recovery time objective with cost / value



                   Recovery from a disk image (faster RTO)  . . . . . . . .  Recovery from tape copy (slower RTO)

                   BC Tier 7 – Add server or storage replication with end-to-end automated server recovery
                   BC Tier 6 – Add real-time continuous data replication, server or storage
                   BC Tier 5 – Add application/database integration to Backup/Restore
                   BC Tier 4 – Add Point-in-Time replication to Backup/Restore
                   BC Tier 3 – VTL, data de-duplication, remote vault
                   BC Tier 2 – Tape libraries + automation
                   BC Tier 1 – Restore from tape

                   [Chart axes: Recovery Time Objective (15 min, 1-4 hr, 4-8 hr, 8-12 hr, 12-16 hr, 24 hr, days) vs. Cost / Value]
                   Storage pools form the foundation

                                                                                                47
Storage pools — apply the appropriate server / storage technology, and add automated failover to replicated storage:

   •  Real-time replication (storage, server, or software) → real-time replication to the recovery site
   •  Periodic point-in-time (PiT) replication — file system, point-in-time disk, VTL-to-VTL with de-dup → point-in-time copies
   •  Foundation backup/restore — physical or electronic transport → removable media
   •  Petabyte unstructured — due to usage and large scale, typically uses application-level intelligent redundancy / failure-toleration design, with file-, application-, or disk-to-disk periodic replication

A periodic PiT replication sketch follows below.
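    # Minimal periodic PiT copy sketch (assumption: hypothetical paths;
    # illustrative only — real deployments would use snapshot, FlashCopy,
    # or VTL tooling). Each run copies the source pool into a timestamped
    # snapshot directory, so RPO is bounded by the scheduling interval
    # (e.g., hourly via cron).
    import datetime
    import shutil
    from pathlib import Path

    def take_pit_copy(source="/pools/rapid-recovery", target_root="/replica/pit"):
        stamp = datetime.datetime.now().strftime("%Y%m%dT%H%M%S")
        target = Path(target_root) / stamp
        shutil.copytree(source, target)   # file-system level point-in-time copy
        return target

    # take_pit_copy()   # commented out: the paths above are placeholders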


                                                                           48
Methodology: Traditional IT HA / BC / DR in stages, from bottom up


                                 [Diagram: two sites connected over SAN, each with disk and VTL/de-dup]

                                 •  VTL, de-dup, remote replication at the tape level: IBM ProtecTIER, IBM Virtual Tape Library, IBM Tivoli Storage Manager backup/restore
                                 •  Point-in-time copy, disk-to-disk: IBM FlashCopy / SnapShot; IBM XIV, SVC, DS, SONAS; IBM Tivoli Storage Productivity Center 5.1

                                 Stages (Cost vs. Recovery Time Objective):
                                    –  Foundation: standardized, automated tape backup (Tiers 1, 2)
                                    –  Foundation: electronic vaulting, automation, tape library (Tier 3)
                                    –  Add: point-in-time copy, disk-to-disk, tiered storage (Tier 4)


                                                                            49
Methodology: traditional IT HA / BC / DR in stages, from bottom up
                                 [Diagram: two sites with application integration, data replication, and end-to-end automated failover of servers, storage, and applications]

                                 •  Application integration: Tivoli FlashCopy Manager
                                 •  Server virtualization / dynamic automated failover: VMware; PowerHA on Power Systems
                                 •  If storage-based replication: Metro Mirror, Global Mirror, Hitachi UR; XIV, SVC, DS, other storage; TPC 5.1

                                 Stages (Cost vs. Recovery Time Objective), continuing up the tiers (an automation sketch follows below):
                                    –  Foundation: standardized, automated tape backup (Tiers 1, 2)
                                    –  Foundation: electronic vaulting, automation, tape library (Tier 3)
                                    –  Add: point-in-time copy, disk-to-disk for backup/restore (Tier 4)
                                    –  Automate applications, database for replication and automation (Tier 5)
                                    –  Consolidate and implement real-time data availability (Tier 6)
                                    –  End-to-end automated site failover of servers, storage, applications (Tier 7)
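The Tier 6/7 rows are where end-to-end automation matters most. A toy orchestration sketch follows, assuming a hypothetical site model (an assumption for illustration; real deployments rely on products such as PowerHA, VMware, or storage replication managed by TPC):

    # Toy Tier 7 sketch (assumption: hypothetical Site model, not a product API).
    # Automation drives detection, storage promotion, application restart, and
    # client redirection with no manual steps in the recovery timeline.
    class Site:
        def __init__(self, name, healthy=True):
            self.name, self.healthy, self.log = name, healthy, []

        def promote_replica(self):
            self.log.append("replicated volumes promoted to read/write")

        def start_applications(self):
            self.log.append("applications restarted in dependency order")

        def redirect_clients(self):
            self.log.append("client traffic redirected here")

    def automated_failover(primary, recovery):
        if primary.healthy:
            return "primary healthy: no action"
        recovery.promote_replica()
        recovery.start_applications()
        recovery.redirect_clients()
        return f"failover to {recovery.name}: " + "; ".join(recovery.log)

    print(automated_failover(Site("Site A", healthy=False), Site("Site B")))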


                                                                                      50
Technology Deployments in Cloud

[Diagram: spectrum of cloud deployment options, from the enterprise data center to the public cloud]

   1. Private Cloud (enterprise data center) – client-managed implementation; internal or partner services cloud
   2. Managed Private Cloud (enterprise data center) – consumption models including client-owned and provider-owned assets; delivery options including client premise and hosted; Strategic Outsourcing clients with standardized services
   3. Hosted Private Cloud – co-lo owned and operated
   4. Public Cloud Services – compute cloud and persistent storage, pay-per-usage; supporting compute-centric workloads; finer granularity in the multi-tenancy model; provider-owned assets
   5. Shared Cloud Services – standardized, multi-tenant service; pay-per-usage model with provider-owned assets

                                                                                                                                                                                      51
Cloud as remote site deployment options

[Diagram: the production site replicates to recovery in the Cloud]

   •  Real-time replication (storage, server, or software)
   •  Periodic PiT replication: file system, point-in-time disk, VTL-to-VTL with de-dup
   •  Point-in-time copies with physical or electronic transport
   •  Petabyte-level unstructured storage typically uses intelligent file- or application-level replication, due to its large scale and usage patterns

A sketch of shipping a point-in-time copy to a cloud provider follows below.
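    # Minimal sketch (assumptions: hypothetical bucket and paths; boto3
    # installed and AWS credentials configured). One way to realize
    # "electronic transport" to a cloud recovery site is shipping PiT copies
    # to object storage.
    import datetime
    import boto3

    s3 = boto3.client("s3")

    def ship_pit_copy(local_snapshot_path, bucket="example-dr-bucket"):
        # Electronic transport of a point-in-time copy to the cloud provider.
        key = f"pit-copies/{datetime.date.today().isoformat()}/snapshot.img"
        s3.upload_file(local_snapshot_path, bucket, key)
        return key

    # ship_pit_copy("/backups/orders-db.img")   # placeholders, illustrative only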



                                                                         52
Data strategy with a remote cloud: virtualized storage with automated failover to the replica

   •  Real-time replication (storage, server, or software) → real-time replication
   •  Periodic PiT replication (file system, point-in-time disk, VTL-to-VTL with de-dup) → point-in-time copies
   •  Point-in-time copies with physical or electronic transport → removable media or disk-to-disk replication
   •  Petabyte-level unstructured storage typically uses intelligent file- or application-level replication, due to its large scale and usage patterns


                                                                          53
Local Cloud deployment from a data standpoint

[Diagram: local cloud deployment, including the petabyte unstructured pool]


                              54
Cloud provider responsibility for HA and BC

[Diagram: your production runs in the cloud; recovery is performed by the cloud provider]

   •  Real-time replication (storage, server, or software)
   •  Periodic PiT replication: file system, point-in-time disk, VTL-to-VTL with de-dup
   •  Point-in-time copies with physical or electronic transport
   •  Petabyte-level unstructured storage typically uses intelligent file- or application-level replication, due to its large scale and usage patterns


                                                                          55
Today’s world: High Availability, Business Continuity is a step-by-step data strategy / workload journey
   Balancing recovery time objective with cost / value — with Cloud deployment if needed

   Recovery from a disk image (faster RTO)  . . . . . . . .  Recovery from tape copy (slower RTO)

   BC Tier 7 – Add server or storage replication with end-to-end automated server recovery
   BC Tier 6 – Add real-time continuous data replication, server or storage
   BC Tier 5 – Add application/database integration to Backup/Restore
   BC Tier 4 – Add Point-in-Time replication to Backup/Restore
   BC Tier 3 – VTL, data de-duplication, remote vault
   BC Tier 2 – Tape libraries + automation
   BC Tier 1 – Restore from tape

   [Chart axes: Recovery Time Objective (15 min, 1-4 hr, 4-8 hr, 8-12 hr, 12-16 hr, 24 hr, days) vs. Cost / Value; annotated with Data Strategy and Workload Types]
                                                                                               56
Step-by-step virtualization, High Availability, Business Continuity data strategy
   Balancing recovery time objective with cost / value — with Cloud deployment if needed

   Recovery from a disk image (faster RTO)  . . . . . . . .  Recovery from tape copy (slower RTO)

   Continuous Availability:
      BC Tier 7 – Add server or storage replication with end-to-end automated server recovery
      BC Tier 6 – Add real-time continuous data replication, server or storage
   Rapid Data Recovery:
      BC Tier 5 – Add application/database integration to Backup/Restore
      BC Tier 4 – Add Point-in-Time replication to Backup/Restore
   Backup/Restore:
      BC Tier 3 – VTL, data de-duplication, remote vault
      BC Tier 2 – Tape libraries + automation
      BC Tier 1 – Restore from tape

   [Chart axes: Recovery Time Objective (15 min, 1-4 hr, 4-8 hr, 8-12 hr, 12-16 hr, 24 hr, days) vs. Cost / Value; annotated with Data Strategy and Workload types]


                                                                                               57
Summary – IT High Availability / Business Continuity Best Practices 2012

   Continuous Availability:
      Implement BC Tier 7 – Standardize use of Continuous Availability automated failover

   Rapid Data Recovery:
      Implement Tier 6 – Standardize high-volume data replication method
      Implement Tier 4 – Standardize use of disk-to-disk and point-in-time disk copy

   Backup/Restore:
      Implement Tier 3 – Consolidate and standardize backup/restore methods; implement tape VTL, data de-dup, server / storage virtualization / management tools, basic automation

   Production foundation (Backup/Restore Tiers 1, 2):
      Storage and server virtualization and consolidation; understand my data; define scope of recovery

   Replicated foundation (Backup/Restore Tiers 1, 2):
      SAN and server virtualization and consolidation; implement remote sites (Tiers 1, 2)

   [Chart annotated with: Data strategy, Workload types, Recovery]
                                                                                   58
Summary
•    Understand today’s best practices
      –   for IT High Availability and IT Business Continuity                 (callout: Data Strategy, Workload types)
•    What has changed? What is the same?
      – Principles for requirements = no change
          • Data Strategy
      – Deployment for true internet scale workloads:
          • Application level redundancy


•    Strategies for:
      –   Requirements, design, implementation
      –   In-house vs. out-sourcing
                                                                   Cloud
                                                                deployment
•    Step by step approach                                        options
      –   Automation, virtualization essential
      –   Segment workloads: traditional vs. petabyte scale
      –   Exploiting Cloud



                                                                              59
60

2012_Architects_Guide_Designing_Integrated_Multi-Product_HA_DR_BC_Solutions_v2

  • 1. Architect’s Guide to Designing Integrated Multi-Product HA-DR-BC Solutions John Sing, Executive Strategy, IBM Session E10 1
  • 2. John Sing • 31 years of experience with IBM in high end servers, storage, and software – 2009 - Present: IBM Executive Strategy Consultant: IT Strategy and Planning, Enterprise Large Scale Storage, Internet Scale Workloads and Data Center Design, Big Data Analytics, HA/DR/BC – 2002-2008: IBM IT Data Center Strategy, Large Scale Systems, Business Continuity, HA/DR/BC, IBM Storage – 1998-2001: IBM Storage Subsystems Group - Enterprise Storage Server Marketing Manager, Planner for ESS Copy Services (FlashCopy, PPRC, XRC, Metro Mirror, Global Mirror) – 1994-1998: IBM Hong Kong, IBM China Marketing Specialist for High-End Storage – 1989-1994: IBM USA Systems Center Specialist for High-End S/390 processors – 1982-1989: IBM USA Marketing Specialist for S/370, S/390 customers (including VSE and VSE/ESA) • singj@us.ibm.com • IBM colleagues may access my webpage: – http://snjgsa.ibm.com/~singj/ • You may follow my daily IT research blog – http://www.delicious.com/atsf_arizona 2
  • 3. Agenda • Understand today’s challenges and best practices – for IT High Availability and IT Business Continuity • What has changed? What is the same? • Strategies for: – Requirements, design, implementation • Step by step approach – Essential role of automation – Accommodating petabyte scale – Exploiting Cloud 2012 Cloud deployment options 3 3
  • 4. Agenda 1. Solving Today’s HA-DR-BC Challenges 2. Guiding HA-DR-BC Principles to mitigate chaos 3. Traditional Workloads vs. Internet Scale Workloads 4. Master Vision and Best Practices Methodology 4
  • 5. Recovering today’s real-time massive streaming workflows is challenging n d Chart in public domain: IEEE Massive File Storage presentation, author: Bill Kramer, NCSA: http://storageconference.org/2010/Presentations/MSST/1.Kramer.pdf: 5
  • 6. Today’s Data and Data Recovery Conundrum: 6
  • 7. Inter- Many options, including many non-traditional alternatives for Disciplinary user deployments, workload hosting, and recovery models Traditional alternatives: • Non-traditional alternatives: – The Cloud, the Developing World • Other platforms • Other vendors Illustrative Cloud examples only No endorsement is implied or expressed 7
  • 8. Finally, we have this ‘little’ problem regarding Mobile proliferation Clayton Christensen Harvard Business School • From IT standpoint, we are clearly seeing “consumerization of IT” • Key is to recognize and exploit hyper-pace reality of BYOD’s associated data • Not just the technology • Also the recovery model (“cloud), the business model, and the required ecosystem http://en.wikipedia.org/wiki/Disruptive_innovation 8
  • 9. So how do we affordably architect HA / BC / DR in 2012? 9
  • 10. What has remained the same? (Continued good Guiding Principles that mitigate HA/DR/BC chaos) Storage Efficiency Service Management Data Protection 10
  • 11. The Business Process is still the Recoverable Unit Business Business Business Business Business Business Business Business process A process B process C process D process E process F process G 3. The loss of both db2 applications affects two Application Application 2 http://xyz.xml distinctly different Web Sphere business processes MQseries 2. The error impacts management Application 3 Analytics Application 1 the ability of two or report reports decision more applications to SQL point share critical data Infrastructure IT Business Continuity 1. An error occurs on a storage device that must recover at the correspondingly corrupts a database business process level 11
  • 12. Cloud does not change business process; still the recovery unit Business Business Business Business Business Business Business Business process A process B process C process D process E process F process G 3. The loss of Cloud db2 output affects two Application Application 2 http://xyz.xml distinctly different Web Sphere processes business STOP management Application 3 Analytics Application 1 2. Cloud provider reports decision report outage SQL point Infrastructure Cloud is simply another deployment option 1. Data input to the cloud But doesn’t change HA/BC fundamental approach 12
  • 13. When can Cloud recovery can provide extremely fast time to project completion? • Where entire business process recoverable units can be out-sourced to Cloud provider – Production example: Out-sourcing production, or backup/restore, or integrated, standalon, application to a provider – Cloud application-as-a-service (AaaS) example: Salesforce.com, etc. Business Business Business Business Business Business Business Business process A process B process C process D process E process F process G db2 Application http://xyz.xml Application 2 Web Sphere MQseries Analytics management Application 3 Application 1 decision report SQL reports point Technical 13
  • 14. The trick to leveraging Cloud is: Understanding that Cloud is simply another (albeit powerful) deployment choice Good news: Fundamental principles for HA/DR/BC haven’t changed It’s only the deployment options that have changed 14
  • 15. Still true: synergistic overlap of valid data protection techniques IT Data Protection 1. High Availability 2. Continuous Operations 3. Disaster Recovery Fault-tolerant, failure- Non-disruptive backups and Protection against unplanned resistant streamlined system maintenance coupled with outages such as disasters infrastructure with continuous availability of through reliable, predictable affordable cost applications recovery foundation Protection of critical Business data Operations continue after a disaster Recovery is predictable and reliable Costs are predictable and manageable 15
  • 16. Four Stages of Data Center Efficiency: (pre-req’s for HA/BC/DR) April 2012 http://www-935.ibm.com/services/us/igs/smarterdatacenter.html http://public.dhe.ibm.com/common/ssi/ecm/en/rlw03007usen/RLW03007USEN.PDF 16
  • 17. Telecom bandwidth still the major delimiter Still true: Timeline of an IT Recovery ==> for any fast recovery Execute hardware, operating system, RPO ? Assess and data integrity recovery Telecom Network Management Control Data Physical Facilities Operating System Outage! Production ☺ Operations Staff Network Staff Applications Staff Recovery Point Objective Recovery Time Objective (RTO) of hardware data integrity RPO Done? transaction Application integrity recovery (RPO) How much data Applications must be recreated? Recovery Time Objective (RTO) of transaction integrity Now we're done! 17
  • 18. Still true: value of Automation for real-time failover ===> RPO ? Assess HW Telecom Network Management Control Data Physical Facilities Operating System Value of automation Outage! Production ☺ Operations Staff Network Staff Applications Staff RTO Trans. •Reliability RPO Recov. H/W •Repeatability Recovery Point Applications •Scalability Objective (RPO) RTO trans. integrity •Frequent Testing How much data must be Now we're done! recreated? 18
  • 19. Still true: Organize High Availability, Business Continuity Technologies Balancing recovery time objective with cost / value Recovery from a disk image Recovery from tape copy BC Tier 7 – Add Server or Storage replication with end-to-end automated server recovery BC Tier 6 – Add real-time continuous data replication, server or storage BC Tier 5 – Add Application/database integration to Backup/Restore e u a V/ t s o C BC Tier 4 – Add Point in Time replication to Backup/Restore BC Tier 3 – VTL, Data De-Dup, Remote vault BC Tier 2 – Tape libraries + Automation BC Tier 1 – Restore l 15 Min. 1-4 Hr.. 4 -8 Hr.. 8-12 Hr.. 12-16 Hr.. 24 Hr.. Days from Tape Recovery Time Objective (guidelines only) 19
  • 20. Still true: Replication Technology Drives RPO For example: Wks Days Hrs Mins Secs Secs Mins Hrs Days Wks Recovery Point Recovery Time Tape Backup Periodic Replication Asynchronous replication Synchronous replication / HA 20
  • 21. Still true: Recovery Automation Drives Recovery Time For example: Wks Days Hrs Mins Secs Secs Mins Hrs Days Wks Recovery Point Recovery Time End to end automated Storage  Recovery Time includes: clustering automation Manual Tape Restore – Fault detection – Recovering data – Bringing applications back online – Network access 21
  • 22. Still true: “ideal world” construct for IT High Availability and Business Continuity Business processes drive strategies and they are integral to the Continuity of Business Operations. A company cannot be resilient without having strategies for alternate workspace, staff members, call centers and communications channels. Business Prioritization Integration into IT Manage Awareness, Regular Validation, Change Management, Quarterly Management Briefings Resilience Program Management e ery ted Tim Cap rrent lity ss RTO/RPO ine ct m Re stima s ra abi bu pa is Cu ra m og ation im lys og ign pr id Pr es cov risk a program Strategy l an E assessment assessment D Design Implement va of s rea itie ts • Maturity ac e crisis team High Availability 1. People mp utag Th abil Model ts design I 2. Processes O and ulner • Measure 3. Plans business ROI High Availability 4. Strategies resumption s, V Servers • Roadmap 5. Networks disaster k for 6. Platforms Ris Storage, Data Program recovery Replication 7. Facilities high Database and availability Software design Source: IBM STG, IBM Global Services 22
  • 23. The 2012 Bottom line: (IT Business Continuity Planning Steps) For today’s real world environment………. Need faster way than even this simplified 2007 version: 2012 key #1: 1. Collect information for this “ideal” process? i.e. how to streamline prioritization need a basic 2. Vulnerability, risk assessment, scope Business Prioritization Awareness, Regular Validation, Change Management, Quarterly Management Briefings Integration into IT Manage Data Strategy Resilience Program Management 3. Define BC targets based on scope e ery ted Cap rrent Tim lity s RTO/RPO es sin t Re stima abi m bu pac is m ra og ion Cu im lys ra og n pr idat cov Pr esig 4. Solution option design and evaluation a E risk an program Strategy l D Implement va assessment assessment Design 2012 key #2: f so • Maturity 1. People ct high availability s 5. Recommend solutions and products pa age crisis team rea itie Model 2. Processes Im ut Workload type design O • Measure Th abil 3. Plans ts ROI and ulner business 4. Strategies • Roadmap High Availability resumption 5. Networks Servers V for Program 6. Platforms 6. Recommend strategy and roadmap ks, 7. Facilities Ris disaster Data recovery Replication high availability Database and Software design 23
• 24. Streamlined BC actions, 2005 version (input → step → output):
   1. Collect info for prioritization. Input: scope, resources, business processes, key performance indicators, IT component inventory. Output: business impact, effect on business processes.
   2. Vulnerability / risk assessment. Output: a defined list of vulnerabilities.
   3. Define desired HA/BC targets based on scope. Input: existing BC capability, KPIs, targets and success rate. Output: defined BC baseline, architecture, targets decision and success criteria.
   4. Solution design and evaluation. Input: technologies and solution options. Output: business process segments and solutions.
   5. Recommend solutions and products. Input: generic solutions that meet the criteria. Output: recommended IBM solutions and benefits.
   6. Recommend strategy and roadmap. Input: budget, major project milestones, resource availability, business and process priority. Output: baseline business continuity strategy, roadmap, benefits, challenges, financial implications and justification.
• 25. Streamlined BC actions, 2012 version: the same six steps, inputs and outputs as the 2005 version (slide 24), with two changes: at step 2 (vulnerability / risk assessment), do a basic HA/DR Data Strategy; at step 5 (recommend solutions and products), exploit the Workload Type.
• 26. How do we get there in 2012? Bottom line #1: have a basic Data Strategy. Bottom line #2: exploit Workload type. (These span storage efficiency, service management and data protection.)
• 27. i.e. #1: It's all about the data. Now, what do I mean by that?
• 28. What is a basic Data Strategy? Specify data usage over its lifespan: applications create data, information and data management governs it, and it is eventually archived / retained / deleted as its frequency of access and use declines over time.
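One way to read this lifecycle is as a simple, age-based placement rule. The sketch below is an assumed illustration (the thresholds and pool names are hypothetical, not a product policy) of moving data toward cheaper protection as its access frequency falls.

```python
# Hypothetical, age-based information lifecycle rule: placement follows
# how recently the data was used. Thresholds are illustrative only.
from datetime import date

def lifecycle_action(created, last_access, today=None):
    today = today or date.today()
    idle_days = (today - last_access).days
    age_days = (today - created).days
    if idle_days <= 30:
        return "keep on primary (fast, replicated) storage"
    if idle_days <= 365:
        return "move to a lower-cost, point-in-time protected pool"
    if age_days <= 7 * 365:
        return "archive to tape or object storage, retain for compliance"
    return "candidate for deletion under the retention policy"

print(lifecycle_action(created=date(2009, 6, 1),
                       last_access=date(2012, 1, 5),
                       today=date(2012, 3, 1)))
```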
• 29. Data strategy = collecting information, prioritizing, vulnerability/risk assessment, scope. (Same "ideal world" construct as slide 22: business processes drive strategies, and a company cannot be resilient without strategies for alternate workspace, staff members, call centers and communications channels.) The basic Data Strategy covers the front end of the high availability, business resumption and disaster recovery programs: collecting information, business prioritization, risk and business-impact assessment, and scoping. Source: IBM STG, IBM Global Services
• 30. Data Strategy defined: its relationship to the Business and IT strategies. Business Strategy (business scope, distinctive competencies, business governance) aligns with IT Strategy (technology scope, systems/IT competencies, IT governance). The Data Strategy sits between the business strategies and the IT strategy, and flows down through the Enterprise IT Architecture into the IT infrastructure, organization and processes: people, data, process, technology, structure, skills and tools.
• 31. Data Strategy defined: the role of the basic "Data Strategy" for HA / BC purposes
   – Define major data types "good enough", i.e. by major application, by business line… (an ongoing journey: you have to know your data)
   – For each data type, capture: usage; performance and measurement; security; availability; criticality; organizational role; who manages it; what standards apply; what type of storage it is deployed on; what database; what virtualization
   – Be pragmatic: create a basic, "good enough" data strategy for HA/BC purposes, and have a basic strategy for this data
   – Acquire tools that help you know your data
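A minimal sketch of what such a "good enough" catalog could look like in practice follows; the field names and sample entries are assumptions for illustration, not a standard schema.

```python
# Assumed, illustrative data-type catalog: one record per major data type,
# capturing the attributes listed on the slide. Field names are hypothetical.
from dataclasses import dataclass

@dataclass
class DataType:
    name: str                 # e.g. "order database", "web clickstream"
    owner: str                # who manages it
    usage: str                # how it is used
    criticality: str          # "mission critical" / "important" / "other"
    availability_target: str  # e.g. "99.99%"
    rto_hours: float          # required recovery time
    rpo_minutes: float        # tolerable data loss
    storage: str              # what type of storage it is deployed on
    database: str             # what database, if any
    standards: str            # applicable standards / compliance

catalog = [
    DataType("order database", "DBA team", "OLTP", "mission critical", "99.99%",
             rto_hours=0.25, rpo_minutes=0, storage="replicated SAN disk",
             database="DB2", standards="PCI"),
    DataType("web clickstream", "analytics team", "batch analytics", "other",
             "99%", rto_hours=24, rpo_minutes=1440, storage="scale-out NAS",
             database="none", standards="internal retention policy"),
]
```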
• 32. Here's the major difference for 2012: there are two major types of workloads, Traditional IT and Internet Scale Workloads.
   – HA, Business Continuity, Disaster Recovery characteristics: Traditional IT can be done "agnostic / after the fact" using replication; Internet Scale HA/DR/BC must be "designed into the software stack from the beginning"
   – Data Strategy: Traditional IT uses traditional tools and concepts to understand / know the data, plus storage/server virtualization and pooling; Internet Scale uses a proven open source toolset to implement failure tolerance and redundancy in the application stack
   – Automation: Traditional IT has end-to-end automation of server / storage virtualization; Internet Scale has end-to-end automation of the application software stack, providing failure tolerance
   – Commonality: both apply the master vision and lessons learned from internet-scale data centers
• 33. Choices for high availability and replication architectures. (Diagram: a production site and a recovery site, each with a geographic load balancer and web / application / DB server clusters over disk storage; replication options are shown at the workload balancer / application / database level, the server level and the storage level, plus local backup, point-in-time images and tape backup to other sites.)
• 34. Comparing IT BC architectural methods (same production / recovery site diagram as the previous slide):
   – Application / database / file system replication and workload balancing (file system, DB, application aware): typically requires the least bandwidth; may be required if the scale of storage is very large (i.e. internet scale); span of consistency is that application, database or file system only; well understood by database, application and file system administrators; can be a more complex implementation, and must be implemented for each application
   – Server replication (traditional IT): well understood by operating system administrators; storage and application independent, but uses server cycles; span of recovery is limited to that server platform
   – Storage replication (traditional IT; file system, DB, application agnostic): can provide common recovery across multiple application stacks and multiple server platforms; usually requires more bandwidth; requires a storage replication skill set
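The trade-offs above can be captured as a rough decision rule; the sketch below is only an assumed illustration, and the thresholds are invented rather than vendor guidance.

```python
# Hypothetical decision helper reflecting the trade-offs on this slide.
def suggest_replication_layer(data_tb, server_platforms, cross_app_consistency):
    if data_tb >= 1000:  # roughly petabyte / internet scale
        return ("application or file-system level replication "
                "(storage-level bandwidth becomes impractical at this scale)")
    if cross_app_consistency or server_platforms > 1:
        return ("storage-level replication "
                "(one recovery mechanism spanning multiple stacks and platforms)")
    return ("server-level replication "
            "(single platform, OS-administrator managed, storage independent)")

print(suggest_replication_layer(data_tb=40, server_platforms=3,
                                cross_app_consistency=True))
```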
  • 35. Principles for Internet Scale Workloads 35
• 36. Internet Scale Workload Characteristics - 1
   – Embarrassingly parallel internet workload: immense data sets, but relatively independent records being processed (example: billions of web pages; billions of log / cookie / click entries); web requests from different users are essentially independent of each other, creating natural units of data partitioning and concurrency and lending itself well to cluster-level scheduling / load balancing; with this independence and very low inter-process communication, peak single-server performance is not important; what matters is the aggregate throughput of 100,000s of servers
   – Workload churn: well-defined, stable high-level APIs (i.e. simple URLs); software release cycles on the order of every couple of weeks (which means Google's entire core of search services was rewritten in 2 years); great for rapid innovation; expect significant software rewrites to fix problems on an ongoing basis; new products emerge hyper-frequently, often with workload-altering characteristics (example: YouTube)
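A tiny sketch of the "embarrassingly parallel" point: because the records are independent, any even partitioning works and no inter-process communication is needed. The data and partitioning scheme below are invented for illustration.

```python
# Illustrative only: independent log records split into partitions that are
# processed concurrently with no coordination between workers.
from multiprocessing import Pool

def count_clicks(partition):
    """Process one partition of independent records."""
    return sum(1 for record in partition if record["event"] == "click")

if __name__ == "__main__":
    records = [{"user": i % 97, "event": "click" if i % 3 else "view"}
               for i in range(1_000_000)]
    n_workers = 8
    # Any even split is a natural unit of partitioning, because the records
    # are independent of each other.
    partitions = [records[i::n_workers] for i in range(n_workers)]
    with Pool(n_workers) as pool:
        total = sum(pool.map(count_clicks, partitions))
    print("total clicks:", total)
```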
• 37. Internet Scale Workload Characteristics - 2
   – Platform homogeneity: a single company owns, has the technical capability for, and runs the entire platform end to end, including an ecosystem; most web applications are more homogeneous than traditional IT, with an immense number of independent worldwide users
   – Fault-free operation via application middleware: some type of failure occurs every few hours, including software bugs, all hidden from users by fault-tolerant middleware; 1% - 2% of all internet requests fail anyway*, and users can't tell the difference between the internet being down and your system being down, hence 99% is good enough and the hardware and software don't have to be perfect
   – Immense scale: the workload can't be held within one server, or within the maximum size of a tightly clustered, memory-shared SMP; it requires clusters of 1,000s to 10,000s of servers with corresponding PBs of storage, network, power, cooling and software; this scale of compute power also makes possible applications such as Google Maps, Google Translate, Amazon Web Services EC2, Facebook, etc.
  *The Data Center as a Computer: Introduction to Warehouse Scale Computing, p.81, Barroso, Holzle http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
• 38. IT architecture at internet scale. Internet-scale architectures' fundamental assumptions:
   – Distributed aggregation of data
   – High availability / failure-tolerance functionality is in software on the server; breakage is "OK" if I can insulate it from the user
   – Time to market is everything
   – Affordability is everything; use open source software wherever possible
   – Expect that something, somewhere in the infrastructure will always be broken; the infrastructure is designed top to bottom to address this
  Criteria: cost, plus extremes of scale, parallelism, performance, real time and time to market; all other criteria are driven off of these.
• 39. For Internet Scale workloads: an open-source-based, internet-scale software stack. Example shown is the 2003-2008 Google version:
   1. Google file system architecture – GFS II
   2. Google database – Bigtable
   3. Google computation – MapReduce
   4. Google scheduling – GWQ
  Reliability and redundancy are all in the "application stack"; the OS or hardware doesn't do any of the redundancy.
• 40. Internet-scale HA/DR/BC IT infrastructure for Internet Scale Workloads. (Diagram: input from the internet, i.e. your customers, flows across racks of servers; each red block is an inexpensive server with plenty of power for its portion of the workflow.)
• 41. Warehouse Scale Computer programmer productivity framework example
   – Hadoop: overall name of the software stack
   – HDFS: Hadoop Distributed File System
   – MapReduce: software compute framework (Map = queries; Reduce = aggregates answers)
   – Hive: Hadoop-based data warehouse
   – Pig: Hadoop-based language
   – HBase: non-relational database for fast lookups
   – Flume: populates Hadoop with data
   – Oozie: workflow processing system
   – Whirr: libraries to spin up Hadoop on Amazon EC2, Rackspace, etc.
   – Avro: data serialization
   – Mahout: data mining
   – Sqoop: connectivity to non-Hadoop data stores
   – BigTop: packaging / interoperability of all Hadoop components
  http://wikibon.org/wiki/v/Big_Data:_Hadoop%2C_Business_Analytics_and_Beyond
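To give a feel for the Map = query / Reduce = aggregate split, here is a minimal, streaming-style MapReduce job written as a single Python file. How such a script is packaged and submitted (for example through the Hadoop Streaming jar with mapper and reducer options) is environment specific, so treat this as an assumed, simplified sketch rather than a production job.

```python
# Minimal sketch of a streaming-style MapReduce job: count requests per URL
# from log lines such as "GET /index.html 200". Illustrative only.
import sys

def mapper():
    # Map: emit (url, 1) for every request line read from stdin
    for line in sys.stdin:
        parts = line.split()
        if len(parts) >= 2:
            print(parts[1] + "\t1")

def reducer():
    # Reduce: sum counts per URL; input arrives grouped/sorted by key
    current_key, total = None, 0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t")
        if key != current_key:
            if current_key is not None:
                print(current_key + "\t" + str(total))
            current_key, total = key, 0
        total += int(value)
    if current_key is not None:
        print(current_key + "\t" + str(total))

if __name__ == "__main__":
    mapper() if sys.argv[1:] == ["map"] else reducer()
```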
• 42. Summary: two major types of approaches, depending on workload type. (This repeats the slide 32 comparison of Traditional IT vs. Internet Scale Workloads: HA/DR/BC done "agnostic / after the fact" using replication vs. "designed into the software stack from the beginning"; traditional tools and concepts to know the data plus storage/server virtualization and pooling vs. a proven open source toolset implementing failure tolerance and redundancy in the application stack; end-to-end automation of server / storage virtualization vs. end-to-end automation of the application software stack; in common, both apply the master vision and lessons learned from internet-scale data centers.)
  • 43. Principles for Architecting IT HA / DR / Business Continuity 43
• 44. Key strategy: segment data into logical storage pools by appropriate data protection characteristics (animated chart)
   – Continuous Availability (CA), mission critical: end-to-end automation enhances RDR; RTO = near continuous, RPO = as small as possible (Tier 7); priority = uptime, with high-value justification
   – Rapid Data Recovery (RDR), enhance backup/restore: for data that requires it; RTO = minutes to (approximate range) 2 to 6 hours; BC Tiers 6, 4; balanced priorities = uptime and cost/value
   – Backup/Restore (B/R), lower cost, assure an efficient foundation: standardize the base backup/restore foundation; provide universal 12- to 24-hour (approximate) recovery capability; address requirements for archival, compliance and green energy; priority = cost
  Enabled by virtualization. Know and categorize your data; this provides the foundation for affordable data protection.
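The segmentation rule on this slide can be expressed as a few lines of classification logic. The sketch below follows the approximate RTO/RPO ranges on the chart; the thresholds and dataset names are assumptions for illustration only.

```python
# Illustrative sketch: place each data set into a protection pool from its
# required RTO/RPO. Thresholds approximate the chart and are not prescriptive.
def protection_pool(rto_hours, rpo_minutes):
    if rto_hours < 0.5 or rpo_minutes < 5:
        return "Continuous Availability (BC Tier 7): replication + automated failover"
    if rto_hours <= 6:
        return "Rapid Data Recovery (BC Tiers 4-6): real-time or point-in-time replication"
    return "Backup/Restore (BC Tiers 1-3): standardized backup, VTL/de-dup, vaulting"

datasets = {"order DB": (0.25, 0), "ERP": (4, 60), "email archive": (24, 1440)}
for name, (rto, rpo) in datasets.items():
    print(name, "->", protection_pool(rto, rpo))
```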
• 45. Virtualization is fundamental to addressing today's IT diversity.
• 46. Consolidated, virtualized systems become the recoverable units for IT Business Continuity. (Diagram: business processes running on a virtualized IT infrastructure.) Virtualized systems become the resource pools that enable the recoverability.
• 47. High Availability, Business Continuity: a step-by-step virtualization journey, balancing Recovery Time Objective with cost/value. (Same BC Tier 1-7 ladder as slide 19, from tape restore up through end-to-end automated replication and recovery, with storage pools as the foundation.)
• 48. Storage pools: apply the appropriate storage technology, then add automated failover to server and replicated storage
   – Real-time replication (storage, server or software)
   – Periodic point-in-time replication: file system, point-in-time disk, VTL to VTL with de-dup
   – Removable media: foundation backup/restore, physical or electronic transport
   – Petabyte unstructured: due to usage and large scale, typically uses application-level intelligent redundancy / failure-toleration design, or file, application, or disk-to-disk periodic replication
• 49. Methodology: traditional IT HA / BC / DR in stages, from the bottom up (chart plots the stages by cost against Recovery Time Objective)
   – Foundation: standardized, automated tape backup (Tiers 1, 2), e.g. IBM Tivoli Storage Manager backup/restore
   – Foundation: electronic vaulting, automation, tape library (Tier 3), e.g. IBM ProtecTier, IBM Virtual Tape Library; VTL, de-dup and remote replication at the tape level
   – Add: point-in-time copy, disk to disk, tiered storage (Tier 4), e.g. IBM FlashCopy, SnapShot; IBM XIV, SVC, DS, SONAS; IBM Tivoli Storage Productivity Center 5.1
• 50. Methodology: traditional IT HA / BC / DR in stages, from the bottom up (continued; builds on the Tier 1-4 foundation of the previous slide)
   – Automate applications and databases for replication and automation (Tier 5), e.g. server virtualization, Tivoli FlashCopy Manager, application integration
   – Consolidate and implement real-time data availability (Tier 6): data replication by server (e.g. VMware, PowerHA on p) or, if storage based, e.g. Metro Mirror, Global Mirror or Hitachi UR on XIV, SVC, DS or other storage, with TPC 5.1
   – End-to-end automated site failover of servers, storage and applications (Tier 7): dynamic, automated application failover across sites
• 51. Cloud technology deployment options in the data center (a spectrum from private to public):
   1. Enterprise data center / private cloud: client-managed implementation; internal or services-partner cloud
   2. Managed private cloud: consumption models including client-owned and provider-owned assets; delivery options including client premise and hosted
   3. Hosted private cloud (co-located; co-lo operated, or co-lo owned and operated): strategic outsourcing clients with standardized services
   4. Shared cloud services: standardized, multi-tenant service; pay-per-usage model with provider-owned assets
   5. Public cloud services (compute cloud, persistent storage): pay-per-usage, supporting compute-centric workloads; finer granularity in the multi-tenancy model; provider-owned assets; serving many enterprises and users
• 52. Cloud as the remote site: deployment options. Production stays in-house; recovery runs in the cloud, fed by:
   – Real-time replication (storage, server or software)
   – Periodic point-in-time replication: file system, point-in-time disk, VTL to VTL with de-dup, point-in-time copies, physical or electronic transport
   – Petabyte-level unstructured storage typically uses intelligent file or application replication, due to its large scale and usage patterns
• 53. Data strategy with a virtualized, automated-failover remote cloud
   – Real-time replication (storage, server or software)
   – Periodic point-in-time replication: file system, point-in-time disk, VTL to VTL with de-dup, point-in-time copies, physical or electronic transport, disk-to-disk replication
   – Removable media
   – Petabyte-level unstructured storage typically uses intelligent file or application replication, due to its large scale and usage patterns
• 54. Local cloud deployment from a data standpoint. (Diagram; the petabyte unstructured pool is called out.)
• 55. Cloud provider responsibility for HA and BC. Your production runs in the cloud, and recovery is provided by the cloud provider, using:
   – Real-time replication (storage, server or software)
   – Periodic point-in-time replication: file system, point-in-time disk, VTL to VTL with de-dup, point-in-time copies, physical or electronic transport
   – Petabyte-level unstructured storage typically uses intelligent file or application replication, due to its large scale and usage patterns
• 56. Today's world: High Availability, Business Continuity is a step-by-step data strategy / workload journey, balancing Recovery Time Objective with cost/value. (Same BC Tier 1-7 ladder as slide 19, now built on Data Strategy and Workload Types, with cloud as a deployment option if needed.)
• 57. Step by step: virtualization, High Availability, Business Continuity, balancing Recovery Time Objective with cost/value. (The BC Tier ladder grouped into Continuous Availability at the top (Tier 7), Rapid Data Recovery in the middle (roughly Tiers 4-6) and Backup/Restore at the base (Tiers 1-3), built on Data Strategy and Workload Types, with cloud deployment if needed.)
• 58. Summary: IT High Availability / Business Continuity best practices 2012
   – Continuous Availability: implement BC Tier 7, standardize the use of automated failover
   – Rapid Data Recovery: implement Tier 6, standardize the high-volume data replication method; implement Tier 4, standardize the use of disk-to-disk and point-in-time disk copy
   – Backup/Restore: implement Tier 3, consolidate and standardize backup/restore methods; implement tape VTL, data de-dup, server / storage virtualization, management tools and basic automation
   – Foundation (Backup/Restore Tiers 1, 2) at both production and recovery sites: replicated SAN, storage and server virtualization and consolidation, remote sites implemented
   – Underpinning everything: understand my data and define the scope of recovery, i.e. data strategy and workload types
• 59. Summary
   – Understand today's best practices for IT High Availability and IT Business Continuity (data strategy, workload types)
   – What has changed? What is the same? Principles for requirements: no change. Data strategy: deployment for true internet-scale workloads means application-level redundancy.
   – Strategies for requirements, design and implementation; in-house vs. outsourcing (cloud deployment options)
   – Step-by-step approach: automation and virtualization are essential; segment workloads, traditional vs. petabyte scale; exploit cloud
  • 60. 60

Editor's Notes

1. http://en.wikipedia.org/wiki/Disruptive_innovation
  2. First, let’s review important IBM 2009 messaging.
3. There are three primary aspects of providing business continuity for key applications and business processes: High Availability, Continuous Operations, and Disaster Recovery. Generally, the higher in the organization, the simpler the term to use. Senior execs are responsible for setting vision and strategy; mid-level managers are more responsible for implementation. So you can get in the door with just BC at the senior level, but you need BC + HA & CO & DR to get in at the manager and director level. "Business Continuity" was preferred by senior IT executives and line-of-business titles. Lower IT titles preferred more detailed naming that spelled out the solution components; they wanted to make it relevant to their more limited responsibilities.
High Availability is the ability to provide access to applications. High availability is often provided by clustering solutions that work with operating systems, coupled with hardware infrastructure that has no single points of failure. If a server that is running an application suffers a failure, the application is picked up by another server in the cluster, and users see minimal or no interruption. Today's servers and storage systems are also built with fault-tolerant architectures to minimize application outages due to hardware failures. In addition, there are many aspects of security embedded in the hardware, from servers to storage to network components, to help protect against unauthorized access. You can think of high availability as resilient IT infrastructure that masks failures and thus continues to provide access to applications.
Continuous Operations: sometimes you must take important applications down for purposes of updating files or taking backups. Fortunately, great progress has been made in recent years in technology for online backups, but even with these advances, sometimes applications must be taken down as planned outages for maintenance or upgrading of servers or storage. You can think of continuous operations as the ability to keep things running when everything is working right, where you do not have to take applications down merely to do scheduled backups or planned maintenance.
Disaster Recovery: the ability to recover a data center at a different site if a disaster destroys the primary site or otherwise renders it inoperable. The characteristics of disaster recovery solutions are that processing resumes at a different site, and on different hardware. (A non-disaster problem, such as a corruption of a key customer database, may indeed be a catastrophe for a business, but it is not a disaster, in this sense of the term, unless processing must be resumed at a different location and on different hardware.) You can think of disaster recovery as the ability to recover from unplanned outages at a different site, something you do after something has gone wrong. Fortunately, some of the solutions that you can implement as preparedness for disaster recovery can also help with High Availability and with Continuous Operations. In this way, your investment in disaster recovery can help your operations even if you never suffer a disaster.
The goal of business continuity is to protect critical business data, to make key applications available, and to enable operations to continue after a disaster. This must be done in such a way that recovery time is both predictable and reliable, and such that costs are predictable and manageable.
  4. http://www-935.ibm.com/services/us/igs/smarterdatacenter.html
5. This animated chart is used to organize "who does what" in a recovery, and to define Recovery Time Objective (RTO) and Recovery Point Objective (RPO). Hardware (servers, storage) can only handle the blue portion of the recovery. All the other necessary processes are important; they are just outside the ability of the hardware/servers/storage to control. Hence they should be acknowledged as important, but also treated as supplemental discussions to be held with the Services team, and thus outside the scope of a storage-only or Tivoli-only discussion. It's good to use this chart to help the audience visually organize who does what, in what order, in a recovery.
6. This animation shows that the previous timeline still applies today. Automation simply makes the multiple steps of the timeline of an IT recovery consistent. Automation also provides an affordable way to handle testing and compliance of the data protection solution.
  7. In summary, the animation shows the storage pool concept – mapped to the different technologies: (click) Backup/Restore (click) Rapid Data Recovery (click) Continuous Availability
8. This slide shows that technology only addresses RPO (i.e. how current is the data?). As we improve the technology, we improve RPO. Notice that RTO (Recovery Time Objective) is not driven by technology. (Next chart)
9. Here we see that automation drives the RTO (recovery time objective). Automation is what affects the RTO, because it addresses all the non-technology factors that take time.
  10. First, let’s review important IBM 2009 messaging.
  11. Rework title – All Information has a lifespan based on business value
  12. Client Issue: How will technologies evolve to meet the needs of business continuity planning? Strategic Planning Assumption: Data replication for disaster recovery will increase in large enterprises from 25 percent in 2004 to 75 percent by 2006 (0.7 probability).
  13. Example of Application / Database replication: DB2 Queue Replication URL: http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0503aschoff/
  14. *The Data Center as a Computer: Introduction to Warehouse Scale Computing, p.81 Barroso, Holzle http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
15. Speed of Decision Making
Data volumes have a major effect on "time to analysis" (i.e., the elapsed time between data reception, analysis, presentation, and decision-maker activities). There are four architectural options (i.e., CEP, OLTP/ODS, EDW, and big data), and big data is most appropriate when addressing slow decision cycles that are based on large data volumes. CEP's requirement for processing hundreds or thousands of transactions per second requires that the decision making be automated using models or business rules. OLTP and ODS support the operational reporting function in which decisions are made at human speed and based on recent data. The EDW — with the time to integrate data from disparate operational systems, process transformations, and compute aggregations — supports historic trend analysis and forecasting. Big data analysis enables the analysis of large volumes of data — larger than can be processed within the EDW — and so supports long-term/strategic and one-off transactional and behavioral analysis.
Processing Complexity
Processing complexity is the inverse of the speed of decision making. In general, CEP has a relatively simple processing model — although CEP often includes the application of behavioral models and business rules that require complex processing on historic data occurring in the EDW or big data analytics phases of the data-processing pipeline. The requirement to process unstructured data at real-time speeds — for example, in surveillance and intelligence applications — is changing this model. Processing complexity increases through OLTP, ODS, and EDW. Two trends are emerging: OLTP is beginning to include an analytics component within the business process and to utilize in-database analytics, and the EDW is exploiting the increasing computational power of the database engine. Processing complexities, and the associated data volumes, are so high within the big data analytics phase that parallel processing is the preferred architectural and algorithmic pattern.
Transactional Data Volumes
Transactional data volume is the amount of data (either the number of records/events or event size) processed within a single transaction or analysis operation. Modern internet IT architectures process a huge number of discrete base events to compute sophisticated, pockets-of-value output. OLTP is similarly concerned with transactional or atomic events. Analysis, with its requirement to process many records simultaneously, starts with ODS, and its complexity grows within the EDW. Big data analytics — with the requirement to model long-term trends and customer behavior on Web clickstream data — processes even larger transactional data volumes.
Data Structure
The prevalence of non-structured data (semi-, quasi-, and unstructured) increases as the data-processing pipeline is traversed from CEP to big data. The EDW layer is increasingly becoming more heterogeneous as other, often non-structured, data sources are required by the analysis being undertaken. This is having a corresponding effect on processing complexity. The mining of structured data is advanced, and systems and products are optimized for this form of analysis. The mining of non-structured data (e.g., text analytics and image processing) is less well understood, computationally expensive, and often not integrated into the many commercially available analysis tools and packages. One of the primary uses of big data analysis is processing Web clickstream data, which is quasi-structured. In addition, the data is not stored within databases; rather, it is collected and stored within files. Some examples of non-structured data that fit with the big data definition include: log files, clickstream data, shopping cart data, social media data, call or support center logs, and telephone call data records (CDRs). There is an increasing requirement to process unstructured data at real-time speeds — for example in surveillance and intelligence applications — so this class of data is becoming more important in CEP processing.
Flexibility of Processing/Analysis
Data management stakeholders understand the processing and scheduling requirements of transactional processing and operational reporting. The stakeholder's ability to build analysis models is well proven. Peaks and troughs commonly occur across various time intervals (e.g., overnight batch processing window or peak holiday period), but these variations have been studied through trending and forecasting. Big data analysis and a growing percentage of EDW processing are ad hoc or one-off in nature. Data relationships may be poorly understood and require experimentation to refine the analysis. Big data analysis models ("analytic heroes") are continually being challenged by new or refined models ("challengers") to see which has better performance or yields better accuracy. The flexibility of such processing is high, and conversely, the governance that can be applied to such processing is low.
Throughput
Throughput, a measure of the degree of simultaneous execution of transactions, is high in transactional and reporting processing. The high data volumes and complex processing that characterize big data analysis are often hardware constrained and have a low concurrency. The scheduling of big data analysis processing is not time-critical. Big data analysis is therefore not suitable for real-time or near-real-time requirements.
Source for graphic: "InfoSphere Streams Architecture", Mike Spicer, Chief Architect, InfoSphere Streams, June 2, 2011.
Source for quote: Dr. Steve Pratt, CenterPoint Energy, May 25, 2011, IBM Smarter Computing Summit, "Managing the Information Explosion" with Brian Truskowski, between 8:20 and 20:40, http://centerlinebeta.net/smarter-computing-palm-springs/index.html
16. http://wikibon.org/wiki/v/Big_Data:_Hadoop%2C_Business_Analytics_and_Beyond
A Hadoop "stack" is made up of a number of components. They include:
- Hadoop Distributed File System (HDFS): the default storage layer in any given Hadoop cluster
- Name Node: the node in a Hadoop cluster that provides the client information on where in the cluster particular data is stored and whether any nodes fail
- Secondary Node: a backup to the Name Node; it periodically replicates and stores data from the Name Node should it fail
- Job Tracker: the node in a Hadoop cluster that initiates and coordinates MapReduce jobs, i.e. the processing of the data
- Slave Nodes: the grunts of any Hadoop cluster; slave nodes store data and take direction to process it from the Job Tracker
In addition to the above, the Hadoop ecosystem is made up of a number of complementary sub-projects. NoSQL data stores like Cassandra and HBase are also used to store the results of MapReduce jobs in Hadoop. In addition to Java, some MapReduce jobs and other Hadoop functions are written in Pig, an open source language designed specifically for Hadoop. Hive is an open source data warehouse originally developed by Facebook that allows for analytic modeling within Hadoop. Following is a guide to Hadoop's components:
- Hadoop Distributed File System: HDFS, the storage layer of Hadoop, is a distributed, scalable, Java-based file system adept at storing large volumes of unstructured data.
- MapReduce: MapReduce is a software framework that serves as the compute layer of Hadoop. MapReduce jobs are divided into two (obviously named) parts. The "Map" function divides a query into multiple parts and processes data at the node level. The "Reduce" function aggregates the results of the "Map" function to determine the "answer" to the query.
- Hive: Hive is a Hadoop-based data warehouse developed by Facebook. It allows users to write queries in SQL, which are then converted to MapReduce. This allows SQL programmers with no MapReduce experience to use the warehouse and makes it easier to integrate with business intelligence and visualization tools such as MicroStrategy, Tableau, Revolution Analytics, etc.
- Pig: Pig Latin is a Hadoop-based language developed by Yahoo. It is relatively easy to learn and is adept at very deep, very long data pipelines (a limitation of SQL).
- HBase: HBase is a non-relational database that allows for low-latency, quick lookups in Hadoop. It adds transactional capabilities to Hadoop, allowing users to conduct updates, inserts and deletes. eBay and Facebook use HBase heavily.
- Flume: Flume is a framework for populating Hadoop with data. Agents are populated throughout one's IT infrastructure – inside web servers, application servers and mobile devices, for example – to collect data and integrate it into Hadoop.
- Oozie: Oozie is a workflow processing system that lets users define a series of jobs written in multiple languages – such as MapReduce, Pig and Hive – then intelligently link them to one another. Oozie allows users to specify, for example, that a particular query is only to be initiated after specified previous jobs on which it relies for data are completed.
- Whirr: Whirr is a set of libraries that allows users to easily spin up Hadoop clusters on top of Amazon EC2, Rackspace or any virtual infrastructure. It supports all major virtualized infrastructure vendors on the market.
- Avro: Avro is a data serialization system that allows for encoding the schema of Hadoop files. It is adept at parsing data and performing remote procedure calls.
- Mahout: Mahout is a data mining library. It takes the most popular data mining algorithms for performing clustering, regression testing and statistical modeling and implements them using the MapReduce model.
- Sqoop: Sqoop is a connectivity tool for moving data from non-Hadoop data stores – such as relational databases and data warehouses – into Hadoop. It allows users to specify the target location inside of Hadoop and instruct Sqoop to move data from Oracle, Teradata or other relational databases to the target.
- BigTop: BigTop is an effort to create a more formal process or framework for packaging and interoperability testing of Hadoop's sub-projects and related components, with the goal of improving the Hadoop platform as a whole.
17. Understanding your data, and categorizing it by recovery time, is essential in order to build a cost-justifiable, affordable solution. Finally, not every client can justify near-continuous availability or rapid data recovery solutions. A balance between the priorities of uptime and cost, in concert with the needs of the business, is always necessary. For example, many clients may find that the appropriate cost/recovery-time equation is that it is not necessary for the data at the remote site to be within seconds; the requirement is only for the data at the remote site to be no more than 12 hours old. These types of recoveries do not require ongoing, real-time consistent updates of data at a remote site. Rather, only a periodic point-in-time copy needs to be made (on disk, or on tape for the lower tiers), and then the copies are simply replicated to a remote site. Server and workload restart is semi-automated or manual.
18. Data center complexity has reached crisis levels and is continuing to increase, thereby limiting improvement and growth.
- Businesses spend a large fraction of their IT budgets on data center resource management rather than on valuable applications and business processes.
- IT management costs are the dominant IT cost component today and have increased over the past ten years in rough proportion to increasing scale-out sprawl.
- Basic forces will drive continuing increases in IT complexity: the number of systems deployed will continue to grow rapidly, driven largely by new applications (for Web 2.0, surveillance, operational asset mgmt., ...) and improving hardware price/performance and utilization (more systems per server); the diversity of IT products will increase as competing suppliers continue to introduce new applications, systems, and management software products; and the coupling of IT components is extensive and increasing, driven by application tiering, growing SOA usage, advances in high-performance standard networks, …
- The resulting increase in IT complexity will further exacerbate the current IT management cost crisis. Managing the increasing IT complexity and scale-out sprawl with traditional IT management software will be increasingly difficult and costly.
- New approaches to data center architectures are needed to simplify IT management and enable growth.
  19. In summary, the animation shows the storage pool concept – mapped to the different technologies: (click) Backup/Restore (click) Rapid Data Recovery (click) Continuous Availability
  20. This is an optional chart, showing the typical Information Availability System Storage technologies that we would apply to the various pools of storage (click) including the fact that large unstructured data probably needs to be recovered using file system or application involvement.
  21. Here is another way of showing the same step by step, incremental-improve concept. It’s a ‘big picture’ positioning the various kinds of technologies that can be deployed, step by step, to provide IT BC solutions – starting from low and moving to the high end of the cost curve. Click to show each one of the steps to come up. Note how the icons show where the data flows, through different types of technologies that we will discuss further today.
22. Building upon the previous chart, we continue clicking to show enhancements to Rapid Data Recovery capabilities, followed by Continuous Availability capabilities. (This chart starts from where the previous chart left off.)
  23. This is an optional chart, showing the typical Information Availability System Storage technologies that we would apply to the various pools of storage (click) including the fact that large unstructured data probably needs to be recovered using file system or application involvement.
  24. This is an optional chart, showing the typical Information Availability System Storage technologies that we would apply to the various pools of storage (click) including the fact that large unstructured data probably needs to be recovered using file system or application involvement.
  25. This is an optional chart, showing the typical Information Availability System Storage technologies that we would apply to the various pools of storage (click) including the fact that large unstructured data probably needs to be recovered using file system or application involvement.
  26. This is an optional chart, showing the typical Information Availability System Storage technologies that we would apply to the various pools of storage (click) including the fact that large unstructured data probably needs to be recovered using file system or application involvement.
  27. In summary, the animation shows the storage pool concept – mapped to the different technologies: (click) Backup/Restore (click) Rapid Data Recovery (click) Continuous Availability
  28. In summary, the animation shows the storage pool concept – mapped to the general categories: (click) Backup/Restore (click) Rapid Data Recovery (click) Continuous Availability
29. Here's yet another way to look at this process. Each step of the process that we've reviewed is shown here in a step-by-step, build-up project visualization. In this case, we show how the timeline of an IT recovery is improved at each step.
  30. Thank you!