Disposable Environments at Scale


        / or: How I Learned to Stop Worrying and Love ZFS
Eric Sproul


   Build Engineer
     Sysadmin
    Consultant


     Twitter:
     @eirescot
Recipe for Success

  enablement (n): the act of providing (someone)
  with adequate power, means, opportunity or authority
  (to do something)


  This is a story of how ZFS enabled business success
Background

   Etsy is the world's handmade marketplace

             Experiencing rapid growth

    Every department needed to understand
        how the business was evolving

              Asked OmniTI for help
The Situation

                  No data warehouse

            Analytical queries against
               PostgreSQL OLTP

           Large initial size (~250 GB)

                Forecast to reach 1 TB+
                     within a year
Problems

           Long-running queries to OLTP
             destroy web performance

                Re-running reports
             produces different results

           Inflexible reporting interface
Requirements

               Relieve pressure from
                  OLTP database

               Make production data
               continuously available

                Enable correlation
              with other data sources

               Flexible reporting UI
Solution

           Create separate BI analytics DB

                   Build it on ZFS
Initial Capabilities


            ETL to collate table-level data
              from multiple databases

              Run deep analytic queries
              without impacting website

       New web UI enables ad-hoc reporting
Reaping the Benefits of ZFS

                    Snapshots

                  Faster backups

              Simple replica creation
Reaping the Benefits of ZFS

                 Compression
           Extend usable life of storage:
                  PgSQL logical: 1.3T
                  On-disk: 653G (2.0x)


              Intelligent resilver
               Shorter rebuilds ==
             Reduced risk of data loss
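
The 2.0x figure follows directly from the two sizes on the slide. A quick check, assuming 1T means 1024G (binary units; this is an assumption, the slide does not say):

```python
# Sizes as shown on the slide. Treating 1T as 1024G is an assumption.
logical_g = 1.3 * 1024   # PgSQL logical size: 1.3T
on_disk_g = 653          # on-disk size after compression: 653G

ratio = logical_g / on_disk_g      # how many logical bytes per stored byte
saved_g = logical_g - on_disk_g    # space that did not have to be bought

print(f"compression ratio: {ratio:.1f}x")
print(f"space saved: about {saved_g:.0f}G")
```

With decimal units (1T = 1000G) the ratio still rounds to 2.0x.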
We Want More!

       Monthly reports now on BI system

         Occasional problems require
             re-running reports

           Still get different results,
                 same as before
We Want More!

          Need to test report changes

        Fine to dev with small mock-up
         (development needs only a small data set)

          Staging requires something
           that looks like production
The Next Level

          Disposable environments

                  Run on slave replica

                 R/W copy of BI data

                 Discard when finished
Disposable Environment
                  Use a non-global zone

    set zonepath=/zones/bistage
    set autoboot=true
    set limitpriv=default,dtrace_proc,dtrace_user
    set ip-type=shared
    add net
      set address=10.1.2.3
      set physical=bnx0
    end
    add dataset
      set name=bi01tank/stage
    end
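
The settings above can be kept as data and rendered into a zonecfg command file. This is a minimal sketch, not the author's tooling: the helper name render_zonecfg, the zone name bistage (guessed from the zonepath) and the leading create and trailing verify/commit lines are all assumptions added for illustration.

```python
def render_zonecfg(props, resources):
    """Return zonecfg subcommands, one per line, for a new zone."""
    lines = ["create"]  # assumption: the slide omits how the zone is created
    for key, value in props.items():
        lines.append(f"set {key}={value}")
    for rtype, rprops in resources:
        lines.append(f"add {rtype}")
        for key, value in rprops.items():
            lines.append(f"set {key}={value}")
        lines.append("end")  # closes the resource scope
    lines += ["verify", "commit"]
    return "\n".join(lines)


script = render_zonecfg(
    {
        "zonepath": "/zones/bistage",
        "autoboot": "true",
        "limitpriv": "default,dtrace_proc,dtrace_user",
        "ip-type": "shared",
    },
    [
        ("net", {"address": "10.1.2.3", "physical": "bnx0"}),
        ("dataset", {"name": "bi01tank/stage"}),  # delegated to the zone
    ],
)
print(script)
```

The output could be saved to a file and applied with zonecfg -z bistage -f <file>, again assuming that zone name.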
Disposable Environment
            Starting state: ZFS datasets

         bi01tank/pgsql/data
         bi01tank/pgsql/data/91
         bi01tank/pgsql/wal_archive
         bi01tank/pgsql/wal_archive/91
Disposable Environment
                    Take snapshots
         zfs snapshot -r bi01tank/pgsql/data@stage bi01tank/pgsql/wal_archive@stage

         bi01tank/pgsql/data@stage
         bi01tank/pgsql/data/91@stage
         bi01tank/pgsql/wal_archive@stage
         bi01tank/pgsql/wal_archive/91@stage
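
A recursive snapshot covers the named dataset and every dataset beneath it, and nothing else. The sketch below is a hypothetical helper, not a ZFS API. Using the four datasets from the starting state, it works out which snapshot names a given set of roots produces, which is a cheap way to confirm that both the data and wal_archive trees are included.

```python
DATASETS = [
    "bi01tank/pgsql/data",
    "bi01tank/pgsql/data/91",
    "bi01tank/pgsql/wal_archive",
    "bi01tank/pgsql/wal_archive/91",
]


def recursive_snapshots(roots, snap, datasets=DATASETS):
    """Snapshot names that `zfs snapshot -r <root>@<snap>` would create."""
    covered = []
    for ds in datasets:
        # A dataset is covered if it is a root or a descendant of one.
        if any(ds == r or ds.startswith(r + "/") for r in roots):
            covered.append(f"{ds}@{snap}")
    return covered


data_only = recursive_snapshots(["bi01tank/pgsql/data"], "stage")
both_trees = recursive_snapshots(
    ["bi01tank/pgsql/data", "bi01tank/pgsql/wal_archive"], "stage"
)
print(data_only)
print(both_trees)
```

Naming only the data tree yields two snapshots. Naming both trees yields the four listed on the slide.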
Disposable Environment
                      Create clones
         zfs clone <src_dataset>@stage <dst_dataset>

         bi01tank/pgsql/data@stage
         bi01tank/pgsql/data/91@stage
         bi01tank/pgsql/wal_archive@stage
         bi01tank/pgsql/wal_archive/91@stage

         bi01tank/stage/data
         bi01tank/stage/data/91
         bi01tank/stage/wal_archive
         bi01tank/stage/wal_archive/91
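
The mapping from snapshots to clone targets is mechanical, so it can be generated. This is a sketch under stated assumptions: clone_commands is a hypothetical helper, and it only builds the argument lists for zfs clone without running anything. It also assumes bi01tank/stage already exists, since that is the dataset delegated to the zone.

```python
def clone_commands(datasets, src_root, dst_root, snap):
    """Build `zfs clone` argument lists mapping src_root/... to dst_root/..."""
    cmds = []
    # Shallowest first: a clone's parent dataset must exist before the child.
    for ds in sorted(datasets, key=lambda d: d.count("/")):
        if not ds.startswith(src_root + "/"):
            raise ValueError(f"{ds} is not under {src_root}")
        suffix = ds[len(src_root):]  # e.g. "/data/91"
        cmds.append(["zfs", "clone", f"{ds}@{snap}", f"{dst_root}{suffix}"])
    return cmds


cmds = clone_commands(
    [
        "bi01tank/pgsql/data",
        "bi01tank/pgsql/data/91",
        "bi01tank/pgsql/wal_archive",
        "bi01tank/pgsql/wal_archive/91",
    ],
    src_root="bi01tank/pgsql",
    dst_root="bi01tank/stage",
    snap="stage",
)
for cmd in cmds:
    print(" ".join(cmd))
```

Each list could be passed to subprocess.run on a host that has the pool. Printing first makes it easy to review the plan before touching real data.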
Disposable Environment

    Zone now sees a full, writable copy of data

   Unchanged data is referenced from the origin snapshot

          Changes accumulate in the clone
Disposable Environment



   [Diagram: three datasets side by side]

     Live FS           Snapshot               Clone
     pgsql/data/91     pgsql/data/91@stage    stage/data/91

   Unchanged data is referenced from the snapshot
   Changes are accounted to the clone
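
The space accounting can be illustrated with a toy copy-on-write model. This is only a sketch of the idea and not how ZFS is implemented: the Snapshot and Clone classes and the block numbering are invented for the example.

```python
class Snapshot:
    """Read-only view of the blocks as they were at snapshot time."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)  # private copy, never modified

    def read(self, block_no):
        return self._blocks[block_no]


class Clone:
    """Writable view that stores only the blocks changed since the origin."""

    def __init__(self, origin):
        self.origin = origin
        self.changed = {}

    def read(self, block_no):
        # Changed blocks come from the clone, everything else from the origin.
        if block_no in self.changed:
            return self.changed[block_no]
        return self.origin.read(block_no)

    def write(self, block_no, data):
        self.changed[block_no] = data

    def used(self):
        """Blocks accounted to the clone itself."""
        return len(self.changed)


snap = Snapshot({0: "a", 1: "b", 2: "c"})
stage = Clone(snap)
stage.write(1, "B")

print(stage.read(0), stage.read(1), snap.read(1), stage.used())
```

The clone starts out costing nothing, and after one write it is charged for one block while the snapshot still returns the original contents.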
Next-Level Results

             Any report can be re-run
                on the same data

             Massage existing data or
          bring more in for ad-hoc report

        Test changes to reports and web UI
Next-Level Results

      When finished with the environment:

                 Shut down zone

        Destroy clones, then origin snapshots
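
Order matters during teardown: a snapshot cannot be destroyed while a clone still depends on it, and a child dataset has to go before its parent. The sketch below builds the command sequence without running it. The helper name teardown_commands and the zone name bistage are assumptions, and zoneadm halt stands in for whatever shutdown procedure is preferred (PostgreSQL should be stopped cleanly first).

```python
def teardown_commands(zone, clones, snapshots):
    """Build the commands to discard a staging environment, in a safe order."""
    cmds = [["zoneadm", "-z", zone, "halt"]]  # assumption: zone is stopped this way
    # Deepest clones first, so children are destroyed before their parents.
    for ds in sorted(clones, key=lambda d: d.count("/"), reverse=True):
        cmds.append(["zfs", "destroy", ds])
    # Snapshots last: they can only go once no clone references them.
    for snap in snapshots:
        cmds.append(["zfs", "destroy", snap])
    return cmds


plan = teardown_commands(
    "bistage",
    clones=[
        "bi01tank/stage/data",
        "bi01tank/stage/data/91",
        "bi01tank/stage/wal_archive",
        "bi01tank/stage/wal_archive/91",
    ],
    snapshots=[
        "bi01tank/pgsql/data@stage",
        "bi01tank/pgsql/data/91@stage",
        "bi01tank/pgsql/wal_archive@stage",
        "bi01tank/pgsql/wal_archive/91@stage",
    ],
)
for cmd in plan:
    print(" ".join(cmd))
```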
Return on Investment

         BI database runs on 2 machines

        OLTP database lifetime extended
          two years past expectation

   Faster, more granular, and ad-hoc reporting
    enables better decisions by management
Bonus!

         With the same technique,
            we can safely test:

          PostgreSQL upgrades

             Schema changes
Thank You

           ZFS, Zones and much more
       are available to the community via
          illumos and its distributions

       Go forth and enable your business!
