Consolidated Sharded Indexes in Real-Time
Jeff Mace



©Continuent 2012.
About Continuent

              • The leading provider of clustering and
                   replication for open source DBMS

              • Tungsten Clustering - Commercial-grade
                   HA, performance scaling and data
                   management for MySQL

              • Tungsten Replication - Flexible, high-
                   performance replication




Working with Sharded Data

              • Schema sharding is a proven strategy for
                   many SaaS application providers

              • Easily add new servers to handle growth
              • Costly operations by one customer do not
                   affect all other customers

              • But it’s difficult to work with data spread
                   across all schemas




An Example

              • A franchise business with many locations
              • A consolidated table of all accounts and
                   their balances is needed for BI

              • Some lag is ok but not more than a few
                   seconds

              • This example can be applied to many
                   scenarios using the same techniques




An Example

                   NYC Branch                SFO Branch
              id        balance         id        balance
               1        $234.78          1        $820.20
               2        $892.24          2        $240.27
               3        $1,023.76        3        $527.63
               4        $521.08
               5        $982.62




An Example

                   id          balance    schema
                    1   $234.78          nyc
                    2   $892.24          nyc
                    1   $820.20          sfo
                    2   $240.27          sfo
                    3   $1,023.76        nyc
                    4   $521.08          nyc
                    5   $982.62          nyc
                    3   $527.63          sfo
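
The consolidated table above can be sketched in a few lines of Python. This is only an illustration of the target shape, not how Tungsten Replicator builds it: the `shards` dictionary and the `consolidate` function are hypothetical stand-ins for the per-branch tables on the previous slide.

```python
from decimal import Decimal

# Hypothetical in-memory copies of the per-branch account tables.
shards = {
    "nyc": [(1, "234.78"), (2, "892.24"), (3, "1023.76"), (4, "521.08"), (5, "982.62")],
    "sfo": [(1, "820.20"), (2, "240.27"), (3, "527.63")],
}


def consolidate(shards):
    """Return (id, balance, schema) rows covering every shard."""
    rows = []
    for schema, accounts in shards.items():
        for account_id, balance in accounts:
            # The schema name is added as an extra column so rows stay distinguishable.
            rows.append((account_id, Decimal(balance), schema))
    return rows


consolidated = consolidate(shards)
total = sum(balance for _, balance, _ in consolidated)
```

Note that `id` alone is no longer unique in the consolidated table; the pair (`schema`, `id`) is what identifies a row.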




What can we do about it?

              • Make the application do it
              • Run a batch process to dump and load
                   data

              • Replicate into a central schema with
                   Tungsten Replicator




Tungsten Replicator

              • GPL v2
              • Global Transaction IDs
              • Multi-Master Replication
              • Parallel Replication
              • Heterogeneous Replication
              • Supports MySQL 5.0 and up


Tungsten Replicator

              [Diagram: six replication topologies: Master-Slave,
               Heterogeneous, Direct, Fan-In, All-Masters, Star]




Replication Services

              [Diagram: a pipeline of three stages, each running
               Extract, Filter, Apply:
               Master DBMS -> Stage -> Transaction History Log ->
               Stage -> In-Memory Queue -> Stage -> Slave DBMS]




Transaction History Log

              • Log file layout
                   •   Header Record (version & first seqno)

                   •   Event Record

                   •   Event Record

                   •   ...

                   •   Event Record

                   •   Log Rotation Record (name of next file)

              • Event Record layout
                   •   Event Header (Java Primitive Types): Seqno,
                       Fragno, Last_frag, Epoch_number, Source_id,
                       Event_id, Shard_id, Tstamp, Data_length

                   •   Serialized Event (Google Protobufs, Up to 2Gb)

                   •   CRC (Java Primitive Types)



Filters

              • Modify THL events during replication
              • Can be written in Java or JavaScript
              • JavaScript is compiled at runtime
              • Drop all or part of a THL event
              • Modify the contents of a THL event
              • Insert new statements or rows into a THL
                   event

              • Use your imagination
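
The filter idea above (drop, modify, or extend an event as it passes through a stage) can be sketched as a chain of functions. This is a simplified model in Python, not the Tungsten filter API: the event shape (a dict with a `changes` list) and the names `drop_statements` and `run_filters` are assumptions made for illustration.

```python
from typing import Callable, List, Optional

# Hypothetical event shape: {"seqno": int, "changes": [{"type": "statement" | "row", ...}]}
Event = dict
Filter = Callable[[Event], Optional[Event]]


def drop_statements(event: Event) -> Optional[Event]:
    """Remove statement changes; drop the whole event if nothing is left."""
    rows = [change for change in event["changes"] if change["type"] == "row"]
    if not rows:
        return None
    return {**event, "changes": rows}


def run_filters(event: Event, filters: List[Filter]) -> Optional[Event]:
    """Pass the event through each filter in order; None means the event was dropped."""
    current: Optional[Event] = event
    for apply_filter in filters:
        current = apply_filter(current)
        if current is None:
            return None
    return current
```

A filter that returns `None` ends the chain, which mirrors the "drop all or part of a THL event" bullet; returning a changed copy covers the "modify" and "insert" cases.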
Putting It All Together

              • Filtering allows us to modify ROW
                   replication events before applying them
                   •   dropstatementdata

                   •   replicate

                   •   replicatecolumns

                   •   buildindextable

              • Apply to the master server with log slave
                   updates

              • Integrate with other servers to ensure
Drop Statement Data Filter Definition
  # Remove statement events and drop the event if the result is empty

  replicator.filter.dropstatementdata=com.continuent.tungsten.replicator.filter.JavaScriptFilter
  replicator.filter.dropstatementdata.script=${replicator.home.dir}/samples/extensions/javascript/dropstatementdata.js




Replicate Filter Definition
  # Filter to forward or ignore particular schemas and/or databases. Entries
  # are comma-separated lists of the form schema[.table] where the table is
  # optional. List entries may use * and ? as wild cards. When both
  # filter lists are empty updates on all tables are allowed.

  replicator.filter.replicate=com.continuent.tungsten.replicator.filter.ReplicateFilter
  replicator.filter.replicate.ignore=
  replicator.filter.replicate.do=*.account
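
The matching rule described in the comment above (comma-separated `schema[.table]` entries with `*` and `?` wildcards, and everything allowed when both lists are empty) can be sketched with Python's `fnmatch` module. This is not the ReplicateFilter source: the function names are hypothetical, and the precedence used here (an `ignore` match wins over a `do` match) is an assumption the slides do not confirm. `fnmatchcase` also accepts `[seq]` patterns, which the filter may not.

```python
from fnmatch import fnmatchcase


def matches(entry: str, schema: str, table: str) -> bool:
    """Match one schema[.table] entry; a missing table part matches any table."""
    if "." in entry:
        schema_pattern, table_pattern = entry.split(".", 1)
    else:
        schema_pattern, table_pattern = entry, "*"
    return fnmatchcase(schema, schema_pattern) and fnmatchcase(table, table_pattern)


def should_replicate(schema: str, table: str, do: str = "", ignore: str = "") -> bool:
    """Decide whether a change to schema.table is forwarded."""
    do_list = [entry.strip() for entry in do.split(",") if entry.strip()]
    ignore_list = [entry.strip() for entry in ignore.split(",") if entry.strip()]
    # Assumption: an ignore match takes precedence over a do match.
    if any(matches(entry, schema, table) for entry in ignore_list):
        return False
    if do_list:
        return any(matches(entry, schema, table) for entry in do_list)
    # Both lists empty (or only ignore set and not matched): allow the update.
    return True
```

With `do="*.account"` as in the definition above, only the `account` table of each schema would pass in this model.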




ReplicateColumns Filter Definition
  # Join rows from one table in many schemas, to a table in a single schema

  replicator.filter.replicatecolumns=com.continuent.tungsten.replicator.filter.ReplicateColumnsFilter
  replicator.filter.replicatecolumns.do=
  replicator.filter.replicatecolumns.ignore=account.filler




ReplicateColumns : Before
  - SQL(27) =
   - ACTION = INSERT
   - SCHEMA = e3
   - TABLE = account
   - ROW# = 0
    - COL(1: account_id) = 27
    - COL(2: branch_id) = 0
    - COL(3: account_balance) = 0
    - COL(4: filler) = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    - COL(5: time_stamp) = 2012-09-28 11:50:36.0



ReplicateColumns : After
  - SQL(27) =
   - ACTION = INSERT
   - SCHEMA = e3
   - TABLE = account
   - ROW# = 0
    - COL(1: account_id) = 27
    - COL(2: branch_id) = 0
    - COL(3: account_balance) = 0
    - COL(5: time_stamp) = 2012-09-28 11:50:36.0
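
The before/after dumps can be reproduced with a small sketch. It assumes, hypothetically, that an `ignore` entry such as `account.filler` means table `account`, column `filler`, and it models a row change as a dict with a list of `(index, name, value)` tuples; neither is taken from the ReplicateColumnsFilter source.

```python
def drop_ignored_columns(row_change: dict, ignore: str) -> dict:
    """Return a copy of the row change without the ignored table.column entries."""
    ignored = {entry.strip() for entry in ignore.split(",") if entry.strip()}
    table = row_change["table"]
    kept = [
        (index, name, value)
        for index, name, value in row_change["columns"]
        if f"{table}.{name}" not in ignored
    ]
    return {**row_change, "columns": kept}


before = {
    "action": "INSERT",
    "schema": "e3",
    "table": "account",
    "columns": [
        (1, "account_id", 27),
        (2, "branch_id", 0),
        (3, "account_balance", 0),
        (4, "filler", "X" * 100),
        (5, "time_stamp", "2012-09-28 11:50:36.0"),
    ],
}
after = drop_ignored_columns(before, "account.filler")
```

As in the dump above, the remaining columns keep their original indexes, so `time_stamp` is still column 5 after `filler` is removed.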




Index Filter Definition
  # Join rows from one table in many schemas, to a table in a single schema

  replicator.filter.buildindextable=com.continuent.tungsten.replicator.filter.BuildIndexTable
  replicator.filter.buildindextable.target_schema_name=bi




Index Filter : Before
  - SQL(27) =
   - ACTION = INSERT
   - SCHEMA = e3
   - TABLE = account
   - ROW# = 0
    - COL(1: account_id) = 27
    - COL(2: branch_id) = 0
    - COL(3: account_balance) = 0
    - COL(5: time_stamp) = 2012-09-28 11:50:36.0




Index Filter : After
  - SQL(27) =
   - ACTION = INSERT
   - SCHEMA = test
   - TABLE = account
   - ROW# = 0
    - COL(0: account_id) = 27
    - COL(0: branch_id) = 0
    - COL(0: account_balance) = 0
    - COL(0: time_stamp) = 2012-09-28 11:50:36.0
    - COL(0: schema) = e3
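
The transformation shown in the two dumps (retarget the row to the central schema and append the source schema as an extra column) can be sketched as follows. The function name and the dict-based row model are hypothetical, and this is not the BuildIndexTable source; the target schema is simply whatever is passed in.

```python
def build_index_row(row_change: dict, target_schema_name: str) -> dict:
    """Retarget a row change and record its source schema in a 'schema' column."""
    columns = list(row_change["columns"]) + [("schema", row_change["schema"])]
    return {**row_change, "schema": target_schema_name, "columns": columns}


source_row = {
    "action": "INSERT",
    "schema": "e3",
    "table": "account",
    "columns": [
        ("account_id", 27),
        ("branch_id", 0),
        ("account_balance", 0),
        ("time_stamp", "2012-09-28 11:50:36.0"),
    ],
}
index_row = build_index_row(source_row, "bi")
```

This is why the requirements call for unique schema names and a target table with an additional `schema` column: the appended value is the only thing that distinguishes rows from different shards.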




Requirements

              • Standard Tungsten Requirements
                   •   Java

                   •   Ruby

              • Row Replication
              • Unique schema names
              • Target schema with an additional
                   ‘schema’ column




Installation - Master

              • Normal




Installation - Slave

              • Some hacks to write back to the master
              • Some filters to make the changes




Demo




Replicating the index table

              • Set up a normal master-slave pair
              • Could be a Tungsten cluster
              • Set up the indexing replicator to apply
                   events to the master or through the
                   connector




Provisioning

              • Not as simple as restoring a backup
              • New loader command to initialize the
                   master THL or slave database

              • Creates a set of row inserts for each table
              • Supports extracting from MySQL
              • Uses FLUSH TABLES WITH READ LOCK
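
The "set of row inserts for each table" idea can be sketched as a generator that turns a consistent snapshot into one INSERT row event per existing row. The snapshot structure and the event fields below are assumptions for illustration; the real loader reads from MySQL under a read lock and its output format is not shown on the slides.

```python
def snapshot_to_inserts(snapshot: dict):
    """Yield one INSERT row event per row, table by table.

    snapshot maps (schema, table) to a list of row tuples (hypothetical shape).
    """
    for (schema, table), rows in snapshot.items():
        for row_number, values in enumerate(rows):
            yield {
                "action": "INSERT",
                "schema": schema,
                "table": table,
                "row": row_number,
                "values": values,
            }


snapshot = {
    ("e1", "account"): [(1, 0, 0), (2, 0, 0)],
    ("e2", "account"): [(1, 0, 0)],
}
events = list(snapshot_to_inserts(snapshot))
```

Because the result is ordinary row events, it can flow through the same filters as live replication traffic.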


Provisioning Master THL
  $ /opt/continuent/tungsten/tungsten-replicator/bin/loader \
  -extractor \
  com.continuent.tungsten.replicator.loader.MySQLLoader \
  -extractor.uri "mysql://mdb-1.local:3306/" \
  -extractor.user tungsten \
  -extractor.password secret \
  -extractor.includeSchemas e1,e2,e3,e4,e5




Provisioning Slave Database
  $ /opt/indexer/tungsten/tungsten-replicator/bin/loader \
  -extractor \
  com.continuent.tungsten.replicator.loader.MySQLLoader \
  -extractor.uri "mysql://mdb-2.local:3306/" \
  -extractor.user tungsten \
  -extractor.password secret \
  -extractor.includeSchemas e1,e2,e3,e4,e5 \
  -extractor.tungstenServiceSchema tungsten_globalidx




Demo




Next Steps

              • Loading big data
              • Replicate from many masters
              • Support non-row statements
                   •   Drop schema

                   •   Drop table

                   •   Truncate table

              • Expand the provisioning support to
                   extract from Oracle

              • https://docs.continuent.com/wiki/x/uIAz
We’re Hiring

              • Cluster Implementation Engineer
              • Quality Assurance Engineer
              • Technical Writer




Jeff Mace
       jeff.mace@continuent.com
       sales@continuent.com
       560 S. Winchester Blvd. Suite 500
       San Jose, CA 95128
       Tel (866) 998-3642
       Fax (408) 668-1009


                              http://www.continuent.com
                   http://code.google.com/p/tungsten-replicator


