Get More Out of
MySQL with TokuDB
Tim Callaghan
VP/Engineering, Tokutek
tim@tokutek.com
@tmcallaghan
Tokutek: Database Performance Engines
What is Tokutek?
Tokutek® offers high performance and scalability for MySQL,
MariaDB and MongoDB. Our easy-to-use open source solutions
are compatible with your existing code and application
infrastructure.
Tokutek Performance Engines Remove Limitations
• Improve insertion performance by 20X
• Reduce HDD and flash storage requirements up to 90%
• No need to rewrite code
Tokutek Mission:
Empower your database to handle the Big Data requirements of
today’s applications
A Global Customer Base
Housekeeping
• This presentation will be available for replay
following the event
• We welcome your questions; please use the console
on the right of your screen and we will answer
following the presentation
• A copy of the presentation is available upon request
Agenda
Let's answer the following questions: “How can you…?”
• Easily install and configure TokuDB.
• Dramatically increase performance without rewriting
code.
• Reduce the total cost of your servers and storage.
• Simply perform online schema changes.
• Avoid becoming the support staff for your
application.
• And Q+A
How easy is it to install and configure
TokuDB for
MySQL or MariaDB?
What is TokuDB?
• TokuDB = MySQL* Storage Engine + Patches**
– * MySQL, MariaDB, Percona Server
– ** Patches are required for full functionality
– TokuDB is more than a plugin
• Transactional, ACID + MVCC
– Like InnoDB
• Drop-in replacement for MySQL
• Open Source
– http://github.com/Tokutek/ft-engine
Where can I get TokuDB?
• Tokutek offers MySQL 5.5 and MariaDB 5.5 builds
– www.tokutek.com
• MariaDB 5.5 and 10
– www.mariadb.org
– Also in MariaDB 5.5 from various package repositories
• Experimental Percona Server 5.6 builds
– www.percona.com
Is it truly a “drop in replacement”?
• No Foreign Key support
– you’ll need to drop them
• No Windows or OSX binaries
– Virtual machines are helpful in evaluations
• No 32-bit builds
• Otherwise, yes
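Because foreign keys have to be dropped before converting, it helps to list them first. A minimal sketch, assuming a schema called mydb, a table called orders, and a constraint called fk_orders_customer (all three names are hypothetical):

```sql
-- List the foreign key constraints defined in the (hypothetical) schema mydb.
SELECT constraint_name, table_name, referenced_table_name
FROM information_schema.referential_constraints
WHERE constraint_schema = 'mydb';

-- Drop one of them before converting its table to TokuDB.
-- fk_orders_customer and orders are placeholder names.
ALTER TABLE orders DROP FOREIGN KEY fk_orders_customer;
```

Dropping the constraint removes only the referential check; any integrity rules the application relied on would then need to be enforced elsewhere.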
How do I get started?
• Start Fresh
– create table <table> engine=tokudb;
– mysqldump / load data infile
• Use your existing MySQL data folder
– alter table <table-to-convert> engine=tokudb;
• Measure the differences
– compression : load/convert your tables
– performance : run your workload
– online schema changes : add a column
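The conversion and a first size comparison might look like the sketch below. The schema mydb and table t1 are hypothetical, and the sizes reported by information_schema are the server's estimates, which may not reflect the compressed on-disk footprint; comparing file sizes in the data directory is a useful cross-check.

```sql
-- Record the engine and reported sizes before converting.
SELECT table_name, engine, table_rows, data_length, index_length
FROM information_schema.tables
WHERE table_schema = 'mydb' AND table_name = 't1';

-- Convert the existing table in place.
ALTER TABLE t1 ENGINE=TokuDB;

-- Run the same query again and compare.
SELECT table_name, engine, table_rows, data_length, index_length
FROM information_schema.tables
WHERE table_schema = 'mydb' AND table_name = 't1';
```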
Before you dive in – check your my.cnf
• TokuDB uses sensible server parameter defaults, but
• Be mindful of your memory
– Reduce innodb_buffer_pool_size (InnoDB) and
key_buffer_size (MyISAM)
– Especially if converting tables
– tokudb_cache_size=?G
– Defaults to 50% of RAM; I recommend 80%
– tokudb_directio=1
• Leave everything else alone
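After editing my.cnf and restarting the server, the settings above can be checked from any client session. A small sketch; key_buffer_size is the standard MySQL variable for the MyISAM key cache, and all values are reported in bytes:

```sql
-- Confirm what the running server actually picked up.
SHOW VARIABLES LIKE 'tokudb_cache_size';
SHOW VARIABLES LIKE 'tokudb_directio';
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'key_buffer_size';
```

The point of the check is that the caches for all engines together should still fit in physical memory.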
How can I dramatically increase
performance without having to rewrite
code?
Where does the performance come from?
• Tokutek’s Fractal Tree® indexes
– Much faster than B-trees in > RAM workloads (data larger than memory)
– InnoDB and MyISAM use B-trees
– Significant IO reduction
– Messages defer IO on add/update/delete
– All reads and writes are compressed
– Enables users to add more indexes
– Queries go faster
• Lots of good webinar content on our website
– www.tokutek.com/resources/webinars
How much can I reduce my IO?
[Chart; annotation: “Converted from InnoDB to TokuDB”]
How fast can I insert data into TokuDB?
• InnoDB’s B-trees
– Fast until the index no longer fits in RAM
• TokuDB’s Fractal Tree indexes
– Start fast, stay fast!
• iiBench benchmark
– Insert 1 billion rows
– 1000 inserts per batch
– Auto-increment PK
– 3 secondary indexes
[Chart: How fast can I insert data into TokuDB? (iiBench results)]
How fast are mixed workloads?
• Fast, since > RAM mixed workloads generally contain…
– Index maintenance (insert, update, delete)
– Fractal Tree indexes FTW!
– Queries
– TokuDB enables richer indexing (more indexes)
• Sysbench benchmark
– 16 tables, 50 million rows per table
– Each Sysbench transaction contains
– 1 of each query : point, range, aggregation
– indexed update, unindexed update, delete, insert
[Chart: How fast are mixed workloads? (Sysbench results)]
How do secondary indexes work?
• InnoDB and TokuDB “cluster” the primary key index
– The key (PK) and all other columns are co-located in
memory and on disk
• Secondary indexes co-locate the “index key” and PK
– When a candidate row is found a second lookup
occurs into the PK index
– This means an additional IO is required
– MySQL’s “hidden join”
What is a clustered secondary index?
• “Covering” indexes remove this second lookup, but
require putting the right columns into the index
– create index idx_1 on t1 (c1, c2, c3, c4, c5, c6);
– If c1/c2 are queried, only c3/c4/c5/c6 are covered
– No additional IO, but c7 isn’t covered
• TokuDB supports clustered secondary indexes
– create clustering index idx_1 on t1 (c1, c2);
– All columns in t1 are covered, forever
– Even if new columns are added to the table
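To make the difference concrete, here is a sketch on a hypothetical table t1 with columns c1 through c7. Whether the optimizer actually chooses idx_1 depends on the data and statistics, so the EXPLAIN is there to verify rather than to promise a plan.

```sql
-- Hypothetical table; only the column names come from the slide.
CREATE TABLE t1 (
  pk BIGINT NOT NULL PRIMARY KEY,
  c1 INT, c2 INT, c3 INT, c4 INT, c5 INT, c6 INT, c7 INT
) ENGINE=TokuDB;

-- Clustering secondary index: all columns of t1 are stored with each index entry.
CREATE CLUSTERING INDEX idx_1 ON t1 (c1, c2);

-- A range scan on c1 that also needs c7. With a covering index on (c1..c6),
-- c7 would force a second lookup into the PK index; with the clustering
-- index it can be read directly from idx_1.
SELECT c3, c7 FROM t1 WHERE c1 BETWEEN 100 AND 200;

-- Check which index the optimizer picks.
EXPLAIN SELECT c3, c7 FROM t1 WHERE c1 BETWEEN 100 AND 200;
```

The trade-off is storage: a clustering index holds a full copy of each row, which is where TokuDB's compression helps.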
What are clustered secondary indexes good at?
• Two words, “RANGE SCANS”
• Several rows (maybe thousands) are scanned without
requiring additional lookups on the PK index
• Also, TokuDB blocks are much larger than InnoDB
– TokuDB = 4MB blocks = sequential IO
– InnoDB = 16KB blocks = random IO
• Can be orders of magnitude faster for range queries
Can SQL be optimized?
• Fractal Tree indexes support message injection
– The actual work (and IO) can be deferred
• Example:
– update t1 set k = k + 1 where pk = 5;
– InnoDB follows read-modify-write pattern
– If field “k” is not indexed, TokuDB avoids IO entirely
– An “increment” message is injected
• Current optimizations
– “replace into”, “insert ignore”, “update”, “insert on
duplicate key update”
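For reference, these are the statement shapes the optimizations apply to, sketched on a hypothetical two-column table. Whether a given statement really avoids the read depends on the TokuDB version, the indexes involved, and server settings, so treat this as an illustration of the syntax rather than a guarantee.

```sql
-- Hypothetical table: k is deliberately left unindexed.
CREATE TABLE t1 (
  pk BIGINT NOT NULL PRIMARY KEY,
  k  BIGINT NOT NULL DEFAULT 0
) ENGINE=TokuDB;

-- Increment in place; the client never needs the old value.
UPDATE t1 SET k = k + 1 WHERE pk = 5;

-- Upsert: insert the row, or increment k if pk = 5 already exists.
INSERT INTO t1 (pk, k) VALUES (5, 1)
  ON DUPLICATE KEY UPDATE k = k + 1;

-- Overwrite whatever row currently has pk = 5.
REPLACE INTO t1 (pk, k) VALUES (5, 42);

-- Insert only if pk = 6 is not present; otherwise do nothing.
INSERT IGNORE INTO t1 (pk, k) VALUES (6, 0);
```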
How can I reduce the total cost of my
servers and storage?
How can I use less storage?
• Compression, compression, compression!
• All IO in TokuDB is compressed
– Reads and writes
– Usually ~5x compression (but I’ve seen 25x or more)
• TokuDB [currently] supports 3 compression algorithms
– lzma = highest compression (and high CPU)
– zlib = high compression (and much less CPU)
– quicklz = medium compression (even less CPU)
– pluggable architecture, lz4 and snappy “in the lab”
But doesn’t InnoDB support compression?
• Yes, but the compression achieved is far lower
– InnoDB compresses 16K blocks, TokuDB is 64K or 128K
– InnoDB requires fixed on-disk size, TokuDB is flexible
*log style data
But doesn’t InnoDB support compression?
• And InnoDB performance is severely impacted by it
– Compression “misses” are costly
*iiBench workload
How do I compress my data in TokuDB?
create table t1 (c1 bigint not null primary key)
engine=tokudb
row_format=tokudb_zlib; -- or tokudb_lzma, tokudb_quicklz
NOTE: Compression is not optional in TokuDB; we use
compression to provide performance advantages as well as save
space.
How can I perform online schema
changes?
What is an “online” schema change?
My definition
“An online schema change is the ability to add or drop a column
on an existing table without blocking further changes to the
table or requiring substantial server resources (CPU, RAM, IO,
disk) to accomplish the operation.”
P.S., I’d like for it to be instantaneous!
What do blocking schema changes look like?
How have online schema changes evolved?
• MySQL 5.5
– Table is read-only while entire table is re-created
• “Manual” process
– Take slave offline, apply to slave, catch up to master,
switch places, repeat
• MySQL 5.6 (and ~ Percona’s pt-online-schema-change tool)
– Table is rebuilt “in the background”
– Changes are captured, and replayed on new table
– Uses significant RAM, CPU, IO, and disk space
• TokuDB
– alter table t1 add column new_column bigint;
– Done!
What online schema changes can TokuDB handle?
• Add column
• Drop column
• Expand column
– integer types
– varchar, char, varbinary
• Index creation
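One statement per item in the list above, as a sketch on a hypothetical table. The assumption here is that each ALTER changes a single thing; whether a particular change runs online may also depend on the TokuDB version and on details such as whether the column is part of an index, so it is worth trying on a copy first.

```sql
-- Hypothetical table for trying out each change.
CREATE TABLE t1 (
  id   BIGINT NOT NULL PRIMARY KEY,
  qty  INT,
  name VARCHAR(20)
) ENGINE=TokuDB;

ALTER TABLE t1 ADD COLUMN new_column BIGINT;   -- add column
ALTER TABLE t1 DROP COLUMN new_column;         -- drop column
ALTER TABLE t1 MODIFY qty BIGINT;              -- expand an integer type
ALTER TABLE t1 MODIFY name VARCHAR(40);        -- expand a varchar
CREATE INDEX idx_name ON t1 (name);            -- index creation
```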
How can I avoid becoming the support
staff for my application?
Where can I get TokuDB support?
TokuDB is offered in 2 editions
• Community
– Community support (Google Groups “tokudb-user”)
• Enterprise subscription
– Commercial support
– Wouldn’t you rather be developing another application?
– Extra features
– Hot backup, more on the way
– Access to TokuDB experts
– Input to the product roadmap
Any Questions?
Download TokuDB at www.tokutek.com/products/downloads
Register for product updates, access to premium content, and
invitations at www.tokutek.com
Join the Conversation

Potential of AI (Generative AI) in Business: Learnings and Insights
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 

Get More Out of MySQL with TokuDB

  • 1. Get More Out of MySQL with TokuDB Tim Callaghan VP/Engineering, Tokutek tim@tokutek.com @tmcallaghan
  • 2. Tokutek: Database Performance Engines What is Tokutek? Tokutek® offers high performance and scalability for MySQL, MariaDB and MongoDB. Our easy-to-use open source solutions are compatible with your existing code and application infrastructure. Tokutek Performance Engines Remove Limitations • Improve insertion performance by 20X • Reduce HDD and flash storage requirements up to 90% • No need to rewrite code Tokutek Mission: Empower your database to handle the Big Data requirements of today’s applications
  • 4. Housekeeping • This presentation will be available for replay following the event • We welcome your questions; please use the console on the right of your screen and we will answer following the presentation • A copy of the presentation is available upon request
  • 5. Agenda Let’s answer the following questions, “How can you…?” • Easily install and configure TokuDB. • Dramatically increase performance without rewriting code. • Reduce the total cost of your servers and storage. • Simply perform online schema changes. • Avoid becoming the support staff for your application. • And Q+A
  • 6. How easy is it to install and configure TokuDB for MySQL or MariaDB?
  • 7. What is TokuDB? • TokuDB = MySQL* Storage Engine + Patches** – * MySQL, MariaDB, Percona Server – ** Patches are required for full functionality – TokuDB is more than a plugin • Transactional, ACID + MVCC – Like InnoDB • Drop-in replacement for MySQL • Open Source – http://github.com/Tokutek/ft-engine
  • 8. Where can I get TokuDB? • Tokutek offers MySQL 5.5 and MariaDB 5.5 builds – www.tokutek.com • MariaDB 5.5 and 10 – www.mariadb.org – Also in MariaDB 5.5 from various package repositories • Experimental Percona Server 5.6 builds – www.percona.com
  • 9. Is it truly a “drop in replacement”? • No Foreign Key support – you’ll need to drop them • No Windows or OSX binaries – Virtual machines are helpful in evaluations • No 32-bit builds • Otherwise, yes
  • 10. How do I get started? • Start Fresh – create table <table> engine=tokudb; – mysqldump / load data infile • Use your existing MySQL data folder – alter table <table-to-convert> engine=tokudb; • Measure the differences – compression : load/convert your tables – performance : run your workload – online schema changes : add a column
  • 11. Before you dive in – check your my.cnf • TokuDB uses sensible server parameter defaults, but • Be mindful of your memory – Reduce innodb_buffer_pool_size (InnoDB) and key_buffer_size (MyISAM) – Especially if converting tables – tokudb_cache_size=?G – Defaults to 50% of RAM, I recommend 80% – tokudb_directio=1 • Leave everything else alone
  • 12. How can I dramatically increase performance without having to rewrite code?
  • 13. Where does the performance come from? • Tokutek’s Fractal Tree® indexes – Much faster than B-trees in > RAM workloads – InnoDB and MyISAM use B-trees – Significant IO reduction – Messages defer IO on add/update/delete – All reads and writes are compressed – Enables users to add more indexes – Queries go faster • Lots of good webinar content on our website – www.tokutek.com/resources/webinars
  • 14. How much can I reduce my IO? Converted from InnoDB to TokuDB
  • 15. How fast can I insert data into TokuDB? • InnoDB’s B-trees – Fast until the index no longer fits in RAM • TokuDB’s Fractal Tree indexes – Start fast, stay fast! • iiBench benchmark – Insert 1 billion rows – 1000 inserts per batch – Auto-increment PK – 3 secondary indexes
  • 16. How fast can I insert data into TokuDB?
  • 17. How fast are mixed workloads? • Fast, since > RAM mixed workloads generally contain… – Index maintenance (insert, update, delete) – Fractal Tree indexes FTW! – Queries – TokuDB enables richer indexing (more indexes) • Sysbench benchmark – 16 tables, 50 million rows per table – Each Sysbench transaction contains – 1 of each query : point, range, aggregation – indexed update, unindexed update, delete, insert
  • 18. How fast are mixed workloads?
  • 19. How do secondary indexes work? • InnoDB and TokuDB “cluster” the primary key index – The key (PK) and all other columns are co-located in memory and on disk • Secondary indexes co-locate the “index key” and PK – When a candidate row is found a second lookup occurs into the PK index – This means an additional IO is required – MySQL’s “hidden join”
  • 20. What is a clustered secondary index? • “Covering” indexes remove this second lookup, but require putting the right columns into the index – create index idx_1 on t1 (c1, c2, c3, c4, c5, c6); – If c1/c2 are queried, only c3/c4/c5/c6 are covered – No additional IO, but c7 isn’t covered • TokuDB supports clustered secondary indexes – create clustering index idx_1 on t1 (c1, c2); – All columns in t1 are covered, forever – Even if new columns are added to the table
  • 21. What are clustered secondary indexes good at? • Two words, “RANGE SCANS” • Several rows (maybe thousands) are scanned without requiring additional lookups on the PK index • Also, TokuDB blocks are much larger than InnoDB – TokuDB = 4MB blocks = sequential IO – InnoDB = 16KB blocks = random IO • Can be orders of magnitude faster for range queries
  • 22. Can SQL be optimized? • Fractal Tree indexes support message injection – The actual work (and IO) can be deferred • Example: – update t1 set k = k + 1 where pk = 5; – InnoDB follows read-modify-write pattern – If field “k” is not indexed, TokuDB avoids IO entirely – An “increment” message is injected • Current optimizations – “replace into”, “insert ignore”, “update”, “insert on duplicate key update”
  • 23. How can I reduce the total cost of my servers and storage?
  • 24. How can I use less storage? • Compression, compression, compression! • All IO in TokuDB is compressed – Reads and writes – Usually ~5x compression (but I’ve seen 25x or more) • TokuDB [currently] supports 3 compression algorithms – lzma = highest compression (and high CPU) – zlib = high compression (and much less CPU) – quicklz = medium compression (even less CPU) – pluggable architecture, lz4 and snappy “in the lab”
  • 25. But doesn’t InnoDB support compression? • Yes, but the compression achieved is far lower – InnoDB compresses 16K blocks, TokuDB is 64K or 128K – InnoDB requires fixed on-disk size, TokuDB is flexible *log style data
  • 26. But doesn’t InnoDB support compression? • And InnoDB performance is severely impacted by it – Compression “misses” are costly *iiBench workload
  • 27. How do I compress my data in TokuDB? create table t1 (c1 bigint not null primary key) engine=tokudb row_format=[tokudb_lzma | tokudb_zlib | tokudb_quicklz]; NOTE: Compression is not optional in TokuDB, we use compression to provide performance advantages as well as save space.
  • 28. How can I perform online schema changes?
  • 29. What is an “online” schema change? My definition “An online schema change is the ability to add or drop a column on an existing table without blocking further changes to the table or requiring substantial server resources (CPU, RAM, IO, disk) to accomplish the operation.” P.S., I’d like for it to be instantaneous!
  • 30. What do blocking schema changes look like?
  • 31. How have online schema changes evolved? • MySQL 5.5 – Table is read-only while entire table is re-created • “Manual” process – Take slave offline, apply to slave, catch up to master, switch places, repeat • MySQL 5.6 (and ~ Percona’s pt-online-schema-change tool) – Table is rebuilt “in the background” – Changes are captured, and replayed on new table – Uses significant RAM, CPU, IO, and disk space • TokuDB – alter table t1 add column new_column bigint; – Done!
  • 32. What online schema changes can TokuDB handle? • Add column • Drop column • Expand column – integer types – varchar, char, varbinary • Index creation
  • 33. How can I avoid becoming the support staff for my application?
  • 34. Where can I get TokuDB support? TokuDB is offered in 2 editions • Community – Community support (Google Groups “tokudb-user”) • Enterprise subscription – Commercial support – Wouldn’t you rather be developing another application? – Extra features – Hot backup, more on the way – Access to TokuDB experts – Input to the product roadmap
  • 35. Tokutek: Database Performance Engines Any Questions? Download TokuDB at www.tokutek.com/products/downloads Register for product updates, access to premium content, and invitations at www.tokutek.com Join the Conversation
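
Slide 11's memory guidance can be turned into a small calculation. The sketch below is a hypothetical helper, not part of TokuDB: it takes the machine's RAM in gigabytes, applies the 80% figure recommended on the slide, and renders the two my.cnf settings the slide names. The function names and the choice to round down to whole gigabytes are assumptions made for illustration.

```python
def suggest_tokudb_cache_size_gb(total_ram_gb: float, fraction: float = 0.8) -> int:
    """Whole gigabytes to give tokudb_cache_size (rounded down, at least 1)."""
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return max(1, int(total_ram_gb * fraction))


def my_cnf_fragment(total_ram_gb: float) -> str:
    """Render the two settings slide 11 mentions as a [mysqld] fragment."""
    size_gb = suggest_tokudb_cache_size_gb(total_ram_gb)
    return f"[mysqld]\ntokudb_cache_size={size_gb}G\ntokudb_directio=1\n"


if __name__ == "__main__":
    # Example: a server with 64 GB of RAM.
    print(my_cnf_fragment(64))
```

Whatever value comes out, the slide's other advice still applies: shrink the InnoDB and MyISAM caches accordingly so the total stays within physical memory.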
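
The difference between a covering index and a clustering index (slides 19 to 21) comes down to which columns the index can supply without a second lookup into the PK index. A minimal sketch of that rule follows, assuming a table t1 with a primary key column named pk and columns c1 to c7; the function and the column list are hypothetical, and the model ignores everything else an optimizer considers.

```python
def is_covered(query_columns, index_columns, pk_columns, table_columns, clustering=False):
    """True if the index alone can supply every column the query touches."""
    if clustering:
        # A clustering index carries every column of the table.
        available = set(table_columns)
    else:
        # A plain secondary index carries its own key columns plus the PK.
        available = set(index_columns) | set(pk_columns)
    return set(query_columns) <= available


T1_COLUMNS = ["pk", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
PK = ["pk"]
COVERING_IDX = ["c1", "c2", "c3", "c4", "c5", "c6"]  # create index idx_1 on t1 (c1, ..., c6)
CLUSTERING_IDX = ["c1", "c2"]                        # create clustering index idx_1 on t1 (c1, c2)

if __name__ == "__main__":
    print(is_covered(["c1", "c2", "c5"], COVERING_IDX, PK, T1_COLUMNS))
    print(is_covered(["c1", "c7"], COVERING_IDX, PK, T1_COLUMNS))
    print(is_covered(["c1", "c7"], CLUSTERING_IDX, PK, T1_COLUMNS, clustering=True))
```

The second call is the case slide 20 warns about: c7 is outside the covering index, so the query falls back to the extra PK lookup, while the clustering index answers it directly.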
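
Slide 22 describes message injection: an update such as k = k + 1 can be recorded as an "increment" message instead of being applied through read-modify-write. The toy model below illustrates only that idea. It is not TokuDB's implementation; the class, its method names, and the read counter are all invented for the example, and a flat list stands in for the buffers a Fractal Tree index would use.

```python
class ToyMessageTable:
    """Toy table that can buffer increment messages instead of reading rows."""

    def __init__(self, rows):
        self._rows = dict(rows)  # stands in for stored rows: pk -> value of k
        self._pending = []       # buffered (pk, delta) messages, not yet applied
        self.row_reads = 0       # counts simulated row fetches

    def increment_read_modify_write(self, pk, delta=1):
        """Classic pattern: fetch the row, change it, write it back."""
        self.row_reads += 1
        self._rows[pk] = self._rows[pk] + delta

    def increment_by_message(self, pk, delta=1):
        """Deferred pattern: record the change without fetching the row."""
        self._pending.append((pk, delta))

    def read(self, pk):
        """Fetch a row, applying any buffered messages addressed to it first."""
        self.row_reads += 1
        remaining = []
        for msg_pk, delta in self._pending:
            if msg_pk == pk:
                self._rows[pk] += delta
            else:
                remaining.append((msg_pk, delta))
        self._pending = remaining
        return self._rows[pk]


if __name__ == "__main__":
    table = ToyMessageTable({5: 10})
    for _ in range(3):
        table.increment_by_message(5)  # update t1 set k = k + 1 where pk = 5
    print(table.row_reads, table.read(5), table.row_reads)
```

In the toy, three message-style updates cost no row fetches until something actually reads the row, whereas the read-modify-write path pays one fetch per update.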
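
To get a feel for the compression ratios discussed on slide 24, the sketch below compresses some sample bytes with Python's standard zlib and lzma modules and reports original size divided by compressed size. These modules are stand-ins, an assumption for illustration: TokuDB applies its own settings and block sizes, the ratios on real tables will differ, and quicklz has no standard-library module so it is left out. The log-like sample data is made up.

```python
import lzma
import zlib


def compression_ratios(data: bytes) -> dict:
    """Return original_size / compressed_size for zlib and lzma."""
    if not data:
        raise ValueError("data must not be empty")
    zlib_out = zlib.compress(data, 6)
    lzma_out = lzma.compress(data)
    # Round-trip check so the ratios describe lossless compression.
    assert zlib.decompress(zlib_out) == data
    assert lzma.decompress(lzma_out) == data
    return {
        "zlib": len(data) / len(zlib_out),
        "lzma": len(data) / len(lzma_out),
    }


if __name__ == "__main__":
    # Made-up, log-style rows; repetitive data like this compresses well.
    sample = b"".join(
        b"2014-01-%02d 12:00:00 INFO user=%d action=login status=ok\n" % (d % 28 + 1, d)
        for d in range(5000)
    )
    for name, ratio in compression_ratios(sample).items():
        print(f"{name}: {ratio:.1f}x")
```

Running the same function over a dump of a representative table is a cheap first estimate before converting it, in the spirit of slide 10's advice to measure the differences.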
