SlideShare uma empresa Scribd logo
1 de 55
Baixar para ler offline
Introduction to column stores
Justin Swanhart
Percona Live, April 2013
INTRODUCTION
2
Introduction
• Who am I?
• What do I do?
• Why am I here?
3
A quick survey
4
?How many people have heard the term row store?
A quick survey
5
?How many people have heard the term column store?
A quick survey
6
?How many people have heard the term grocery store?
(just kidding)
ROW STORES
7
What is a row store
• You've probably used one 
– MyISAM
– InnoDB
– Memory
– TokuDB
– Etc
8
What is a row store
• All MySQL storage engines are row stores
– The storage engine API is row oriented
– This means the API always stores and retrieves
entire rows from disk
9
Materialization
• Row stores live "in a material world"
– A row is a collection of column values (attributes)
– A row store stores all the values together on disk
• That is, rows are materialized on disk
10
Rows are stored in one file or segment on disk
11
id scientist death_by movie_name
1 Reinhardt Maximillian The Black Hole
2 Tyrell Roy Batty Blade Runner
3 Hammond Dinosaur Jurassic Park
4 Soong Lore Star Trek: TNG
5 Morbius His mind Forbidden Planet
6 Dyson Skynet Terminator 2: Judgment Day
All columns
are stored
together
on disk.
Best case scenario
12
id scientist death_by movie_name
1 Reinhardt Maximillian
2 Tyrell Roy Batty Blade Runner
3 Hammond Dinosaur Jurassic Park
4 Soong Lore Star Trek: TNG
5 Morbius Robby
6 Dyson Skynet Terminator 2: Judgment Day
select * from the_table where id = 6
• Performs best when a small number of rows are
accessed
BUT…
• Row stores not so great for wide rows
• It can be very bad if only a small subset of the
columns are actually used by most queries
–Reading the entire row wastes IO
13
Row Stores can waste IO
14
6 Dyson Skynet Terminator 2: Judgment Day
select movie_name from the_table where id = 6
Terminator 2: Judgment Day
1. Storage engine gets whole row*:
2. SQL interface extracts only requested portion
3. Results returned to client
* By primary key in this case
Worst case scenario
• select sum(bigint_column) from table
– 1000000 rows in table
– Average row length 1024 bytes
• The select reads one bigint column (8 bytes)
– But the entire row must be read to get one column
– Reads ~1GB of table data for ~8MB of column data
15
COLUMN STORES
16
Non-material world
• In a column stores rows are not materialized
during storage
– An entire row image exists only in memory
– Each row still has some sort of "row id"
17
What is a row?
• A row is a collection of column values that are
associated with one another
• Associated?
– Every row has some type of "row id"
• PK column or GEN_CLUST_INDEX on InnoDB
• Physical offset in a MyISAM table
• In some databases it is just a virtual identifier for a row
18
Column store stores each COLUMN on disk
19
Genre
Comedy
Comedy
Horror
Drama
Comedy
Drama
id
1
3
4
2
5
6
Title
Mrs. Doubtfire
The Big Lebowski
The Fly
Steel Magnolias
The Birdcage
Erin Brokovitch
Actor
Robin Williams
Jeff Bridges
Jeff Goldblum
Dolly Parton
Nathan Lane
Julia Roberts
row id = 1
row id = 6
Each column has a file or segment on diskNatural order may be
unusual
Column store benefit: Compression
• Almost all column stores perform compression
– Compression further reduces the storage footprint
of each column
– Column data type tailored compression
• RLE
• Integer packing
• Dictionary and lookup string compression
• Other (depends on column store)
20
Column store benefit: Compression
• Effective compression reduces storage cost
• IO reduction yields decreased response times
during queries as well
– Queries may execute an order of magnitude faster
compared to queries over the same data set on a
row store
• 10:1 to 30:1 compression rations may be seen
21
Best case scenario (IO savings)
• select sum(bigint_column) from table
– 1000000 rows in table
– Average row length 1024 bytes
• The select reads one bigint column (8 bytes)
– Only the single column is read from disk
– Reads ~8MB of column data instead of 1GB of
table data
– Could read even less with compression!
22
Less than ideal scenario*
select *
from long_wide_table
where order_line_id = 321837291;
23
All columns
Single row
* Possibly worst case. We'll see in the performance test as this varies by column store.
Column store downsides:
• Accessing all columns in a table doesn't save
anything
– It could even be more expensive than a row store
– Not ideal for tables with few columns
24
More downsides
• Updating and deleting rows is expensive
– Some column stores are append only
– Others strongly discourage writes
– Some column stores split storage into row and
column areas (in Vertica* this is called WOS/ROS)
25
*Not open source, based on C-Store an MIT project
OLTP VS OLAP
26
Analytics versus OLTP
• Row stores are for OLTP
– Reading small portions of a table, but often many
of the columns
– Frequent changes to data
– Small* amount of data (typically working set must
fit in ram)
– "Nested loops" joins are well optimized for OLTP
27
* < 2TB data sets in general
Analytics versus OLTP
• Column stores are designed for OLAP
– Read large portions of a table in terms of rows, but
often a small number of columns
– Batch loading / updates
– Big data*!
• Effective compression and reduced query IO make
column stores much more attractive for structured big
data
• Machine generated data is especially well suited
28
* 50TB – 100TB per machine is typically possible, before compression
Not all column stores are for big data!
• In-memory analytics is popular
– Some column stores are designed to be used as an
in-memory store
– It is possible to build very large SMP clusters which
scale these databases to many machines but this
outside of the scope of this talk
29
Indexing
• Row stores mainly use tree style indexes
– When you create an index on a row store you
usually create a "B" tree index
– These indexes leverage binary search
– Very fast as long as the index fits in memory
30
Tree indexes
• A read is necessary to satisfy a write
– Even with fast IO this is expensive
– This means that tree indexes work best for in-
memory data sets
– Very large data sets usually end up with indexes
which are unmanageably large
31
Use less trees: Bitmap indexing
• Some column stores support bitmap indexes
– Columnar storage lends very well to bitmaps
– Bitmap indexes are effective for Boolean searches
over multiple columns in a table
32
Use less trees: Bitmap indexing
• Unfortunately
– Bitmap indexes are not supported by any MySQL
column store
– Very expensive to update
33
OPEN SOURCE COLUMN STORES
34
Fastbit bitmap store
• Academic column store
– Simple text interface
– Allows bitmap indexes to be compared whilst still
compressed
– No joins
– Append-only* data store
35
* records can be marked as deleted with library calls, but not removed from disk
https://sdm.lbl.gov/fastbit/
MySQL column stores - ICE
• Infobright Community Edition (ICE)
– Some data type limitations
– Single threaded loader
– Single threaded query
– No temp tables, no RENAME table, no DDL
– LOAD DATA only (append only, no DML)
36
MySQL column stores – ICE [2]
• Data stored in "data packs"
• No indexes
– Knowledge grid stores statistics and information
about the data packs
• Decides which packs must be decompressed
• Can make very intelligent query decisions
• Hash join only
• No constraints
37
MySQL column stores – InfiniDB Community
• InfiniDB community edition
– Some data type limitations
– Single threaded query*
– Single threaded loader*
– No indexes
– Wrong results in testing
– Hash join only
38
* You are required to bind infinidb to only a single core
LucidDB
• Java based column store database
• SQL/MED data wrappers
• Built-in data versioning
• Commercial backing ended in October, 2012
– But source code still available under Apache
license
39
LucidDB [2]
• Bitmap and tree based indexes
– Support for foreign key, primary key and unique
constraints
• ANSI SQL support
• Multiple join algorithms
• Single-threaded queries
40
MonetDB
• Early (started circa 2004) column store
• Open source
• Designed for in-memory working sets
– Indexes must fit in memory
– Swap size must be large enough for working set if
working set exceeds memory
• Some wrong results during testing
41
MonetDB [2]
• Data is stored compressed on disk
• In memory structure is array-like
– Parallel query execution
– CPU cache optimizations
– multiple join algorithms
• Primary key, foreign key, unique and NOT NULL
constraints
42
PERFORMANCE COMPARISON
43
Loading Performance (InfiniDB omitted)
44
CUST DATE PART SUPP LINE ALL TABLES
MonetDB 0.83 0.26 1.980 0.23 125 128s
ICE 1.69 0.09 2.372 0.15 128 133s
LucidDB 2.49 0.31 5.915 0.96 297 307s
InnoDB 3.11 0.14 11.83 0.53 300 316s
Line
Cust
PartDate
Supp
Star Schema Benchmark
Scale Factor = 5
Lineorder rows = 29,999,795
Point Lookup Query
select *
from lineorder
where LO_OrderKey = 10000001
and LO_LineNumber = 3;
cold hot
MonetDB 0.081 0.001
ICE 0.29 0.07
LucidDB 2.4 0.9
InnoDB 0.01 0.001
45
Simple Aggregate Query
select sum(LO_OrdTotalPrice) from lineorder;
cold hot
ICE 0.02 0.001
MonetDB 0.2 0.07
LucidDB 5.9 3.9
InnoDB 34 9.3
46
Complex Query with no join
select LO_CustKey, sum(LO_OrdTotalPrice)
from lineorder
group by LO_CustKey
having sum(LO_OrdTotalPrice) >= 11349901235
order by sum(LO_OrdTotalPrice) desc;
cold hot
MonetDB 3.9 1.5
ICE 7.19 3.14
LucidDB 17.5 13.3
InnoDB 41.18 33.81
47
Complex Query with join, group by and filter
cold hot
ICE 6.9 1.83
LucidDB 17.3 15.3
InnoDB 58.23 51.62
MonetDB ERR ERR
48
Wrong results
select d_year, p_brand, sum(lo_revenue)
from lineorder
join dim_date on lo_orderdate = d_datekey
join part on lo_partkey = p_partkey
join supplier on lo_suppkey = s_suppkey
where p_category = 'MFGR#12'
and s_region = 'AMERICA'
group by d_year, p_brand
order by d_year, p_brand;
CLOSED SOURCE COLUMN STORES
49
Infobright Enterprise Edition – IEE
– Supports most MySQL SQL
– Parallel loader / binary loader
– Supports DML (insert, update, delete)
– Supports DDL (rename, truncate)
– Supports temp tables
– Parallel query
– Support
50
Vectorwise
• In memory SMP column store
• Vector execution architecture for parallel
pipelined MPP query execution
• Very fast
• Uses ScaleMP to scale to more than one
machine
51
Vertica
• An HP company
• Originally C-Store
• Write and read optimized storage
• “Projections” for pre-joining and sorting data
• Automatic consistent hashing data
distribution(K-safety level)
• Massively parallel processing
• Partitioning support
52
Links!
• MonetDB -
http://dev.monetdb.org/downloads/sources/Late
st/
• ICE - http://www.infobright.org
• InfiniDB - http://www.calpont.com
• LucidDB – http://luciddb.sourceforge.net
• Fastbit - https://sdm.lbl.gov/fastbit/
• Vectorwise – http://www.vectorwise.com
• Vertica – http://www.vertica.com
53
Percona Training Advantage
– Check out http://training.percona.com for a list of
training events near you
– Request training directly by Justin (me) or any of
our other expert trainers by contacting your
Percona sales rep today
54
55
Q/A

Mais conteúdo relacionado

Mais procurados

8. column oriented databases
8. column oriented databases8. column oriented databases
8. column oriented databasesFabio Fumarola
 
MySQL Performance Tips & Best Practices
MySQL Performance Tips & Best PracticesMySQL Performance Tips & Best Practices
MySQL Performance Tips & Best PracticesIsaac Mosquera
 
Tuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBaseTuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBaseAnil Gupta
 
Parquet and impala overview external
Parquet and impala overview externalParquet and impala overview external
Parquet and impala overview externalmattlieber
 
Sql server 2014 x velocity – updateable columnstore indexes
Sql server 2014 x velocity – updateable columnstore indexesSql server 2014 x velocity – updateable columnstore indexes
Sql server 2014 x velocity – updateable columnstore indexesPat Sheehan
 
Optimizing columnar stores
Optimizing columnar storesOptimizing columnar stores
Optimizing columnar storesIstvan Szukacs
 
Oracle Database In-Memory Option for ILOUG
Oracle Database In-Memory Option for ILOUGOracle Database In-Memory Option for ILOUG
Oracle Database In-Memory Option for ILOUGZohar Elkayam
 
How to Fine-Tune Performance Using Amazon Redshift
How to Fine-Tune Performance Using Amazon RedshiftHow to Fine-Tune Performance Using Amazon Redshift
How to Fine-Tune Performance Using Amazon RedshiftAWS Germany
 
Hive Data Modeling and Query Optimization
Hive Data Modeling and Query OptimizationHive Data Modeling and Query Optimization
Hive Data Modeling and Query OptimizationEyad Garelnabi
 
In memory databases presentation
In memory databases presentationIn memory databases presentation
In memory databases presentationMichael Keane
 
Cassandra in production
Cassandra in productionCassandra in production
Cassandra in productionvalstadsve
 
Columnstore indexes in sql server 2014
Columnstore indexes in sql server 2014Columnstore indexes in sql server 2014
Columnstore indexes in sql server 2014Antonios Chatzipavlis
 
HBase Advanced Schema Design - Berlin Buzzwords - June 2012
HBase Advanced Schema Design - Berlin Buzzwords - June 2012HBase Advanced Schema Design - Berlin Buzzwords - June 2012
HBase Advanced Schema Design - Berlin Buzzwords - June 2012larsgeorge
 
Cassandra from the trenches: migrating Netflix
Cassandra from the trenches: migrating NetflixCassandra from the trenches: migrating Netflix
Cassandra from the trenches: migrating NetflixJason Brown
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at AlibabaMichael Stack
 
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...Cloudera, Inc.
 

Mais procurados (20)

8. column oriented databases
8. column oriented databases8. column oriented databases
8. column oriented databases
 
MySQL Performance Tips & Best Practices
MySQL Performance Tips & Best PracticesMySQL Performance Tips & Best Practices
MySQL Performance Tips & Best Practices
 
Tuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBaseTuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBase
 
Parquet and impala overview external
Parquet and impala overview externalParquet and impala overview external
Parquet and impala overview external
 
Sql server 2014 x velocity – updateable columnstore indexes
Sql server 2014 x velocity – updateable columnstore indexesSql server 2014 x velocity – updateable columnstore indexes
Sql server 2014 x velocity – updateable columnstore indexes
 
Optimizing columnar stores
Optimizing columnar storesOptimizing columnar stores
Optimizing columnar stores
 
Oracle Database In-Memory Option for ILOUG
Oracle Database In-Memory Option for ILOUGOracle Database In-Memory Option for ILOUG
Oracle Database In-Memory Option for ILOUG
 
How to Fine-Tune Performance Using Amazon Redshift
How to Fine-Tune Performance Using Amazon RedshiftHow to Fine-Tune Performance Using Amazon Redshift
How to Fine-Tune Performance Using Amazon Redshift
 
Hive Data Modeling and Query Optimization
Hive Data Modeling and Query OptimizationHive Data Modeling and Query Optimization
Hive Data Modeling and Query Optimization
 
In memory databases presentation
In memory databases presentationIn memory databases presentation
In memory databases presentation
 
Cassandra in production
Cassandra in productionCassandra in production
Cassandra in production
 
Columnstore indexes in sql server 2014
Columnstore indexes in sql server 2014Columnstore indexes in sql server 2014
Columnstore indexes in sql server 2014
 
HBase Advanced Schema Design - Berlin Buzzwords - June 2012
HBase Advanced Schema Design - Berlin Buzzwords - June 2012HBase Advanced Schema Design - Berlin Buzzwords - June 2012
HBase Advanced Schema Design - Berlin Buzzwords - June 2012
 
My sql
My sqlMy sql
My sql
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Cassandra from the trenches: migrating Netflix
Cassandra from the trenches: migrating NetflixCassandra from the trenches: migrating Netflix
Cassandra from the trenches: migrating Netflix
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
 
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase - Aaron Kimb...
 

Semelhante a Intro to column stores

InnoDB architecture and performance optimization (Пётр Зайцев)
InnoDB architecture and performance optimization (Пётр Зайцев)InnoDB architecture and performance optimization (Пётр Зайцев)
InnoDB architecture and performance optimization (Пётр Зайцев)Ontico
 
SQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - AdvancedSQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - AdvancedTony Rogerson
 
Column Stores and Google BigQuery
Column Stores and Google BigQueryColumn Stores and Google BigQuery
Column Stores and Google BigQueryCsaba Toth
 
Full Table Scan: friend or foe
Full Table Scan: friend or foeFull Table Scan: friend or foe
Full Table Scan: friend or foeMauro Pagano
 
Cache Memory for Computer Architecture.ppt
Cache Memory for Computer Architecture.pptCache Memory for Computer Architecture.ppt
Cache Memory for Computer Architecture.pptrularofclash69
 
Innodb 和 XtraDB 结构和性能优化
Innodb 和 XtraDB 结构和性能优化Innodb 和 XtraDB 结构和性能优化
Innodb 和 XtraDB 结构和性能优化YUCHENG HU
 
InnoDB Architecture and Performance Optimization, Peter Zaitsev
InnoDB Architecture and Performance Optimization, Peter ZaitsevInnoDB Architecture and Performance Optimization, Peter Zaitsev
InnoDB Architecture and Performance Optimization, Peter ZaitsevFuenteovejuna
 
Intro to Exadata
Intro to ExadataIntro to Exadata
Intro to ExadataMoin Khalid
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrRahul Jain
 
Ct213 memory subsystem
Ct213 memory subsystemCt213 memory subsystem
Ct213 memory subsystemSandeep Kamath
 
MySQL Performance Tuning - GNUnify 2010
MySQL Performance Tuning - GNUnify 2010MySQL Performance Tuning - GNUnify 2010
MySQL Performance Tuning - GNUnify 2010OSSCube
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAmazon Web Services
 
Low Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling ExamplesLow Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling ExamplesTanel Poder
 

Semelhante a Intro to column stores (20)

InnoDB architecture and performance optimization (Пётр Зайцев)
InnoDB architecture and performance optimization (Пётр Зайцев)InnoDB architecture and performance optimization (Пётр Зайцев)
InnoDB architecture and performance optimization (Пётр Зайцев)
 
SQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - AdvancedSQL Server 2014 Memory Optimised Tables - Advanced
SQL Server 2014 Memory Optimised Tables - Advanced
 
Column Stores and Google BigQuery
Column Stores and Google BigQueryColumn Stores and Google BigQuery
Column Stores and Google BigQuery
 
Full Table Scan: friend or foe
Full Table Scan: friend or foeFull Table Scan: friend or foe
Full Table Scan: friend or foe
 
Percona FT / TokuDB
Percona FT / TokuDBPercona FT / TokuDB
Percona FT / TokuDB
 
Cache Memory for Computer Architecture.ppt
Cache Memory for Computer Architecture.pptCache Memory for Computer Architecture.ppt
Cache Memory for Computer Architecture.ppt
 
Innodb 和 XtraDB 结构和性能优化
Innodb 和 XtraDB 结构和性能优化Innodb 和 XtraDB 结构和性能优化
Innodb 和 XtraDB 结构和性能优化
 
InnoDB Architecture and Performance Optimization, Peter Zaitsev
InnoDB Architecture and Performance Optimization, Peter ZaitsevInnoDB Architecture and Performance Optimization, Peter Zaitsev
InnoDB Architecture and Performance Optimization, Peter Zaitsev
 
27 multicore
27 multicore27 multicore
27 multicore
 
27 multicore
27 multicore27 multicore
27 multicore
 
04 cache memory
04 cache memory04 cache memory
04 cache memory
 
Intro to Exadata
Intro to ExadataIntro to Exadata
Intro to Exadata
 
Memory (Computer Organization)
Memory (Computer Organization)Memory (Computer Organization)
Memory (Computer Organization)
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache Solr
 
Ct213 memory subsystem
Ct213 memory subsystemCt213 memory subsystem
Ct213 memory subsystem
 
MySQL Performance Tuning - GNUnify 2010
MySQL Performance Tuning - GNUnify 2010MySQL Performance Tuning - GNUnify 2010
MySQL Performance Tuning - GNUnify 2010
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
 
Low Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling ExamplesLow Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling Examples
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 

Mais de Justin Swanhart

Flexviews materialized views for my sql
Flexviews materialized views for my sqlFlexviews materialized views for my sql
Flexviews materialized views for my sqlJustin Swanhart
 
Shard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stackShard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stackJustin Swanhart
 
Conquering "big data": An introduction to shard query
Conquering "big data": An introduction to shard queryConquering "big data": An introduction to shard query
Conquering "big data": An introduction to shard queryJustin Swanhart
 
Divide and conquer in the cloud
Divide and conquer in the cloudDivide and conquer in the cloud
Divide and conquer in the cloudJustin Swanhart
 
Summary tables with flexviews
Summary tables with flexviewsSummary tables with flexviews
Summary tables with flexviewsJustin Swanhart
 
Diagnosing MySQL performance problems
Diagnosing  MySQL performance problemsDiagnosing  MySQL performance problems
Diagnosing MySQL performance problemsJustin Swanhart
 

Mais de Justin Swanhart (7)

Flexviews materialized views for my sql
Flexviews materialized views for my sqlFlexviews materialized views for my sql
Flexviews materialized views for my sql
 
Shard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stackShard-Query, an MPP database for the cloud using the LAMP stack
Shard-Query, an MPP database for the cloud using the LAMP stack
 
Conquering "big data": An introduction to shard query
Conquering "big data": An introduction to shard queryConquering "big data": An introduction to shard query
Conquering "big data": An introduction to shard query
 
Noinject
NoinjectNoinject
Noinject
 
Divide and conquer in the cloud
Divide and conquer in the cloudDivide and conquer in the cloud
Divide and conquer in the cloud
 
Summary tables with flexviews
Summary tables with flexviewsSummary tables with flexviews
Summary tables with flexviews
 
Diagnosing MySQL performance problems
Diagnosing  MySQL performance problemsDiagnosing  MySQL performance problems
Diagnosing MySQL performance problems
 

Último

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 

Último (20)

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

Intro to column stores

  • 1. Introduction to column stores Justin Swanhart Percona Live, April 2013
  • 3. Introduction • Who am I? • What do I do? • Why am I here? 3
  • 4. A quick survey 4 ?How many people have heard the term row store?
  • 5. A quick survey 5 ?How many people have heard the term column store?
  • 6. A quick survey 6 ?How many people have heard the term grocery store? (just kidding)
  • 8. What is a row store • You've probably used one  – MyISAM – InnoDB – Memory – TokuDB – Etc 8
  • 9. What is a row store • All MySQL storage engines are row stores – The storage engine API is row oriented – This means the API always stores and retrieves entire rows from disk 9
  • 10. Materialization • Row stores live "in a material world" – A row is a collection of column values (attributes) – A row store stores all the values together on disk • That is, rows are materialized on disk 10
  • 11. Rows are stored in one file or segment on disk 11 id scientist death_by movie_name 1 Reinhardt Maximillian The Black Hole 2 Tyrell Roy Batty Blade Runner 3 Hammond Dinosaur Jurassic Park 4 Soong Lore Star Trek: TNG 5 Morbius His mind Forbidden Planet 6 Dyson Skynet Terminator 2: Judgment Day All columns are stored together on disk.
  • 12. Best case scenario 12 id scientist death_by movie_name 1 Reinhardt Maximillian 2 Tyrell Roy Batty Blade Runner 3 Hammond Dinosaur Jurassic Park 4 Soong Lore Star Trek: TNG 5 Morbius Robby 6 Dyson Skynet Terminator 2: Judgment Day select * from the_table where id = 6 • Performs best when a small number of rows are accessed
  • 13. BUT… • Row stores not so great for wide rows • It can be very bad if only a small subset of the columns are actually used by most queries –Reading the entire row wastes IO 13
  • 14. Row Stores can waste IO 14 6 Dyson Skynet Terminator 2: Judgment Day select movie_name from the_table where id = 6 Terminator 2: Judgment Day 1. Storage engine gets whole row*: 2. SQL interface extracts only requested portion 3. Results returned to client * By primary key in this case
  • 15. Worst case scenario • select sum(bigint_column) from table – 1000000 rows in table – Average row length 1024 bytes • The select reads one bigint column (8 bytes) – But the entire row must be read to get one column – Reads ~1GB of table data for ~8MB of column data 15
  • 17. Non-material world • In a column stores rows are not materialized during storage – An entire row image exists only in memory – Each row still has some sort of "row id" 17
  • 18. What is a row? • A row is a collection of column values that are associated with one another • Associated? – Every row has some type of "row id" • PK column or GEN_CLUST_INDEX on InnoDB • Physical offset in a MyISAM table • In some databases it is just a virtual identifier for a row 18
  • 19. Column store stores each COLUMN on disk 19 Genre Comedy Comedy Horror Drama Comedy Drama id 1 3 4 2 5 6 Title Mrs. Doubtfire The Big Lebowski The Fly Steel Magnolias The Birdcage Erin Brokovitch Actor Robin Williams Jeff Bridges Jeff Goldblum Dolly Parton Nathan Lane Julia Roberts row id = 1 row id = 6 Each column has a file or segment on diskNatural order may be unusual
  • 20. Column store benefit: Compression • Almost all column stores perform compression – Compression further reduces the storage footprint of each column – Column data type tailored compression • RLE • Integer packing • Dictionary and lookup string compression • Other (depends on column store) 20
  • 21. Column store benefit: Compression • Effective compression reduces storage cost • IO reduction yields decreased response times during queries as well – Queries may execute an order of magnitude faster compared to queries over the same data set on a row store • 10:1 to 30:1 compression rations may be seen 21
  • 22. Best case scenario (IO savings) • select sum(bigint_column) from table – 1000000 rows in table – Average row length 1024 bytes • The select reads one bigint column (8 bytes) – Only the single column is read from disk – Reads ~8MB of column data instead of 1GB of table data – Could read even less with compression! 22
  • 23. Less than ideal scenario* select * from long_wide_table where order_line_id = 321837291; 23 All columns Single row * Possibly worst case. We'll see in the performance test as this varies by column store.
  • 24. Column store downsides: • Accessing all columns in a table doesn't save anything – It could even be more expensive than a row store – Not ideal for tables with few columns 24
  • 25. More downsides • Updating and deleting rows is expensive – Some column stores are append only – Others strongly discourage writes – Some column stores split storage into row and column areas (in Vertica* this is called WOS/ROS) 25 *Not open source, based on C-Store an MIT project
  • 27. Analytics versus OLTP • Row stores are for OLTP – Reading small portions of a table, but often many of the columns – Frequent changes to data – Small* amount of data (typically working set must fit in ram) – "Nested loops" joins are well optimized for OLTP 27 * < 2TB data sets in general
  • 28. Analytics versus OLTP • Column stores are designed for OLAP – Read large portions of a table in terms of rows, but often a small number of columns – Batch loading / updates – Big data*! • Effective compression and reduced query IO make column stores much more attractive for structured big data • Machine generated data is especially well suited 28 * 50TB – 100TB per machine is typically possible, before compression
  • 29. Not all column stores are for big data! • In-memory analytics is popular – Some column stores are designed to be used as an in-memory store – It is possible to build very large SMP clusters which scale these databases to many machines but this outside of the scope of this talk 29
  • 30. Indexing • Row stores mainly use tree style indexes – When you create an index on a row store you usually create a "B" tree index – These indexes leverage binary search – Very fast as long as the index fits in memory 30
  • 31. Tree indexes • A read is necessary to satisfy a write – Even with fast IO this is expensive – This means that tree indexes work best for in- memory data sets – Very large data sets usually end up with indexes which are unmanageably large 31
  • 32. Use less trees: Bitmap indexing • Some column stores support bitmap indexes – Columnar storage lends very well to bitmaps – Bitmap indexes are effective for Boolean searches over multiple columns in a table 32
  • 33. Use less trees: Bitmap indexing • Unfortunately – Bitmap indexes are not supported by any MySQL column store – Very expensive to update 33
  • 34. OPEN SOURCE COLUMN STORES 34
  • 35. Fastbit bitmap store • Academic column store – Simple text interface – Allows bitmap indexes to be compared whilst still compressed – No joins – Append-only* data store 35 * records can be marked as deleted with library calls, but not removed from disk https://sdm.lbl.gov/fastbit/
  • 36. MySQL column stores - ICE • Infobright Community Edition (ICE) – Some data type limitations – Single threaded loader – Single threaded query – No temp tables, no RENAME table, no DDL – LOAD DATA only (append only, no DML) 36
  • 37. MySQL column stores – ICE [2] • Data stored in "data packs" • No indexes – Knowledge grid stores statistics and information about the data packs • Decides which packs must be decompressed • Can make very intelligent query decisions • Hash join only • No constraints 37
  • 38. MySQL column stores – InfiniDB Community • InfiniDB community edition – Some data type limitations – Single threaded query* – Single threaded loader* – No indexes – Wrong results in testing – Hash join only 38 * You are required to bind infinidb to only a single core
  • 39. LucidDB • Java based column store database • SQL/MED data wrappers • Built-in data versioning • Commercial backing ended in October, 2012 – But source code still available under Apache license 39
  • 40. LucidDB [2] • Bitmap and tree based indexes – Support for foreign key, primary key and unique constraints • ANSI SQL support • Multiple join algorithms • Single-threaded queries 40
  • 41. MonetDB • Early (started circa 2004) column store • Open source • Designed for in-memory working sets – Indexes must fit in memory – Swap size must be large enough for working set if working set exceeds memory • Some wrong results during testing 41
  • 42. MonetDB [2] • Data is stored compressed on disk • In memory structure is array-like – Parallel query execution – CPU cache optimizations – multiple join algorithms • Primary key, foreign key, unique and NOT NULL constraints 42
  • 44. Loading Performance (InfiniDB omitted) 44 CUST DATE PART SUPP LINE ALL TABLES MonetDB 0.83 0.26 1.980 0.23 125 128s ICE 1.69 0.09 2.372 0.15 128 133s LucidDB 2.49 0.31 5.915 0.96 297 307s InnoDB 3.11 0.14 11.83 0.53 300 316s Line Cust PartDate Supp Star Schema Benchmark Scale Factor = 5 Lineorder rows = 29,999,795
  • 45. Point Lookup Query select * from lineorder where LO_OrderKey = 10000001 and LO_LineNumber = 3; cold hot MonetDB 0.081 0.001 ICE 0.29 0.07 LucidDB 2.4 0.9 InnoDB 0.01 0.001 45
  • 46. Simple Aggregate Query select sum(LO_OrdTotalPrice) from lineorder; cold hot ICE 0.02 0.001 MonetDB 0.2 0.07 LucidDB 5.9 3.9 InnoDB 34 9.3 46
  • 47. Complex Query with no join select LO_CustKey, sum(LO_OrdTotalPrice) from lineorder group by LO_CustKey having sum(LO_OrdTotalPrice) >= 11349901235 order by sum(LO_OrdTotalPrice) desc; cold hot MonetDB 3.9 1.5 ICE 7.19 3.14 LucidDB 17.5 13.3 InnoDB 41.18 33.81 47
  • 48. Complex Query with join, group by and filter cold hot ICE 6.9 1.83 LucidDB 17.3 15.3 InnoDB 58.23 51.62 MonetDB ERR ERR 48 Wrong results select d_year, p_brand, sum(lo_revenue) from lineorder join dim_date on lo_orderdate = d_datekey join part on lo_partkey = p_partkey join supplier on lo_suppkey = s_suppkey where p_category = 'MFGR#12' and s_region = 'AMERICA' group by d_year, p_brand order by d_year, p_brand;
  • 49. CLOSED SOURCE COLUMN STORES 49
  • 50. Infobright Enterprise Edition – IEE – Supports most MySQL SQL – Parallel loader / binary loader – Supports DML (insert, update, delete) – Supports DDL (rename, truncate) – Supports temp tables – Parallel query – Support 50
  • 51. Vectorwise • In memory SMP column store • Vector execution architecture for parallel pipelined MPP query execution • Very fast • Uses ScaleMP to scale to more than one machine 51
  • 52. Vertica • An HP company • Originally C-Store • Write and read optimized storage • “Projections” for pre-joining and sorting data • Automatic consistent hashing data distribution(K-safety level) • Massively parallel processing • Partitioning support 52
  • 53. Links! • MonetDB - http://dev.monetdb.org/downloads/sources/Late st/ • ICE - http://www.infobright.org • InfiniDB - http://www.calpont.com • LucidDB – http://luciddb.sourceforge.net • Fastbit - https://sdm.lbl.gov/fastbit/ • Vectorwise – http://www.vectorwise.com • Vertica – http://www.vertica.com 53
  • 54. Percona Training Advantage – Check out http://training.percona.com for a list of training events near you – Request training directly by Justin (me) or any of our other expert trainers by contacting your Percona sales rep today 54