Eliminating the Database Bottleneck
What makes Vectorwise so fast


Mark Van de Wiel
Director Product Management, Vectorwise

Thursday, November 01, 2012



Confidential © 2012 Actian Corporation
Agenda

- Why traditional RDBMSs are slow for analytics
- Why Vectorwise is fast
- The I/O challenge
- Efficient updates




100x (+) Performance Difference – 2003
Custom C versus Relational Database

TPC-H 1 GB, query 1 (runtime in seconds)

System        Runtime (s)
MySQL         26.2
DBMS 'X'      28.1
C program     0.2
Vectorwise    0.6
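
The "100x (+)" in the slide title follows from the chart values. A minimal sketch of the arithmetic, using only the four runtimes shown; the `slowdown` helper is a name made up for this illustration:

```python
# Runtimes in seconds, as read off the chart (TPC-H 1 GB, query 1).
runtimes = {
    "MySQL": 26.2,
    "DBMS 'X'": 28.1,
    "C program": 0.2,
    "Vectorwise": 0.6,
}

def slowdown(system, baseline):
    """How many times slower `system` is than `baseline`."""
    return runtimes[system] / runtimes[baseline]

for system in ("MySQL", "DBMS 'X'"):
    print(
        f"{system}: {slowdown(system, 'C program'):.1f}x vs the C program, "
        f"{slowdown(system, 'Vectorwise'):.1f}x vs Vectorwise"
    )
```

Both relational systems come out more than 100 times slower than the custom C program, and roughly 40 to 50 times slower than Vectorwise.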



Traditional Relational Database for Analytics
Inefficiencies
- Inefficient storage
- Inefficient processing




Inefficient Storage for Analytics

- Row-based storage model
  - Predominant in 2003, still very common today
  - Works well for OLTP


      101      Joe                           27      Black

      103      Edward                        21      Scissorhand




Inefficient Storage – Row-based

- Pages on disk – example

[Figure: a disk page storing the rows (101, 27, Joe Black) and (103, 21, Edward Scissorhand), annotated with var-width attribute pointers and pointers to tuples]




Issues with Row-based Storage

- Always read all attributes
  - Poor bandwidth
  - Poor use of memory buffer

- Complex row structure and navigation
  - E.g. compressing out null fields
  - E.g. row chaining




Efficient Storage for Analytics

- Columnar storage: store attributes separately
- Retrieve only attributes required by the query
- Used by “traditional” column stores, e.g. Sybase IQ, Vertica
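
The difference between the two layouts can be made concrete with a small sketch. It encodes the two example rows from the earlier slide both ways and counts the bytes a query on a single attribute has to read. The column names (`id`, `name`, `age`, `surname`) and the byte encoding are assumptions for illustration only, not the on-disk format of any real product.

```python
import struct

# Hypothetical employee rows, matching the example values on the slides.
rows = [
    (101, "Joe", 27, "Black"),
    (103, "Edward", 21, "Scissorhand"),
]

def encode_int(value):
    """Fixed-width 4-byte little-endian integer."""
    return struct.pack("<i", value)

def encode_str(value):
    """Length-prefixed UTF-8 string, standing in for a var-width attribute."""
    data = value.encode("utf-8")
    return struct.pack("<H", len(data)) + data

# Row store: all attributes of a tuple are stored together.
row_store = [
    encode_int(i) + encode_str(n) + encode_int(a) + encode_str(s)
    for (i, n, a, s) in rows
]

# Column store: each attribute is stored separately.
column_store = {
    "id": b"".join(encode_int(r[0]) for r in rows),
    "name": b"".join(encode_str(r[1]) for r in rows),
    "age": b"".join(encode_int(r[2]) for r in rows),
    "surname": b"".join(encode_str(r[3]) for r in rows),
}

def bytes_read_row_store():
    # Every query reads every row in full, whatever it needs.
    return sum(len(r) for r in row_store)

def bytes_read_column_store(needed):
    # Only the columns the query references are read.
    return sum(len(column_store[c]) for c in needed)

print("row store   :", bytes_read_row_store(), "bytes")
print("column store:", bytes_read_column_store(["age"]), "bytes (age only)")
```

With wide tables and queries that touch a handful of attributes, the gap between the two numbers grows with the number of unused columns.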




Inefficient Processing

How a traditional database runs a query

Query:

SELECT
    name,
    salary*.19 AS tax
FROM
    employee
WHERE
    age > 25




Inefficient Processing

How a traditional database runs a query

Tuple-at-a-time iterator interface:
- open()
- next(): tuple
- close()

next() is called:
- for each operator
- for each tuple

Complex code repeated over and over
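
The iterator interface above can be sketched as three small operators wired into a plan for the example query. This is a minimal illustration, not the code of any actual database: the class names, the `calls` counter and the three-row `employee` table are all made up for the example.

```python
# Tuple-at-a-time operators; `calls` counts next() invocations per operator.
calls = {"Scan": 0, "Select": 0, "Project": 0}

class Scan:
    def __init__(self, table):
        self.table = table

    def open(self):
        self.pos = 0

    def next(self):
        calls["Scan"] += 1
        if self.pos >= len(self.table):
            return None  # end of input
        row = self.table[self.pos]
        self.pos += 1
        return row

    def close(self):
        pass

class Select:
    def __init__(self, child, predicate):
        self.child, self.predicate = child, predicate

    def open(self):
        self.child.open()

    def next(self):
        calls["Select"] += 1
        while True:  # pull from the child until a tuple qualifies
            row = self.child.next()
            if row is None or self.predicate(row):
                return row

    def close(self):
        self.child.close()

class Project:
    def __init__(self, child, expression):
        self.child, self.expression = child, expression

    def open(self):
        self.child.open()

    def next(self):
        calls["Project"] += 1
        row = self.child.next()
        return None if row is None else self.expression(row)

    def close(self):
        self.child.close()

# Hypothetical data for: SELECT name, salary*.19 AS tax FROM employee WHERE age > 25
employee = [
    {"name": "Joe", "age": 27, "salary": 1000},
    {"name": "Edward", "age": 21, "salary": 2000},
    {"name": "Ann", "age": 40, "salary": 3000},
]

plan = Project(
    Select(Scan(employee), lambda r: r["age"] > 25),
    lambda r: (r["name"], r["salary"] * 0.19),
)

plan.open()
result = []
while (row := plan.next()) is not None:
    result.append(row)
plan.close()

print(result)
print(calls)
```

Even for three input rows the plan makes ten next() calls, plus one predicate or expression call per tuple; the call count grows linearly with the number of tuples and operators.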


Inefficient Processing

How a traditional database runs a query

Data-specific computational functionality

Called once for every operation on every tuple

Worse for complex tuple representations




Inefficient Processing (Part 1 of 2)

- Lots of repeated, unnecessary code
  - Operator logic
  - Function calls
  - Attribute access

- Most instructions are spent interpreting the query
- Very few instructions process actual data!
- Many instructions per tuple




CPU Features – Inefficient Processing (Part 2 of 2)

- In the last 20 years, CPUs have gained:
  - On-chip caches, because RAM access is too slow and congested
  - Branch-sensitive CPU pipelines
  - Superscalar features
  - SIMD instructions (SSE and AVX)

- Great for multimedia processing, scientific computing…
- … but NOT for traditional relational databases
  - Complex code: function calls, branches
  - Poor use of CPU cache (both data and instructions)
  - Processing one value at a time




Inefficient Processing

Traditional RDBMS
- Many instructions per tuple
- Many cycles per instruction
- Very many cycles per tuple




Vectorwise – Vector-based Processing

Query:

SELECT
    name,
    salary*.19 AS tax
FROM
    employee
WHERE
    age > 25




Vectorwise – Vector-based Processing

A vector contains data of multiple tuples (1024)

All operations consume and produce entire vectors

Effect: far fewer operator.next() and primitive calls

And: pipelined query evaluation
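
A rough sketch of the same query processed a vector at a time. Each primitive works on plain arrays of up to 1024 values, so the number of calls depends on the number of vectors rather than the number of tuples. The primitive names, the use of a list of qualifying positions, and the 4096-row table are assumptions for this illustration. Python lists stand in for the arrays, so the sketch shows the call structure only, not the CPU-level gains.

```python
VECTOR_SIZE = 1024  # tuples per vector, as on the slide

def select_gt(values, threshold):
    """Primitive: positions i in the vector where values[i] > threshold."""
    return [i for i, v in enumerate(values) if v > threshold]

def map_fetch(values, sel):
    """Primitive: copy the selected positions of one vector."""
    return [values[i] for i in sel]

def map_mul_const(values, factor, sel):
    """Primitive: multiply the selected positions by a constant."""
    return [values[i] * factor for i in sel]

def run_query(columns, vector_size=VECTOR_SIZE):
    """SELECT name, salary*.19 AS tax FROM employee WHERE age > 25."""
    names, taxes, primitive_calls = [], [], 0
    for start in range(0, len(columns["age"]), vector_size):
        end = start + vector_size
        # One primitive call handles a whole vector of values.
        sel = select_gt(columns["age"][start:end], 25)
        names.extend(map_fetch(columns["name"][start:end], sel))
        taxes.extend(map_mul_const(columns["salary"][start:end], 0.19, sel))
        primitive_calls += 3
    return names, taxes, primitive_calls

# Hypothetical 4096-row employee table, stored column by column.
n = 4096
columns = {
    "name": [f"emp{i}" for i in range(n)],
    "age": [20 + i % 30 for i in range(n)],
    "salary": [1000 + i for i in range(n)],
}

names, taxes, primitive_calls = run_query(columns)
print(len(names), "qualifying rows,", primitive_calls, "primitive calls")
```

For 4096 rows this makes 12 primitive calls (3 per vector, 4 vectors), where a tuple-at-a-time plan would make at least one call per operator per tuple.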

Why is Vectorwise so Fast?

- Reduced interpretation overhead
  - 100+ times fewer function calls
- Good CPU cache use
  - High locality in primitives
  - Cache-conscious algorithms
- No tuple navigation
  - Primitives only see arrays
- Vectorization allows algorithmic optimization
- CPU and compiler-friendly function bodies
  - Multiple work units, loop-pipelining, SIMD…
- BONUS: PARALLEL QUERY



Some Numbers

- Traditional RDBMS: <200 MB/s per core
- Vectorwise (lab environment): >1.5 GB/s per core




Addressing the I/O Challenge

- Columnar storage
- Smart column buffer (memory)
- Data compression
  - On disk: less I/O
  - In memory: best use of column buffer
  - Ultra-efficient decompression algorithms to get sufficient throughput

- Large contiguous data blocks for optimum disk I/O
- In-memory min-max indexes per block (i.e. per column)
  - Eliminate data blocks based on implicit/explicit filter criteria
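
Block elimination with a min-max index can be sketched as follows. Each block of a column keeps its minimum and maximum in memory, and a scan skips any block whose range cannot satisfy the filter. The block size of 4, the dictionary layout and the function names are assumptions chosen to keep the example small; they do not describe Vectorwise's actual structures.

```python
BLOCK_SIZE = 4  # tiny for illustration; real blocks are large and contiguous

def build_blocks(values, block_size=BLOCK_SIZE):
    """Split one column into blocks and record min/max for each block."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return blocks

def scan_greater_than(blocks, threshold):
    """Return the values > threshold and the number of blocks actually read."""
    result, blocks_read = [], 0
    for block in blocks:
        if block["max"] <= threshold:
            continue  # no value in this block can qualify: skip the I/O
        blocks_read += 1
        result.extend(v for v in block["values"] if v > threshold)
    return result, blocks_read

ages = [21, 22, 23, 24, 25, 26, 27, 28, 30, 31, 35, 40, 18, 19, 20, 22]
blocks = build_blocks(ages)
matches, blocks_read = scan_greater_than(blocks, 25)
print(matches, f"({blocks_read} of {len(blocks)} blocks read)")
```

The benefit depends on how the data is ordered: when values are clustered, as in this example, whole blocks fall outside the filter range and are never read.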



Efficient Updates in a Column Store

Positional Delta Trees (PDTs)
- In-memory representation of small data changes
  - Efficiently merged with on-disk data
  - Periodically propagated to disk

- Provide snapshot read consistency
- ACID compliant
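
The general idea of keeping changes in memory and merging them by position during a scan can be sketched as below. This is a deliberately flat, simplified stand-in and not an actual PDT, which is a tree structure; the class and method names are made up, and snapshot consistency and transactions are not modelled.

```python
class DeltaColumn:
    """One column: immutable 'on-disk' values plus in-memory deltas.

    Deltas are keyed by position in the stable data.
    """

    def __init__(self, stable):
        self.stable = list(stable)  # stands in for the on-disk data
        self.inserts = {}           # position -> values inserted before it
        self.deletes = set()        # positions of deleted stable values
        self.modifies = {}          # position -> replacement value

    def insert_before(self, pos, value):
        self.inserts.setdefault(pos, []).append(value)

    def delete(self, pos):
        self.deletes.add(pos)

    def modify(self, pos, value):
        self.modifies[pos] = value

    def scan(self):
        """Merge the deltas into the stable data while scanning."""
        for pos, value in enumerate(self.stable):
            yield from self.inserts.get(pos, [])
            if pos not in self.deletes:
                yield self.modifies.get(pos, value)
        # Values appended after the last stable position.
        yield from self.inserts.get(len(self.stable), [])

    def propagate(self):
        """Write the merged image back and clear the deltas."""
        self.stable = list(self.scan())
        self.inserts, self.deletes, self.modifies = {}, set(), {}

col = DeltaColumn([10, 20, 30, 40])
col.delete(0)
col.insert_before(2, 25)
col.modify(3, 41)
col.insert_before(4, 50)

before = list(col.scan())  # readers see the merged view
col.propagate()            # periodic propagation to the stable data
print(before, col.stable)
```

Because the deltas are addressed by position rather than by key, the merge is a single ordered pass over the column and needs no key lookups during the scan.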




Agenda

- Why traditional RDBMSs are slow for analytics
- Why Vectorwise is fast
- The I/O challenge
- Efficient updates




          Confidential © 2012 Actian Corporation   21
Confidential © 2012 Actian Corporation

Mais conteúdo relacionado

Mais procurados

Diagnosability versus The Cloud, Redwood Shores 2011-08-30
Diagnosability versus The Cloud, Redwood Shores 2011-08-30Diagnosability versus The Cloud, Redwood Shores 2011-08-30
Diagnosability versus The Cloud, Redwood Shores 2011-08-30Cary Millsap
 
[Uruguay] IBM Systems Director Navigator for i
[Uruguay] IBM Systems Director Navigator for i[Uruguay] IBM Systems Director Navigator for i
[Uruguay] IBM Systems Director Navigator for iIBMSSA
 
Oracle Systems _ Jeff Schwartz _ Engineering Solutions Exadata - Exalogic.pdf
Oracle Systems _ Jeff Schwartz _ Engineering Solutions Exadata - Exalogic.pdfOracle Systems _ Jeff Schwartz _ Engineering Solutions Exadata - Exalogic.pdf
Oracle Systems _ Jeff Schwartz _ Engineering Solutions Exadata - Exalogic.pdfInSync2011
 
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?  Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You? EMC
 
VMworld 2012 - Spotlight Session - EMC Transforms IT - Jeremy Burton
VMworld 2012 - Spotlight Session - EMC Transforms IT - Jeremy BurtonVMworld 2012 - Spotlight Session - EMC Transforms IT - Jeremy Burton
VMworld 2012 - Spotlight Session - EMC Transforms IT - Jeremy BurtonEMCTechMktg
 
JDE & Peoplesoft 1 _ Roland Slee & Doug Hughes _ Oracle's Cloud Computing Str...
JDE & Peoplesoft 1 _ Roland Slee & Doug Hughes _ Oracle's Cloud Computing Str...JDE & Peoplesoft 1 _ Roland Slee & Doug Hughes _ Oracle's Cloud Computing Str...
JDE & Peoplesoft 1 _ Roland Slee & Doug Hughes _ Oracle's Cloud Computing Str...InSync2011
 
The non stop mission critical experience
The non stop mission critical experienceThe non stop mission critical experience
The non stop mission critical experienceHP ESSN Philippines
 
Application Grid: Platform for Virtualization and Consolidation of your Java ...
Application Grid: Platform for Virtualization and Consolidation of your Java ...Application Grid: Platform for Virtualization and Consolidation of your Java ...
Application Grid: Platform for Virtualization and Consolidation of your Java ...Bob Rhubart
 
Transform Microsoft Application Environment With EMC Information Infrastructure
Transform Microsoft Application Environment With EMC Information InfrastructureTransform Microsoft Application Environment With EMC Information Infrastructure
Transform Microsoft Application Environment With EMC Information InfrastructureEMC Forum India
 
Collaborate07kmohiuddin
Collaborate07kmohiuddinCollaborate07kmohiuddin
Collaborate07kmohiuddinSal Marcus
 
Limewood Event - EMC
Limewood Event - EMC Limewood Event - EMC
Limewood Event - EMC BlueChipICT
 
Do More with Oracle Environment with Open and Best of breed Technologies
Do More with Oracle Environment with Open and Best of breed TechnologiesDo More with Oracle Environment with Open and Best of breed Technologies
Do More with Oracle Environment with Open and Best of breed TechnologiesEMC Forum India
 
Architecting for a cost effective Windows Azure solution
Architecting for a cost effective Windows Azure solutionArchitecting for a cost effective Windows Azure solution
Architecting for a cost effective Windows Azure solutionMaarten Balliauw
 
Open systems Specialists: XiV Storage Reinvented
Open systems Specialists: XiV Storage ReinventedOpen systems Specialists: XiV Storage Reinvented
Open systems Specialists: XiV Storage ReinventedVincent Kwon
 
Case study 1
Case study 1Case study 1
Case study 1systemz
 
Converged infrastructure ucc
Converged infrastructure  uccConverged infrastructure  ucc
Converged infrastructure ucctamar1981
 
Improve DB2 z/OS Test Data Management
Improve DB2 z/OS Test Data ManagementImprove DB2 z/OS Test Data Management
Improve DB2 z/OS Test Data Managementsoftbasemarketing
 

Mais procurados (20)

Diagnosability versus The Cloud, Redwood Shores 2011-08-30
Diagnosability versus The Cloud, Redwood Shores 2011-08-30Diagnosability versus The Cloud, Redwood Shores 2011-08-30
Diagnosability versus The Cloud, Redwood Shores 2011-08-30
 
[Uruguay] IBM Systems Director Navigator for i
[Uruguay] IBM Systems Director Navigator for i[Uruguay] IBM Systems Director Navigator for i
[Uruguay] IBM Systems Director Navigator for i
 
Oracle Systems _ Jeff Schwartz _ Engineering Solutions Exadata - Exalogic.pdf
Oracle Systems _ Jeff Schwartz _ Engineering Solutions Exadata - Exalogic.pdfOracle Systems _ Jeff Schwartz _ Engineering Solutions Exadata - Exalogic.pdf
Oracle Systems _ Jeff Schwartz _ Engineering Solutions Exadata - Exalogic.pdf
 
Oow Ppt 1
Oow Ppt 1Oow Ppt 1
Oow Ppt 1
 
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?  Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?
 
VMworld 2012 - Spotlight Session - EMC Transforms IT - Jeremy Burton
VMworld 2012 - Spotlight Session - EMC Transforms IT - Jeremy BurtonVMworld 2012 - Spotlight Session - EMC Transforms IT - Jeremy Burton
VMworld 2012 - Spotlight Session - EMC Transforms IT - Jeremy Burton
 
JDE & Peoplesoft 1 _ Roland Slee & Doug Hughes _ Oracle's Cloud Computing Str...
JDE & Peoplesoft 1 _ Roland Slee & Doug Hughes _ Oracle's Cloud Computing Str...JDE & Peoplesoft 1 _ Roland Slee & Doug Hughes _ Oracle's Cloud Computing Str...
JDE & Peoplesoft 1 _ Roland Slee & Doug Hughes _ Oracle's Cloud Computing Str...
 
Introducing VNX Series
Introducing VNX SeriesIntroducing VNX Series
Introducing VNX Series
 
The non stop mission critical experience
The non stop mission critical experienceThe non stop mission critical experience
The non stop mission critical experience
 
Application Grid: Platform for Virtualization and Consolidation of your Java ...
Application Grid: Platform for Virtualization and Consolidation of your Java ...Application Grid: Platform for Virtualization and Consolidation of your Java ...
Application Grid: Platform for Virtualization and Consolidation of your Java ...
 
Transform Microsoft Application Environment With EMC Information Infrastructure
Transform Microsoft Application Environment With EMC Information InfrastructureTransform Microsoft Application Environment With EMC Information Infrastructure
Transform Microsoft Application Environment With EMC Information Infrastructure
 
Collaborate07kmohiuddin
Collaborate07kmohiuddinCollaborate07kmohiuddin
Collaborate07kmohiuddin
 
Limewood Event - EMC
Limewood Event - EMC Limewood Event - EMC
Limewood Event - EMC
 
Do More with Oracle Environment with Open and Best of breed Technologies
Do More with Oracle Environment with Open and Best of breed TechnologiesDo More with Oracle Environment with Open and Best of breed Technologies
Do More with Oracle Environment with Open and Best of breed Technologies
 
Oracle ksplice
Oracle kspliceOracle ksplice
Oracle ksplice
 
Architecting for a cost effective Windows Azure solution
Architecting for a cost effective Windows Azure solutionArchitecting for a cost effective Windows Azure solution
Architecting for a cost effective Windows Azure solution
 
Open systems Specialists: XiV Storage Reinvented
Open systems Specialists: XiV Storage ReinventedOpen systems Specialists: XiV Storage Reinvented
Open systems Specialists: XiV Storage Reinvented
 
Case study 1
Case study 1Case study 1
Case study 1
 
Converged infrastructure ucc
Converged infrastructure  uccConverged infrastructure  ucc
Converged infrastructure ucc
 
Improve DB2 z/OS Test Data Management
Improve DB2 z/OS Test Data ManagementImprove DB2 z/OS Test Data Management
Improve DB2 z/OS Test Data Management
 

Destaque

Performance Bottleneck Identification through Software Diagnostics- Impetus W...
Performance Bottleneck Identification through Software Diagnostics- Impetus W...Performance Bottleneck Identification through Software Diagnostics- Impetus W...
Performance Bottleneck Identification through Software Diagnostics- Impetus W...Impetus Technologies
 
Find and Fix Performance Bottlenecks with New Relic and BlazeMeter
Find and Fix Performance Bottlenecks with New Relic and BlazeMeter Find and Fix Performance Bottlenecks with New Relic and BlazeMeter
Find and Fix Performance Bottlenecks with New Relic and BlazeMeter Alon Girmonsky
 
Bottlenecks exposed web app db servers
Bottlenecks exposed web app db serversBottlenecks exposed web app db servers
Bottlenecks exposed web app db serversUpender Dravidum
 
How to Run a 1,000,000 VU Load Test using Apache JMeter and BlazeMeter
How to Run a 1,000,000 VU Load Test using Apache JMeter and BlazeMeterHow to Run a 1,000,000 VU Load Test using Apache JMeter and BlazeMeter
How to Run a 1,000,000 VU Load Test using Apache JMeter and BlazeMeterAlon Girmonsky
 
20161213_FinTech時代に求められるDB開発とセキュリティ by 株式会社インサイトテクノロジー 阿部健一
20161213_FinTech時代に求められるDB開発とセキュリティ by 株式会社インサイトテクノロジー 阿部健一20161213_FinTech時代に求められるDB開発とセキュリティ by 株式会社インサイトテクノロジー 阿部健一
20161213_FinTech時代に求められるDB開発とセキュリティ by 株式会社インサイトテクノロジー 阿部健一Insight Technology, Inc.
 

Destaque (6)

Performance Bottleneck Identification through Software Diagnostics- Impetus W...
Performance Bottleneck Identification through Software Diagnostics- Impetus W...Performance Bottleneck Identification through Software Diagnostics- Impetus W...
Performance Bottleneck Identification through Software Diagnostics- Impetus W...
 
Find and Fix Performance Bottlenecks with New Relic and BlazeMeter
Find and Fix Performance Bottlenecks with New Relic and BlazeMeter Find and Fix Performance Bottlenecks with New Relic and BlazeMeter
Find and Fix Performance Bottlenecks with New Relic and BlazeMeter
 
Bottlenecks exposed web app db servers
Bottlenecks exposed web app db serversBottlenecks exposed web app db servers
Bottlenecks exposed web app db servers
 
How to Run a 1,000,000 VU Load Test using Apache JMeter and BlazeMeter
How to Run a 1,000,000 VU Load Test using Apache JMeter and BlazeMeterHow to Run a 1,000,000 VU Load Test using Apache JMeter and BlazeMeter
How to Run a 1,000,000 VU Load Test using Apache JMeter and BlazeMeter
 
Database - Design & Implementation - 1
Database - Design & Implementation - 1Database - Design & Implementation - 1
Database - Design & Implementation - 1
 
20161213_FinTech時代に求められるDB開発とセキュリティ by 株式会社インサイトテクノロジー 阿部健一
20161213_FinTech時代に求められるDB開発とセキュリティ by 株式会社インサイトテクノロジー 阿部健一20161213_FinTech時代に求められるDB開発とセキュリティ by 株式会社インサイトテクノロジー 阿部健一
20161213_FinTech時代に求められるDB開発とセキュリティ by 株式会社インサイトテクノロジー 阿部健一
 

Semelhante a B17 Eliminating the database bottleneck

A27 Vectorwise Performance Considerations_implementation_best_practices
A27 Vectorwise Performance Considerations_implementation_best_practicesA27 Vectorwise Performance Considerations_implementation_best_practices
A27 Vectorwise Performance Considerations_implementation_best_practicesInsight Technology, Inc.
 
A14 Getting Started with Vectorwise by Mark Van de Wiel
A14 Getting Started with Vectorwise by Mark Van de WielA14 Getting Started with Vectorwise by Mark Van de Wiel
A14 Getting Started with Vectorwise by Mark Van de WielInsight Technology, Inc.
 
Atea roadshow norr
Atea roadshow norrAtea roadshow norr
Atea roadshow norrJohan Odell
 
OpenStack Summit Portland April 2013 talk - Quantum and EC2
OpenStack Summit Portland April 2013 talk - Quantum and EC2OpenStack Summit Portland April 2013 talk - Quantum and EC2
OpenStack Summit Portland April 2013 talk - Quantum and EC2Naveen Joy
 
Breakthrough performance with MySQL Cluster (2012)
Breakthrough performance with MySQL Cluster (2012)Breakthrough performance with MySQL Cluster (2012)
Breakthrough performance with MySQL Cluster (2012)Frazer Clement
 
Informix Update New Features 11.70.xC1+
Informix Update New Features 11.70.xC1+Informix Update New Features 11.70.xC1+
Informix Update New Features 11.70.xC1+IBM Sverige
 
Open world exadata_top_10_lessons_learned
Open world exadata_top_10_lessons_learnedOpen world exadata_top_10_lessons_learned
Open world exadata_top_10_lessons_learnedchet justice
 
Oracle Database Appliance - Introduction in Cyprus
Oracle Database Appliance - Introduction in CyprusOracle Database Appliance - Introduction in Cyprus
Oracle Database Appliance - Introduction in CyprusAndy Panayiotou
 
Accelerating big data with ioMemory and Cisco UCS and NOSQL
Accelerating big data with ioMemory and Cisco UCS and NOSQLAccelerating big data with ioMemory and Cisco UCS and NOSQL
Accelerating big data with ioMemory and Cisco UCS and NOSQLSumeet Bansal
 
Pro sphere customer technical
Pro sphere customer technicalPro sphere customer technical
Pro sphere customer technicalsolarisyougood
 
Accel Partners New Data Workshop 7-14-10
Accel Partners New Data Workshop 7-14-10Accel Partners New Data Workshop 7-14-10
Accel Partners New Data Workshop 7-14-10keirdo1
 
Brocade: Storage Networking For the Virtual Enterprise
Brocade: Storage Networking For the Virtual Enterprise Brocade: Storage Networking For the Virtual Enterprise
Brocade: Storage Networking For the Virtual Enterprise EMC
 
Implementing and Troubleshooting EdgeSight
Implementing and Troubleshooting EdgeSightImplementing and Troubleshooting EdgeSight
Implementing and Troubleshooting EdgeSightDavid McGeough
 
Presentación Data Center Cablevisión Day 2010
Presentación Data Center Cablevisión Day 2010Presentación Data Center Cablevisión Day 2010
Presentación Data Center Cablevisión Day 2010Logicalis Latam
 
Ugif 04 2011 france ug04042011-jroy_part1
Ugif 04 2011   france ug04042011-jroy_part1Ugif 04 2011   france ug04042011-jroy_part1
Ugif 04 2011 france ug04042011-jroy_part1UGIF
 
MT49 Dell EMC XtremIO: Product Overview and New Use Cases
MT49 Dell EMC XtremIO: Product Overview and New Use CasesMT49 Dell EMC XtremIO: Product Overview and New Use Cases
MT49 Dell EMC XtremIO: Product Overview and New Use CasesDell EMC World
 

Semelhante a B17 Eliminating the database bottleneck (20)

A27 Vectorwise Performance Considerations_implementation_best_practices
A27 Vectorwise Performance Considerations_implementation_best_practicesA27 Vectorwise Performance Considerations_implementation_best_practices
A27 Vectorwise Performance Considerations_implementation_best_practices
 
A14 Getting Started with Vectorwise by Mark Van de Wiel
A14 Getting Started with Vectorwise by Mark Van de WielA14 Getting Started with Vectorwise by Mark Van de Wiel
A14 Getting Started with Vectorwise by Mark Van de Wiel
 
Atea roadshow norr
Atea roadshow norrAtea roadshow norr
Atea roadshow norr
 
Back to The Future V
Back to The Future VBack to The Future V
Back to The Future V
 
OpenStack Summit Portland April 2013 talk - Quantum and EC2
OpenStack Summit Portland April 2013 talk - Quantum and EC2OpenStack Summit Portland April 2013 talk - Quantum and EC2
OpenStack Summit Portland April 2013 talk - Quantum and EC2
 
Breakthrough performance with MySQL Cluster (2012)
Breakthrough performance with MySQL Cluster (2012)Breakthrough performance with MySQL Cluster (2012)
Breakthrough performance with MySQL Cluster (2012)
 
Antonio piraino v1
Antonio piraino v1Antonio piraino v1
Antonio piraino v1
 
Informix Update New Features 11.70.xC1+
Informix Update New Features 11.70.xC1+Informix Update New Features 11.70.xC1+
Informix Update New Features 11.70.xC1+
 
Tame that Beast
Tame that BeastTame that Beast
Tame that Beast
 
Open world exadata_top_10_lessons_learned
Open world exadata_top_10_lessons_learnedOpen world exadata_top_10_lessons_learned
Open world exadata_top_10_lessons_learned
 
Oracle Database Appliance - Introduction in Cyprus
Oracle Database Appliance - Introduction in CyprusOracle Database Appliance - Introduction in Cyprus
Oracle Database Appliance - Introduction in Cyprus
 
Accelerating big data with ioMemory and Cisco UCS and NOSQL
Accelerating big data with ioMemory and Cisco UCS and NOSQLAccelerating big data with ioMemory and Cisco UCS and NOSQL
Accelerating big data with ioMemory and Cisco UCS and NOSQL
 
Pro sphere customer technical
Pro sphere customer technicalPro sphere customer technical
Pro sphere customer technical
 
Accel Partners New Data Workshop 7-14-10
Accel Partners New Data Workshop 7-14-10Accel Partners New Data Workshop 7-14-10
Accel Partners New Data Workshop 7-14-10
 
Brocade: Storage Networking For the Virtual Enterprise
Brocade: Storage Networking For the Virtual Enterprise Brocade: Storage Networking For the Virtual Enterprise
Brocade: Storage Networking For the Virtual Enterprise
 
Implementing and Troubleshooting EdgeSight
Implementing and Troubleshooting EdgeSightImplementing and Troubleshooting EdgeSight
Implementing and Troubleshooting EdgeSight
 
VMware Solutions
VMware SolutionsVMware Solutions
VMware Solutions
 
Presentación Data Center Cablevisión Day 2010
Presentación Data Center Cablevisión Day 2010Presentación Data Center Cablevisión Day 2010
Presentación Data Center Cablevisión Day 2010
 
Ugif 04 2011 france ug04042011-jroy_part1
Ugif 04 2011   france ug04042011-jroy_part1Ugif 04 2011   france ug04042011-jroy_part1
Ugif 04 2011 france ug04042011-jroy_part1
 
MT49 Dell EMC XtremIO: Product Overview and New Use Cases
MT49 Dell EMC XtremIO: Product Overview and New Use CasesMT49 Dell EMC XtremIO: Product Overview and New Use Cases
MT49 Dell EMC XtremIO: Product Overview and New Use Cases
 

Mais de Insight Technology, Inc.

グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?Insight Technology, Inc.
 
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~Insight Technology, Inc.
 
事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明する事例を通じて機械学習とは何かを説明する
事例を通じて機械学習とは何かを説明するInsight Technology, Inc.
 
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーン
仮想通貨ウォレットアプリで理解するデータストアとしてのブロックチェーンInsight Technology, Inc.
 
MBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごとMBAAで覚えるDBREの大事なおしごと
MBAAで覚えるDBREの大事なおしごとInsight Technology, Inc.
 
グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?Insight Technology, Inc.
 
DBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォームDBREから始めるデータベースプラットフォーム
DBREから始めるデータベースプラットフォームInsight Technology, Inc.
 
SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門SQL Server エンジニアのためのコンテナ入門
SQL Server エンジニアのためのコンテナ入門Insight Technology, Inc.
 
db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉 db tech showcase2019オープニングセッション @ 森田 俊哉
db tech showcase2019オープニングセッション @ 森田 俊哉 Insight Technology, Inc.
 
db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也db tech showcase2019 オープニングセッション @ 石川 雅也
db tech showcase2019 オープニングセッション @ 石川 雅也Insight Technology, Inc.
 
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー
db tech showcase2019 オープニングセッション @ マイナー・アレン・パーカー Insight Technology, Inc.
 
難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?難しいアプリケーション移行、手軽に試してみませんか?
難しいアプリケーション移行、手軽に試してみませんか?Insight Technology, Inc.
 
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介Attunityのソリューションと異種データベース・クラウド移行事例のご紹介
Attunityのソリューションと異種データベース・クラウド移行事例のご紹介Insight Technology, Inc.
 
そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?そのデータベース、クラウドで使ってみませんか?
そのデータベース、クラウドで使ってみませんか?Insight Technology, Inc.
 
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...
コモディティサーバー3台で作る高速処理 “ハイパー・コンバージド・データベース・インフラストラクチャー(HCDI)” システム『Insight Qube』...Insight Technology, Inc.
 
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。 複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。
複数DBのバックアップ・切り戻し運用手順が異なって大変?!運用性の大幅改善、その先に。。 Insight Technology, Inc.
 
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...
Attunity社のソリューションの日本国内外適用事例及びロードマップ紹介[ATTUNITY & インサイトテクノロジー IoT / Big Data フ...Insight Technology, Inc.
 
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]
レガシーに埋もれたデータをリアルタイムでクラウドへ [ATTUNITY & インサイトテクノロジー IoT / Big Data フォーラム 2018]Insight Technology, Inc.
 

Mais de Insight Technology, Inc. (20)

グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?グラフデータベースは如何に自然言語を理解するか?
グラフデータベースは如何に自然言語を理解するか?
 
Docker and the Oracle Database
Docker and the Oracle DatabaseDocker and the Oracle Database
Docker and the Oracle Database
 
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~
Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~Great performance at scale~次期PostgreSQL12のパーティショニング性能の実力に迫る~



B17 Eliminating the database bottleneck

  • 1. Eliminating the Database Bottleneck: What makes Vectorwise so fast. Mark Van de Wiel, Director Product Management, Vectorwise. Thursday, November 01, 2012. Confidential © 2012 Actian Corporation
  • 2. Agenda: Why traditional RDBMSs are slow for analytics; Why Vectorwise is fast; The I/O challenge; Efficient updates.
  • 3. 100x (+) Performance Difference – 2003: Custom C versus Relational Database. TPC-H 1 GB query 1, runtime in seconds: MySQL 26.2; DBMS 'X' 28.1; C program 0.2; Vectorwise 0.6.
  • 4. Traditional Relational Database for Analytics – Inefficiencies: inefficient storage; inefficient processing.
  • 5. Inefficient Storage for Analytics. Row-based storage model: predominant in 2003, still very common today; works well for OLTP. Example rows: (101, Joe, 27, Black) and (103, Edward, 21, Scissorhand).
  • 6. Inefficient Storage – Row-based. Pages on disk, example: tuples (101, 27, Joe Black) and (103, 21, Edward Scissorhand), stored with var-width attribute pointers and pointers to tuples.
  • 7. Issues with Row-based Storage. Always read all attributes: poor bandwidth, poor use of memory buffer. Complex row structure and navigation, e.g. compressing out null fields, row chaining.
  • 8. Efficient Storage for Analytics. Columnar storage: store attributes separately. Retrieve only attributes required by the query. Used by “traditional” column stores, e.g. Sybase IQ, Vertica.
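The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration, not how any particular engine stores data: the column names (`id`, `first_name`, `age`, `last_name`) and the `values_touched` counter are assumptions added here, loosely based on the example rows from the earlier slide.

```python
# Illustrative only: rows and column names are assumptions based on the
# example rows in the slides, not an actual storage format.
rows = [
    (101, "Joe", 27, "Black"),
    (103, "Edward", 21, "Scissorhand"),
]

def ages_from_rows(rows):
    """Row store: fetching one attribute still touches every attribute."""
    values_touched = 0
    ages = []
    for row in rows:
        values_touched += len(row)  # the whole row is read
        ages.append(row[2])
    return ages, values_touched

# Column store: each attribute is kept in its own array.
columns = {
    "id": [101, 103],
    "first_name": ["Joe", "Edward"],
    "age": [27, 21],
    "last_name": ["Black", "Scissorhand"],
}

def ages_from_columns(columns):
    """Column store: only the column the query needs is read."""
    ages = columns["age"]
    return list(ages), len(ages)

row_ages, row_touched = ages_from_rows(rows)
col_ages, col_touched = ages_from_columns(columns)
```

Both functions return the same ages, but the row-based version touches every value of every row, while the columnar version touches only the `age` values.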
  • 9. Inefficient Processing: How a traditional database runs a query. Query: SELECT name, salary*.19 AS tax FROM employee WHERE age > 25
  • 10. Inefficient Processing: How a traditional database runs a query. Tuple-at-a-time iterator interface: open(), next(): tuple, close(). next() is called for each operator and for each tuple. Complex code repeated over and over.
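A minimal sketch of such a tuple-at-a-time plan for the example query, assuming a hypothetical three-row `employee` table. The class names (`Scan`, `Select`, `Project`) and the `calls` counters are illustrative and not taken from any real engine; the point is only to show that `next()` runs once per operator per tuple.

```python
class Scan:
    """Leaf operator: returns one row per next() call, None when exhausted."""
    def __init__(self, table):
        self.table = table
        self.calls = 0
    def open(self):
        self.pos = 0
    def next(self):
        self.calls += 1
        if self.pos >= len(self.table):
            return None
        row = self.table[self.pos]
        self.pos += 1
        return row
    def close(self):
        pass

class Select:
    """Filter operator: pulls rows from its child until one qualifies."""
    def __init__(self, child, predicate):
        self.child, self.predicate = child, predicate
        self.calls = 0
    def open(self):
        self.child.open()
    def next(self):
        self.calls += 1
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row
    def close(self):
        self.child.close()

class Project:
    """Projection operator: computes the output tuple for each input row."""
    def __init__(self, child, fn):
        self.child, self.fn = child, fn
        self.calls = 0
    def open(self):
        self.child.open()
    def next(self):
        self.calls += 1
        row = self.child.next()
        return None if row is None else self.fn(row)
    def close(self):
        self.child.close()

# Hypothetical data for: SELECT name, salary*.19 AS tax FROM employee WHERE age > 25
employee = [
    {"name": "Joe", "age": 27, "salary": 1000.0},
    {"name": "Edward", "age": 21, "salary": 2000.0},
    {"name": "Ann", "age": 40, "salary": 3000.0},
]
scan = Scan(employee)
select = Select(scan, lambda r: r["age"] > 25)
plan = Project(select, lambda r: (r["name"], r["salary"] * 0.19))

plan.open()
result = []
while True:
    row = plan.next()
    if row is None:
        break
    result.append(row)
plan.close()
total_next_calls = scan.calls + select.calls + plan.calls
```

Even on this three-row table the plan makes ten `next()` calls in total, and each call carries operator logic and attribute access in addition to the actual comparison and multiplication.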
  • 11. Inefficient Processing: How a traditional database runs a query. Data-specific computational functionality: called once for every operation on every tuple; worse for complex tuple representations.
  • 12. Inefficient Processing (Part 1 of 2). Lots of repeated, unnecessary code: operator logic, function calls, attribute access. Most instructions interpreting a query; very few instructions processing actual data! Many instructions per tuple.
  • 13. CPU Features – Inefficient Processing Part 2. In the last 20 years: chip cache, because RAM access is too slow and congested; branch-sensitive CPU pipelines; superscalar features; SIMD instructions (SSE and AVX). Great for multimedia processing, scientific computing… but NOT for traditional relational databases: complex code (function calls, branches); poor use of CPU cache (both data and instructions); processing one value at a time.
  • 14. Inefficient Processing. Traditional RDBMS: many instructions per tuple and many cycles per instruction, hence very many cycles per tuple.
  • 15. Vectorwise – Vector-based Processing. Query: SELECT name, salary*.19 AS tax FROM employee WHERE age > 25
  • 16. Vectorwise – Vector-based Processing. Vector contains data of multiple tuples (1024). All operations consume and produce entire vectors. Effect: far fewer operator.next() and primitive calls, and pipelined query evaluation.
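The control flow of vector-at-a-time execution can be sketched as follows. This is a simplified illustration, not Vectorwise's implementation: the primitive names (`select_gt`, `map_mul`, `gather`), the selection-vector convention, and the generated data are all assumptions. Plain Python lists will not show the speed-up; the sketch only shows that each primitive is called once per vector rather than once per tuple.

```python
VECTOR_SIZE = 1024  # vector length quoted in the slides

def select_gt(values, threshold):
    """Primitive: positions within the vector where value > threshold."""
    return [i for i, v in enumerate(values) if v > threshold]

def map_mul(values, factor, sel):
    """Primitive: multiply the selected values by a constant."""
    return [values[i] * factor for i in sel]

def gather(values, sel):
    """Primitive: copy out the selected values."""
    return [values[i] for i in sel]

def run_query(names, ages, salaries):
    """SELECT name, salary*.19 AS tax FROM employee WHERE age > 25."""
    out_names, out_tax = [], []
    primitive_calls = 0
    for start in range(0, len(ages), VECTOR_SIZE):
        end = start + VECTOR_SIZE
        sel = select_gt(ages[start:end], 25)
        out_names.extend(gather(names[start:end], sel))
        out_tax.extend(map_mul(salaries[start:end], 0.19, sel))
        primitive_calls += 3  # three primitive calls per vector, not per tuple
    return out_names, out_tax, primitive_calls

# Hypothetical data: 3000 employees with ages cycling through 20..29.
n = 3000
names = [f"emp{i}" for i in range(n)]
ages = [20 + (i % 10) for i in range(n)]
salaries = [1000.0] * n

result_names, result_tax, calls = run_query(names, ages, salaries)
```

For 3000 tuples the loop runs three times (two full vectors and one partial one), so nine primitive calls replace thousands of per-tuple calls. Each primitive body is a tight loop over an array, which is the shape that compilers and CPUs handle well.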
  • 17. Why is Vectorwise so Fast? Reduced interpretation overhead: 100+ times fewer function calls. Good CPU cache use: high locality in primitives, cache-conscious algorithms. No tuple navigation: primitives only see arrays. Vectorization allows algorithmic optimization. CPU and compiler-friendly function bodies: multiple work units, loop-pipelining, SIMD… Bonus: parallel query.
  • 18. Some Numbers. Traditional RDBMS: <200 MB/s per core. Vectorwise (lab environment): >1.5 GB/s per core.
  • 19. Addressing the I/O Challenge. Columnar storage. Smart column buffer (memory). Data compression: on disk, less I/O; in memory, best use of column buffer; ultra-efficient decompression algorithms to get sufficient throughput. Large contiguous data blocks for optimum disk I/O. In-memory min-max indexes per block (i.e. per column): eliminate data blocks based on implicit/explicit filter criteria.
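Block elimination with a min-max index can be sketched like this. The block size, the dictionary layout and the function names are assumptions chosen for illustration. Real blocks would be large, contiguous and compressed, and the min/max values would be held in memory separately from the data.

```python
BLOCK_SIZE = 4  # tiny for illustration; real blocks are large and contiguous

def build_blocks(values, block_size=BLOCK_SIZE):
    """Split a column into blocks and record each block's min and max."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return blocks

def scan_greater_than(blocks, threshold):
    """Return values > threshold, skipping blocks that cannot contain any."""
    hits = []
    blocks_read = 0
    for block in blocks:
        if block["max"] <= threshold:
            continue  # eliminated using the index alone, no data is read
        blocks_read += 1
        hits.extend(v for v in block["values"] if v > threshold)
    return hits, blocks_read

# Hypothetical age column, filtered with age > 25.
ages = [21, 22, 23, 24, 25, 26, 27, 28, 40, 41, 42, 43]
blocks = build_blocks(ages)
hits, blocks_read = scan_greater_than(blocks, 25)
```

Here the first block has a maximum of 24, so it is skipped and only two of the three blocks are read. How many blocks can be eliminated depends on how well the data is clustered on the filtered column.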
  • 20. Efficient Updates in a Column Store. Positional Delta Trees (PDTs): in-memory representation of small data changes; efficiently merged with on-disk data; periodically propagated to disk; provide snapshot read consistency. ACID compliant.
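The idea of merging positional deltas into a scan can be sketched as below. This is not the PDT data structure itself: a flat list sorted by position stands in for the tree, and the `(position, op, value)` encoding with `"ins"`, `"del"` and `"mod"` operations is an assumption made for illustration. It ignores transactions, snapshots and propagation to disk.

```python
def merge_scan(stable, deltas):
    """Yield the current column image: stable values with deltas applied.

    deltas is a list of (position, op, value) sorted by position, where
    position is an index into the stable (on-disk) column and op is:
      "ins": insert value before that stable position
      "del": drop the stable value at that position
      "mod": replace the stable value at that position
    A position equal to len(stable) appends at the end.
    """
    d = 0
    for pos in range(len(stable) + 1):
        skip = False
        value = stable[pos] if pos < len(stable) else None
        while d < len(deltas) and deltas[d][0] == pos:
            _, op, new_value = deltas[d]
            if op == "ins":
                yield new_value
            elif op == "del":
                skip = True
            elif op == "mod":
                value = new_value
            d += 1
        if pos < len(stable) and not skip:
            yield value

# Hypothetical stable column and in-memory changes.
stable = [10, 20, 30, 40]
deltas = [(1, "ins", 15), (2, "del", None), (3, "mod", 45), (4, "ins", 50)]
current = list(merge_scan(stable, deltas))
```

The scan sees the updated column while the stable data stays untouched. Because deltas are addressed by position, the merge needs no comparison on key values.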
  • 21. Agenda: Why traditional RDBMSs are slow for analytics; Why Vectorwise is fast; The I/O challenge; Efficient updates.