SlideShare uma empresa Scribd logo
1 de 1
Reducing Cache Misses in Hash Join 
                                 Probing Phase By Pre‐sorting Strategy
                       Gi-Hwan Oh, Jae-Myung Kim, Woon-Hak Kang, and Sang-Won Lee


Background and Motivation
• Evolution of In‐memory Hash Join, |R| < |S| 
 ‐ Partitioning (Shatdal et al. VLDB`94)                                          ‐ Multi‐core Radix‐clustering (Kim et al. VLDB`09)
   ‐ |R|>> cache size                                                               ‐ Hash join  on multicore
   =>  |R.partition| < cache size                                                 ‐ No partitioning (revisited) (Blanas et al. SIGMOD`11)
 ‐ Radix‐clustering (Manegold et al. VLDB`00)                                       ‐ Multicore and skewed data
   ‐ # partitions >> TLB entries                                                    ‐ Cons : High cache miss ( 99% in probe phase )
   =>  Multi‐pass partitioning

Proposed Scheme: Pre-sorting
 Cache Miss in Probe Phase                                                                            S                         Hash
                                                                                                                                 Table       R
  ‐ Reference pattern on hash table is random




                                                                                                              In‐Cache
    ‐ Cause cache misses




                                                                                                                Sort
 Strategy: Sorting ‘S’ Relation by Join Attribute                                                                       Probe
  ‐ Change reference pattern of a hash table of relation ‘R’                                                                         Build
    ‐ Global random ‐> clustering in local scope (better temporal                                                                    Hash
                                                                                                                                     Table
      and spatial locality in cache access)
  ‐ Sorting all records of ‘S’ is unrealistic
    ‐ In‐cache sorting: sorting buffer size = the largest private cache                                 Benefits of This Strategy
  ‐ Maximize the number of records to sort                                                              ‐ Reduces cache misses
    ‐ Extract (key,rid) to reduce record size (like alpha sort)                                         ‐ Can be applied on any hash 
 Reduction cache miss >> Sorting overhead                                                                join algorithms
                                     Cycles                     Cache Misses
                Normal Data               183,850,677,579               105,341,310
        Fully Sorted Data                   27,370,971,180                 389,938




Performance Evaluation
                Cycles          Execution Time                                         Environmental Setting
                200                                                                    ‐ HW: Intel Core‐i7 860, 2.8GHz  (private 256K, 
     Billions




                                                                                         shared 8M), 12GB RAM
                150
                                                                                       ‐ OS: Linux  2.6.32‐220.7 (CentOS)
                100
                                                                                       ‐ Dataset: |R| = 16M, |S| = 256M, hash 
                                                                                         entries=1M, Schema R & S = (long, long), 
                50
                                                                                         Uniform distribution

                 0
                           IP       IP+SP           NP          NP+SP
                                                                                       Evaluation Result
                          PART    BUILD     PROBE        SORT                          ‐ Outperforms all other algorithms
                      • IP=Independent Partitioning                                    ‐ IP+SP is 30% faster than IP
                      • NP=No Partitioning                                             ‐ NP+SP is 30% faster than NP
                      • SP=Sorting and Probing

Mais conteúdo relacionado

Destaque

Data Step Hash Object vs SQL Join
Data Step Hash Object vs SQL JoinData Step Hash Object vs SQL Join
Data Step Hash Object vs SQL JoinGeoff Ness
 
Understanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ OracleUnderstanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ OracleGuatemala User Group
 
Oracle Database In-Memory Option for ILOUG
Oracle Database In-Memory Option for ILOUGOracle Database In-Memory Option for ILOUG
Oracle Database In-Memory Option for ILOUGZohar Elkayam
 
Microsoft SQL Server Physical Join Operators
Microsoft SQL Server Physical Join OperatorsMicrosoft SQL Server Physical Join Operators
Microsoft SQL Server Physical Join OperatorsMark Ginnebaugh
 
Oracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionOracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionTanel Poder
 
Unit iv(dsc++)
Unit iv(dsc++)Unit iv(dsc++)
Unit iv(dsc++)Durga Devi
 
13. Query Processing in DBMS
13. Query Processing in DBMS13. Query Processing in DBMS
13. Query Processing in DBMSkoolkampus
 

Destaque (10)

ch13
ch13ch13
ch13
 
Data Step Hash Object vs SQL Join
Data Step Hash Object vs SQL JoinData Step Hash Object vs SQL Join
Data Step Hash Object vs SQL Join
 
Understanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ OracleUnderstanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
 
Hash join
Hash joinHash join
Hash join
 
Join operation
Join operationJoin operation
Join operation
 
Oracle Database In-Memory Option for ILOUG
Oracle Database In-Memory Option for ILOUGOracle Database In-Memory Option for ILOUG
Oracle Database In-Memory Option for ILOUG
 
Microsoft SQL Server Physical Join Operators
Microsoft SQL Server Physical Join OperatorsMicrosoft SQL Server Physical Join Operators
Microsoft SQL Server Physical Join Operators
 
Oracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionOracle Database In-Memory Option in Action
Oracle Database In-Memory Option in Action
 
Unit iv(dsc++)
Unit iv(dsc++)Unit iv(dsc++)
Unit iv(dsc++)
 
13. Query Processing in DBMS
13. Query Processing in DBMS13. Query Processing in DBMS
13. Query Processing in DBMS
 

Semelhante a Reducing Cache Misses in Hash Join Probing Phase by Pre-sorting Strategy

Tajo: A Distributed Data Warehouse System for Hadoop
Tajo: A Distributed Data Warehouse System for HadoopTajo: A Distributed Data Warehouse System for Hadoop
Tajo: A Distributed Data Warehouse System for HadoopHyunsik Choi
 
Apache Hadoop YARN 3.x in Alibaba
Apache Hadoop YARN 3.x in AlibabaApache Hadoop YARN 3.x in Alibaba
Apache Hadoop YARN 3.x in AlibabaDataWorks Summit
 
No sql solutions - 공개용
No sql solutions - 공개용No sql solutions - 공개용
No sql solutions - 공개용Byeongweon Moon
 
Apache Drill talk ApacheCon 2018
Apache Drill talk ApacheCon 2018Apache Drill talk ApacheCon 2018
Apache Drill talk ApacheCon 2018Aman Sinha
 
DSL - Domain Specific Languages, Chapter 4, Internal DSL
DSL - Domain Specific Languages,  Chapter 4, Internal DSLDSL - Domain Specific Languages,  Chapter 4, Internal DSL
DSL - Domain Specific Languages, Chapter 4, Internal DSLHiro Yoshioka
 
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...buildacloud
 
Architecting the Future of Big Data & Search - Eric Baldeschwieler
Architecting the Future of Big Data & Search - Eric BaldeschwielerArchitecting the Future of Big Data & Search - Eric Baldeschwieler
Architecting the Future of Big Data & Search - Eric Baldeschwielerlucenerevolution
 
An introduction to apache drill presentation
An introduction to apache drill presentationAn introduction to apache drill presentation
An introduction to apache drill presentationMapR Technologies
 
Facebook keynote-nicolas-qcon
Facebook keynote-nicolas-qconFacebook keynote-nicolas-qcon
Facebook keynote-nicolas-qconYiwei Ma
 
支撑Facebook消息处理的h base存储系统
支撑Facebook消息处理的h base存储系统支撑Facebook消息处理的h base存储系统
支撑Facebook消息处理的h base存储系统yongboy
 
Facebook Messages & HBase
Facebook Messages & HBaseFacebook Messages & HBase
Facebook Messages & HBase强 王
 
DFX Architecture for High-performance Multi-core Microprocessors
DFX Architecture for High-performance Multi-core MicroprocessorsDFX Architecture for High-performance Multi-core Microprocessors
DFX Architecture for High-performance Multi-core MicroprocessorsIshwar Parulkar
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture EMC
 
Philly DB MapR M7 - March 2013
Philly DB MapR M7 - March 2013Philly DB MapR M7 - March 2013
Philly DB MapR M7 - March 2013MapR Technologies
 
PhillyDB Hbase and MapR M7 - March 2013
PhillyDB Hbase and MapR M7 - March 2013PhillyDB Hbase and MapR M7 - March 2013
PhillyDB Hbase and MapR M7 - March 2013PhillyDB
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for SysadminsNathan Milford
 
Microsoft R - Data Science at Scale
Microsoft R - Data Science at ScaleMicrosoft R - Data Science at Scale
Microsoft R - Data Science at ScaleSascha Dittmann
 
HBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance EvaluationHBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance EvaluationSchubert Zhang
 

Semelhante a Reducing Cache Misses in Hash Join Probing Phase by Pre-sorting Strategy (20)

Tajo: A Distributed Data Warehouse System for Hadoop
Tajo: A Distributed Data Warehouse System for HadoopTajo: A Distributed Data Warehouse System for Hadoop
Tajo: A Distributed Data Warehouse System for Hadoop
 
Apache Hadoop YARN 3.x in Alibaba
Apache Hadoop YARN 3.x in AlibabaApache Hadoop YARN 3.x in Alibaba
Apache Hadoop YARN 3.x in Alibaba
 
No sql solutions - 공개용
No sql solutions - 공개용No sql solutions - 공개용
No sql solutions - 공개용
 
Apache Drill talk ApacheCon 2018
Apache Drill talk ApacheCon 2018Apache Drill talk ApacheCon 2018
Apache Drill talk ApacheCon 2018
 
Hadoop Overview kdd2011
Hadoop Overview kdd2011Hadoop Overview kdd2011
Hadoop Overview kdd2011
 
DSL - Domain Specific Languages, Chapter 4, Internal DSL
DSL - Domain Specific Languages,  Chapter 4, Internal DSLDSL - Domain Specific Languages,  Chapter 4, Internal DSL
DSL - Domain Specific Languages, Chapter 4, Internal DSL
 
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
 
Architecting the Future of Big Data & Search - Eric Baldeschwieler
Architecting the Future of Big Data & Search - Eric BaldeschwielerArchitecting the Future of Big Data & Search - Eric Baldeschwieler
Architecting the Future of Big Data & Search - Eric Baldeschwieler
 
Gpu Join Presentation
Gpu Join PresentationGpu Join Presentation
Gpu Join Presentation
 
An introduction to apache drill presentation
An introduction to apache drill presentationAn introduction to apache drill presentation
An introduction to apache drill presentation
 
Facebook keynote-nicolas-qcon
Facebook keynote-nicolas-qconFacebook keynote-nicolas-qcon
Facebook keynote-nicolas-qcon
 
支撑Facebook消息处理的h base存储系统
支撑Facebook消息处理的h base存储系统支撑Facebook消息处理的h base存储系统
支撑Facebook消息处理的h base存储系统
 
Facebook Messages & HBase
Facebook Messages & HBaseFacebook Messages & HBase
Facebook Messages & HBase
 
DFX Architecture for High-performance Multi-core Microprocessors
DFX Architecture for High-performance Multi-core MicroprocessorsDFX Architecture for High-performance Multi-core Microprocessors
DFX Architecture for High-performance Multi-core Microprocessors
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 
Philly DB MapR M7 - March 2013
Philly DB MapR M7 - March 2013Philly DB MapR M7 - March 2013
Philly DB MapR M7 - March 2013
 
PhillyDB Hbase and MapR M7 - March 2013
PhillyDB Hbase and MapR M7 - March 2013PhillyDB Hbase and MapR M7 - March 2013
PhillyDB Hbase and MapR M7 - March 2013
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for Sysadmins
 
Microsoft R - Data Science at Scale
Microsoft R - Data Science at ScaleMicrosoft R - Data Science at Scale
Microsoft R - Data Science at Scale
 
HBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance EvaluationHBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance Evaluation
 

Mais de Jaemyung Kim

Write Amplification: An Analysis of In-Memory Database Durability Techniques
Write Amplification: An Analysis of In-Memory Database Durability TechniquesWrite Amplification: An Analysis of In-Memory Database Durability Techniques
Write Amplification: An Analysis of In-Memory Database Durability TechniquesJaemyung Kim
 
Database High Availability Using SHADOW Systems
Database High Availability Using SHADOW SystemsDatabase High Availability Using SHADOW Systems
Database High Availability Using SHADOW SystemsJaemyung Kim
 
Altibase DSM: CTable for Pull-based Processing in SPE
Altibase DSM: CTable for Pull-based Processing in SPEAltibase DSM: CTable for Pull-based Processing in SPE
Altibase DSM: CTable for Pull-based Processing in SPEJaemyung Kim
 
Implementation of Bitmap based Incognito and Performance Evaluation
Implementation of Bitmap based Incognito and Performance EvaluationImplementation of Bitmap based Incognito and Performance Evaluation
Implementation of Bitmap based Incognito and Performance EvaluationJaemyung Kim
 
IPL 기법의 인덱스 연산 분석
IPL 기법의 인덱스 연산 분석IPL 기법의 인덱스 연산 분석
IPL 기법의 인덱스 연산 분석Jaemyung Kim
 
정보과학회 FTL논문 아이디어
정보과학회 FTL논문 아이디어정보과학회 FTL논문 아이디어
정보과학회 FTL논문 아이디어Jaemyung Kim
 

Mais de Jaemyung Kim (6)

Write Amplification: An Analysis of In-Memory Database Durability Techniques
Write Amplification: An Analysis of In-Memory Database Durability TechniquesWrite Amplification: An Analysis of In-Memory Database Durability Techniques
Write Amplification: An Analysis of In-Memory Database Durability Techniques
 
Database High Availability Using SHADOW Systems
Database High Availability Using SHADOW SystemsDatabase High Availability Using SHADOW Systems
Database High Availability Using SHADOW Systems
 
Altibase DSM: CTable for Pull-based Processing in SPE
Altibase DSM: CTable for Pull-based Processing in SPEAltibase DSM: CTable for Pull-based Processing in SPE
Altibase DSM: CTable for Pull-based Processing in SPE
 
Implementation of Bitmap based Incognito and Performance Evaluation
Implementation of Bitmap based Incognito and Performance EvaluationImplementation of Bitmap based Incognito and Performance Evaluation
Implementation of Bitmap based Incognito and Performance Evaluation
 
IPL 기법의 인덱스 연산 분석
IPL 기법의 인덱스 연산 분석IPL 기법의 인덱스 연산 분석
IPL 기법의 인덱스 연산 분석
 
정보과학회 FTL논문 아이디어
정보과학회 FTL논문 아이디어정보과학회 FTL논문 아이디어
정보과학회 FTL논문 아이디어
 

Reducing Cache Misses in Hash Join Probing Phase by Pre-sorting Strategy

  • 1. Reducing Cache Misses in Hash Join  Probing Phase By Pre‐sorting Strategy Gi-Hwan Oh, Jae-Myung Kim, Woon-Hak Kang, and Sang-Won Lee Background and Motivation • Evolution of In‐memory Hash Join, |R| < |S|  ‐ Partitioning (Shatdal et al. VLDB`94) ‐ Multi‐core Radix‐clustering (Kim et al. VLDB`09) ‐ |R|>> cache size ‐ Hash join  on multicore =>  |R.partition| < cache size ‐ No partitioning (revisited) (Blanas et al. SIGMOD`11) ‐ Radix‐clustering (Manegold et al. VLDB`00) ‐ Multicore and skewed data ‐ # partitions >> TLB entries ‐ Cons : High cache miss ( 99% in probe phase ) =>  Multi‐pass partitioning Proposed Scheme: Pre-sorting  Cache Miss in Probe Phase S Hash Table R ‐ Reference pattern on hash table is random In‐Cache ‐ Cause cache misses Sort  Strategy: Sorting ‘S’ Relation by Join Attribute Probe ‐ Change reference pattern of a hash table of relation ‘R’ Build ‐ Global random ‐> clustering in local scope (better temporal  Hash Table and spatial locality in cache access) ‐ Sorting all records of ‘S’ is unrealistic ‐ In‐cache sorting: sorting buffer size = the largest private cache  Benefits of This Strategy ‐ Maximize the number of records to sort ‐ Reduces cache misses ‐ Extract (key,rid) to reduce record size (like alpha sort) ‐ Can be applied on any hash   Reduction cache miss >> Sorting overhead join algorithms Cycles Cache Misses Normal Data 183,850,677,579 105,341,310 Fully Sorted Data 27,370,971,180 389,938 Performance Evaluation Cycles Execution Time  Environmental Setting 200 ‐ HW: Intel Core‐i7 860, 2.8GHz  (private 256K,  Billions shared 8M), 12GB RAM 150 ‐ OS: Linux  2.6.32‐220.7 (CentOS) 100 ‐ Dataset: |R| = 16M, |S| = 256M, hash  entries=1M, Schema R & S = (long, long),  50 Uniform distribution 0 IP IP+SP NP NP+SP  Evaluation Result PART BUILD PROBE SORT ‐ Outperforms all other algorithms • IP=Independent Partitioning ‐ IP+SP is 30% faster than IP • NP=No Partitioning ‐ NP+SP is 30% faster than NP • SP=Sorting and Probing