2. SQL-on-Hadoop Solutions
Hive (2008)
Developed by Facebook
– Hive is used for data analysis in their data warehouse
– The DWH is currently ~300 PB in size, with ~600 TB of data loaded daily. Data
is compressed using ORCFile at a ratio of ~8x
HiveQL is not compatible with ANSI SQL-92
Has many limitations on subqueries
The cost-based optimizer (Optiq) is currently only in technical preview
Pivotal Confidential–Internal Use Only 2
3. SQL-on-Hadoop Solutions
Impala (10.2012)
Developed by Cloudera
– Open-source solution
– Cloudera sells this solution to enterprise shops
– Was in beta until May 2013
Supports HiveQL; moving toward complete ANSI SQL-92 support
Written in C++; does not use MapReduce for running queries
Requires a lot of memory; joining big tables usually causes OOM errors
4. SQL-on-Hadoop Solutions
Stinger (02.2013)
A Hortonworks initiative
– Consists of a number of steps to make Hive run 100x faster
Tez – translates Hive queries into Tez jobs, which are similar to
MapReduce jobs but may have an arbitrary topology
Optiq – cost-based query optimizer for Hive (technical preview at the moment)
ORCFile – columnar storage format with adaptive compression and
inline indexes
HIVE-5317 – ACID and UPDATE/DELETE support (release around 11.2014)
5. SQL-on-Hadoop Solutions
HAWQ (02.2013)
A Pivotal product
– The Greenplum MPP DBMS, ported to store its data in HDFS
– Written in C; the query optimizer was rewritten for this solution (ORCA)
Supports ANSI SQL-92 and the analytic extensions from SQL-2003
Supports complex queries with correlated subqueries, window functions,
and all join types
Data is spilled to disk only when a process does not have enough memory
6. SQL-on-Hadoop Solutions
Vertica and BigSQL (2014)
HP Vertica
– Supports only the MapR distribution, as it requires updatable storage
– Supports ANSI SQL-92 and SQL-2003
– Supports UPDATE/DELETE
– Officially announced as available in July 2014; no implementations yet
IBM BigSQL v3
– IBM DB2, ported to store data in HDFS
– Federated queries, a good query optimizer, etc.
Both solutions are similar to Pivotal HAWQ in their general idea
7. Pivotal HAWQ Components
[Cluster layout diagram: a Master on Server 1 and a Standby Master on Server 2;
Servers 3, 4, …, M each host K segments (Segment 1…K, Segment K+1…2*K, …,
Segment N).]
8. Pivotal HAWQ Components
[Deployment diagram: HAWQ Master on Server 1, HAWQ Standby Master on Server 2,
NameNode on Server 3, Secondary NameNode on Server 4; ZooKeeper and QJM
instances are spread across the first servers; Servers 5, 6, …, M each run a
Datanode together with a HAWQ Segment.]
9. Pivotal HAWQ Components
[Diagram: the HAWQ Master and the HAWQ Standby Master each contain a Query
Parser, Query Optimizer, Query Executor, Transaction Manager, Metadata Catalog,
and Process Manager; the metadata catalog is kept in sync from master to
standby via WAL replication.]
10. Pivotal HAWQ Components
Metadata is stored only on the master servers
Metadata is stored in a modified Postgres instance and replicated
to the standby master via WAL
Metadata contains
– Table information – schema, names, files
– Statistics – number of unique values, value ranges, sample values, etc.
– Information about users, groups, priorities, etc.
A master server shutdown causes a switch to the standby, with
the loss of running sessions
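The statistics in the catalog are the usual inputs to a cost-based optimizer. A toy sketch of the kind of per-column numbers involved (the function and field names here are invented, not HAWQ's actual catalog schema):

```python
import random

def column_stats(values, sample_size=3):
    """Compute toy optimizer statistics for one column of data."""
    distinct = sorted(set(values))
    return {
        "ndv": len(distinct),     # number of distinct values
        "min": min(values),       # lower bound of the value range
        "max": max(values),       # upper bound of the value range
        "sample": random.sample(distinct, min(sample_size, len(distinct))),
    }

stats = column_stats([10, 20, 20, 30, 40])
print(stats["ndv"], stats["min"], stats["max"])  # 4 10 40
```

The optimizer would use NDV and value ranges like these to estimate join and filter selectivity.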
11. Pivotal HAWQ Components
[Diagram: a HAWQ Segment contains a Query Executor, libhdfs3, and PXF, and runs
next to an HDFS Datanode; the segment data directory lives in HDFS, while the
spill data directory sits on the local filesystem (xfs).]
12. Pivotal HAWQ Components
Both masters and segments are modified Postgres
instances (to be precise, modified Greenplum instances)
Opening a connection to the master server forks a
postmaster process that handles your session
When query execution starts, the session connects to the segment
instances, and each of them also forks a process to execute the query
The query execution plan is split into independent blocks
(slices); each slice is executed as a separate OS process
on the segment server, and data moves between slices over UDP
13. Pivotal HAWQ Components
Tables can be stored as:
– Row-oriented (quicklz, zlib compression)
– Column-oriented (quicklz, zlib, RLE compression)
– Parquet tables
Each segment has a separate directory on HDFS where it
stores its data shard
With columnar storage, each column is represented as a
separate file
Parquet also stores the table by columns, but does not
load the NameNode with many files and block-location requests
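The file-per-column layout is easy to picture with plain files. A toy illustration (not the actual HAWQ on-disk format): each column lives in its own file, so a query touching only one column reads only one file.

```python
import os
import tempfile

# A tiny two-column table.
table = {"beer": ["IPA", "Lager"], "price": ["5.0", "4.0"]}

# Write each column to its own file, as column-oriented storage does.
d = tempfile.mkdtemp()
for col, values in table.items():
    with open(os.path.join(d, col + ".col"), "w") as f:
        f.write("\n".join(values))

# A query that needs only `price` opens a single file.
with open(os.path.join(d, "price.col")) as f:
    prices = f.read().splitlines()
print(prices)  # ['5.0', '4.0']
```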
14–34. Query Execution in Pivotal HAWQ
[Animated diagram, built up over slides 14–34: the HAWQ Master (Parser, Query
Optimizer, Metadata, Transaction Manager, Process Manager, Query Executor)
consults the NameNode and dispatches the plan to the segments; each HAWQ
Segment runs a Backend whose query executor (QE) spawns one process per plan
slice (S1, S2, S3), next to its HDFS Datanode, segment directory, and local
spill directory. The example plan: MotionGather ← Project(s.beer, s.price) ←
HashJoin(b.name = s.bar), whose inputs are Scan Sells (s) and
MotionRedist(b.name) ← Filter(b.city = 'San Francisco') ← Scan Bars (b).]
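The example plan corresponds to a query of roughly this shape; the table and column names (Bars, Sells, beer, price, bar, name, city) come from the plan nodes, while the sample rows below are invented. The semantics can be checked locally with sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Bars  (name TEXT, city TEXT);
    CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL);
    INSERT INTO Bars  VALUES ('Toad', 'San Francisco'), ('Kona', 'Oakland');
    INSERT INTO Sells VALUES ('Toad', 'IPA', 5.0), ('Kona', 'Lager', 4.0);
""")
# The query behind the plan: join Sells to Bars, keep San Francisco bars.
rows = conn.execute("""
    SELECT s.beer, s.price
    FROM Sells s
    JOIN Bars b ON b.name = s.bar
    WHERE b.city = 'San Francisco'
""").fetchall()
print(rows)  # [('IPA', 5.0)]
```

In HAWQ, the Filter and Scans run as one slice per segment, the redistributed Bars rows feed the HashJoin slice, and MotionGather collects the final rows on the master.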
35. PXF Framework
Gives you the ability to read different data formats from HDFS
– Text files, both compressed and uncompressed
– Sequence files
– Avro files
Able to read data from external data sources
– HBase
– Cassandra
– Redis
Extensible API
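The extensible API is built around a Fragmenter (tells the master how a source splits into fragments) and an Accessor (reads one fragment's records on a segment). PXF itself is written in Java; the Python sketch below, including all class and method names, is an invented illustration of that split:

```python
class DemoFragmenter:
    """Toy stand-in for a PXF fragmenter: enumerates source fragments."""

    def __init__(self, source, n_fragments=3):
        self.source = source
        self.n_fragments = n_fragments

    def get_fragments(self):
        # e.g. one fragment per HDFS block or per HBase region
        return [{"source": self.source, "index": i}
                for i in range(self.n_fragments)]


class DemoAccessor:
    """Toy stand-in for a PXF accessor: reads one fragment's records."""

    def read_fragment(self, fragment):
        # A real accessor would open HDFS / HBase / Redis here.
        return [f"record-{fragment['index']}-{n}" for n in range(2)]


fragments = DemoFragmenter("hbase://demo").get_fragments()
records = [r for frag in fragments
           for r in DemoAccessor().read_fragment(frag)]
print(len(fragments), len(records))  # 3 6
```

The point of the split is that fragment enumeration happens once on the master, while fragment reads are spread across the segments.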
36–43. PXF Framework
[Animated diagram, built up over slides 36–43: the HAWQ Master runs a PXF
Fragmenter next to its Process Manager and consults the NameNode; each HAWQ
Segment runs a Query Executor with a PXF Accessor and a PXF Fragmenter next to
its HDFS Datanode, segment directory, and local spill directory.]
44. Further Steps
Master server scaling – a pool of master servers
New native data storage formats and new native
compression algorithms
YARN as the resource manager for HAWQ
Dynamic segment allocation / decommission