Introducing Apache Kudu (incubating)

© Cloudera, Inc. All rights reserved.
Introducing Apache Kudu
(incubating)
Mladen Kovacevic | Solutions Architect
May 2016
Adapted from Jeremy Beard’s original presentation in New York, Dec 2015

Presenter
• Mladen Kovacevic
• Solutions Architect at Cloudera (Toronto based)
• 2+ yrs in Big Data
• 3+ yrs building workload optimized systems, exploiting hardware features to
improve software performance and capability (compute, storage, networking)
• 6+ yrs engine development enterprise RDBMS
mladen@cloudera.com

Current storage landscape in Hadoop
HDFS (GFS) excels at:
• Scanning large amounts of data
• Accumulating data with high
throughput
HBase (BigTable) excels at:
• Efficiently finding and writing
individual rows
• Making data mutable
Gaps exist when these properties
are needed simultaneously

Managing the gap (today)
Code Complexity
• Manage flow and sync of data between HDFS and HBase
Monitoring and Security
• Managing consistent backups, security policies, monitoring and more is hard
Performance
• Significant lag between data arriving in the HBase “staging” area and the time
it becomes available for analytics.

Hardware landscape is changing
rethink design!
• Spinning disk (HDD) -> solid state storage (SSD)
• NAND flash: Up to 450k read 250k write IOPS, about 2GB/sec read and 1.5GB/sec
write throughput, at a price of less than $3/GB and dropping
• 3D XPoint memory – persistent storage (1000x faster than NAND, cheaper than RAM)
• RAM is cheaper and more abundant:
• 64->128->256GB over last few years
• Takeaway 1: The next bottleneck is CPU, and current storage-heavy applications weren’t
designed with CPU efficiency in mind
• Takeaway 2: Column stores are feasible for random access

Apache Kudu (Incubating)
Storage for Fast Analytics on Fast Data
• New updatable column store for Hadoop
• Simplifies the architecture for building analytic
applications on changing data
• Designed for fast analytic performance
• Natively integrated with Hadoop
• Donated as incubating project at
Apache Software Foundation (November
17, 2015)
• Beta v0.8.0 now available
Diagram (Hadoop stack):
• INTEGRATE: STRUCTURED (Sqoop), UNSTRUCTURED (Kafka, Flume)
• PROCESS, ANALYZE, SERVE: BATCH (Spark, Hive, Pig, MapReduce), STREAM (Spark), SQL (Impala), SEARCH (Solr), SDK (Kite)
• UNIFIED SERVICES: RESOURCE MANAGEMENT (YARN), SECURITY (Sentry, RecordService)
• STORE: FILESYSTEM (HDFS), RELATIONAL (Kudu), NoSQL (HBase)

Kudu design goals
• High throughput for big scans (columnar storage and
replication)
Goal: Within 2x of Parquet
• Low-latency for short accesses (primary key indexes
and quorum design)
Goal: 1ms read/write on SSD
• Database-like semantics (initially single-row ACID)
• Relational data model (tables/schemas)
• Easily integrate with SQL engines (Impala, Drill, Hive)
• “NoSQL” style scan/insert/update (Java, C++, Python
clients)

What Kudu is not
• Not a SQL interface
• Just the storage layer
• “BYOSQL” – Bring-your-own SQL
• Not a file system
• Data must have tabular structure
• Not an application that runs on HDFS
• An alternative, native Hadoop storage engine
• Not a replacement for HDFS or HBase
• Select the right storage for the right use case
• Cloudera will continue to support and invest in all three

What Kudu is
STORAGE SYSTEM
for
TABLES
of
STRUCTURED DATA
No SQL engine, nor query planner, nor
optimizer.
Storage system that lets you get at
your data quickly (random/scan).
Highly scalable, with the ability to give
higher-level systems information they can
exploit (e.g. locality).
Data logically presented to clients as
rows and finite set of columns
Stored in columnar format
Tables are partitioned across tablets
managed by tablet servers.
Tablets are replicated, with replication
managed by Raft consensus.
Strongly typed columns
Finite number of columns
Add/remove columns solely by ALTER
TABLE statements

About Kudu
• Apache-licensed open source software
• Structured data model
• Basic construct: tables
• Tables broken down into tablets (roughly equivalent to partitions)
• Architecture supports geographically disparate, active/active systems
• Not the initial design goal

Kudu data model
• Tables have a RDBMS-like schema
• Finite number of columns (unlike HBase/Cassandra)
• Types: BOOL, INT8/16/32/64, FLOAT, DOUBLE, STRING, BINARY, TIMESTAMP
• Some subset of columns make up a primary key
• Fast random reads/writes by primary key
• No secondary indexes (yet)
• Columnar layout on disk – Parquet-like
• Lazy materialization
• Encoding and compression options
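
A minimal sketch of what a columnar layout with lazy materialization means in practice. The column names and data are hypothetical, and plain Python lists stand in for Kudu's on-disk format, which this does not attempt to reproduce. The scan reads only the predicate column first, then fetches the projected columns for matching rows only.

```python
# Toy column store: each column lives in its own list (hypothetical data).
columns = {
    "tweet_id": [1, 2, 3, 4],
    "user": ["ann", "bob", "ann", "cat"],
    "text": ["hello", "kudu", "columnar", "scan"],
}


def scan(columns, predicate_col, predicate, projected_cols):
    """Evaluate the predicate on one column, then materialize matching rows."""
    # Step 1: touch only the predicate column and collect matching offsets.
    matches = [i for i, v in enumerate(columns[predicate_col]) if predicate(v)]
    # Step 2: lazily read the projected columns at those offsets only.
    return [tuple(columns[c][i] for c in projected_cols) for i in matches]


rows = scan(columns, "user", lambda u: u == "ann", ["tweet_id", "text"])
print(rows)  # [(1, 'hello'), (3, 'columnar')]
```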

Table partitioning
• Hash bucketing
• Distribute records by hash of partition column(s)
• N buckets lead to N tablets
• Range partitioning
• Distribute records by ranges of the partition column(s)
• N split keys lead to N+1 tablets
• Can be a mix for different columns of the primary key
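
The two schemes can be sketched as functions from a key to a tablet index. This is an illustration only: CRC32 is a stand-in hash (not necessarily what Kudu uses), and the function names and split keys are hypothetical. In this sketch, two split keys produce three ranges.

```python
import bisect
import zlib


def hash_bucket(key: str, num_buckets: int) -> int:
    """Map a key to one of num_buckets tablets (CRC32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def range_tablet(key, split_keys) -> int:
    """Map a key to a tablet index, given split keys in ascending order.

    Keys below the first split key land in tablet 0; keys at or above the
    last split key land in the final tablet.
    """
    return bisect.bisect_right(split_keys, key)


splits = [100, 200]  # hypothetical split keys
tablets = [range_tablet(k, splits) for k in (5, 100, 150, 999)]
print(tablets)  # [0, 1, 1, 2]
print(0 <= hash_bucket("tweet-42", 100) < 100)  # True
```

Hashing spreads sequential keys across tablets, while range partitioning keeps neighbouring keys together, which helps range scans.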

Kudu tables are partitioned into “tablets” (regions)
Partitioning based on
primary key (PK)
• Native support for
range partitioning
and/or hash
partitioning (salting)
• Hash example:
PRIMARY KEY (tweet_id) DISTRIBUTE BY HASH(tweet_id) INTO 100 BUCKETS

Each tablet is fault tolerant via Raft consensus
• A single Tablet Server can
host many tablets
• Metadata is stored on just
another tablet, but only
Master Server processes
host that tablet
• Raft consensus:
• Strong consistency
• Leader election on failure
• Replication factor 3 or 5 is
typical
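
Why 3 or 5 replicas: a majority-based protocol such as Raft needs more than half of the replicas to agree, so odd replication factors give the best fault tolerance per copy. The arithmetic, as a small sketch (function names are illustrative):

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that can fail while a majority is still reachable."""
    return replicas - majority(replicas)


for rf in (3, 4, 5):
    print(rf, majority(rf), tolerated_failures(rf))
```

A replication factor of 4 tolerates no more failures than 3, which is why even values are rarely used.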

Consistency model
• Consistency and replication enforced by Raft consensus (similar to Paxos)
• Replication by operation not data
• Single-row transactions now
• Multi-row transactions later
• Geo-distributed replicas will be possible under strict time synchronization
• Techniques drawn from Google Spanner and others
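
“Replication by operation not data” can be pictured as shipping each write operation to the replicas, which apply it themselves, rather than copying the resulting data files. The sketch below is a heavy simplification and not Raft itself: there are no terms, elections, or log repair, and the class and function names are hypothetical. It only shows an operation being applied once a majority has acknowledged it.

```python
class Replica:
    def __init__(self):
        self.log = []   # ordered operations received from the leader
        self.rows = {}  # state derived by applying committed operations

    def append(self, op):
        self.log.append(op)
        return True  # acknowledge receipt

    def apply(self, op):
        kind, key, value = op
        if kind == "upsert":
            self.rows[key] = value
        elif kind == "delete":
            self.rows.pop(key, None)


def replicate(op, replicas, reachable):
    """Ship op to reachable replicas; apply it only on a majority of acks."""
    acked = [r for r, up in zip(replicas, reachable) if up and r.append(op)]
    committed = len(acked) >= len(replicas) // 2 + 1
    if committed:
        for r in acked:
            r.apply(op)
    return committed


replicas = [Replica(), Replica(), Replica()]
first = replicate(("upsert", 1, "a"), replicas, [True, True, False])
second = replicate(("delete", 1, None), replicas, [True, False, False])
print(first, second, replicas[0].rows)  # True False {1: 'a'}
```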

Kudu interfaces
• NoSQL-style APIs
• Insert(), Update(), Delete(), Scan()
• Java, C++, limited Python
• Integrations with MapReduce, Spark, and Impala
• No direct access to underlying Kudu tablet files
• Beta does not have authentication, authorization, encryption
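
To make the shape of the NoSQL-style operations concrete, here is a toy in-memory table keyed by primary key. It is not the Kudu client API: `ToyTable` and its methods are hypothetical, and the error behaviour (insert rejects an existing key, update and delete reject a missing key) is an assumption chosen for the illustration.

```python
class ToyTable:
    """In-memory stand-in for a table with a primary key (illustration only)."""

    def __init__(self):
        self._rows = {}

    def insert(self, key, row):
        if key in self._rows:
            raise KeyError(f"key already present: {key!r}")
        self._rows[key] = dict(row)

    def update(self, key, changes):
        if key not in self._rows:
            raise KeyError(f"key not found: {key!r}")
        self._rows[key].update(changes)

    def delete(self, key):
        if key not in self._rows:
            raise KeyError(f"key not found: {key!r}")
        del self._rows[key]

    def scan(self, predicate=lambda row: True):
        # Return rows in primary-key order, filtered by the predicate.
        for key in sorted(self._rows):
            if predicate(self._rows[key]):
                yield key, dict(self._rows[key])


table = ToyTable()
table.insert(2, {"user": "bob"})
table.insert(1, {"user": "ann"})
table.update(2, {"user": "bobby"})
table.delete(1)
result = list(table.scan())
print(result)  # [(2, {'user': 'bobby'})]
```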

Impala integration
• Opens up Kudu to JDBC/ODBC clients
• Intuitive way to get data into Kudu
• INSERT INTO kudu_table SELECT * FROM src_table;
• Additional commands
• UPDATE
• DELETE
• Efficient INSERT VALUES
• Runs on the Kudu C++ client

Performance characteristics
Very CPU efficient
• Written in modern C++, uses specialized CPU instructions (SIMD), JIT
compilation with LLVM
Latency dependent on storage hardware capabilities
• Expect sub-millisecond response on SSDs and upcoming technologies
No garbage collection allows very large memory footprint with no pauses
Bloom filters reduce the need for many disk accesses
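
A minimal Bloom filter sketch shows how a storage engine can skip a disk read: if the filter says a key is absent, it definitely is; if it says “maybe”, the engine has to look. The sizes and the SHA-256-based hashing here are arbitrary choices for the illustration, not Kudu's implementation.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means definitely absent; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


bf = BloomFilter()
bf.add("row-1")
print(bf.might_contain("row-1"))  # True
```

False positives are possible, so a “maybe” still costs a lookup; false negatives are not, so a “no” safely avoids one.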

Operating Kudu
• Easiest through Cloudera Manager integration
• Separate parcel for now
• Kudu is always compacting
• No minor vs. major compaction
• No compaction latency spikes
• Web UI is full of metrics and logs

Cluster layout
• One or multiple masters for failover
• Only one in current beta
• Low CPU and memory impact
• One tablet server per worker node
• Can share disks with HDFS
• One SSD per worker node just for Kudu WAL can speed up writes
• No dependencies on other Hadoop ecosystem components
• But interfacing components like Impala or Spark do

Kudu: Cluster metadata management
• Replicated master
• Acts as a tablet directory
• Acts as a catalog (which tables exist, etc)
• Acts as a load balancer (tracks TS liveness, re-replicates under-replicated
tablets)
• Caches all metadata in RAM for high performance
• Under heavy load, 99.99th-percentile response times are still sub-millisecond (650us)
• Client configured with master addresses
• Client asks master for tablet locations as needed and caches them locally
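
The client-side caching described above can be sketched as follows. `FakeMaster`, `Client`, and the tablet identifiers are hypothetical stand-ins, not Kudu classes; the point is only that the master is consulted on a cache miss and repeated lookups are served locally.

```python
class FakeMaster:
    """Stand-in for the master's tablet directory (hypothetical data)."""

    def __init__(self, locations):
        self.locations = locations
        self.lookups = 0

    def locate(self, table, tablet_id):
        self.lookups += 1
        return self.locations[(table, tablet_id)]


class Client:
    def __init__(self, master):
        self.master = master
        self._cache = {}

    def tablet_server_for(self, table, tablet_id):
        key = (table, tablet_id)
        if key not in self._cache:  # ask the master only on a miss
            self._cache[key] = self.master.locate(table, tablet_id)
        return self._cache[key]

    def invalidate(self, table, tablet_id):
        # Drop a stale entry, e.g. after a request to that server fails.
        self._cache.pop((table, tablet_id), None)


master = FakeMaster({("tweets", 0): "ts-1:7050", ("tweets", 1): "ts-2:7050"})
client = Client(master)
client.tablet_server_for("tweets", 0)
client.tablet_server_for("tweets", 0)
print(master.lookups)  # 1
```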

Real-time analytics in Hadoop today
Merging in new data = storage complexity
Downsides:
● Multiple storage layers
● Latest data is hidden
● Files are messy
● Complex to do updates
without breaking running
queries
Diagram: Incoming Data (Messaging System) -> HBase -> "Have we accumulated enough data?" -> Reorganize HBase file into Parquet -> Parquet File partitions (New Partition, Most Recent Partition, Historic Data) on HDFS + Impala <- Reporting Request
Reorganize HBase file into Parquet:
• Wait for running operations to complete
• Define new Impala partition referencing
the newly written Parquet file
Real-time analytics in Hadoop with Kudu
Improvements:
● One system to operate
● No schedules or
background processes
● Handle late arrivals or data
corrections with ease
● New data available
immediately for analytics
or operations
Diagram: Incoming Data (Messaging System) -> Kudu + Impala (Historical and Real-time Data) <- Reporting Request

Kudu for data warehousing
• Near real time data visibility
• BI tools can display events that happened seconds earlier
• Excellent for star schemas
• Fast scans of deep fact tables
• Efficient wide fact tables
• Simplified updates of slowly changing dimensions

Near real time data warehousing on Kudu
Diagram: Files, RDBMS, Streams -> KAFKA -> FLUME / SPARK STREAMING (labeled Simple / Complex) -> KUDU -> IMPALA -> HUE, BI tools -> User

Resources
Join the community
http://getkudu.io
kudu-user@googlegroups.com
Download the beta
cloudera.com/downloads
Read the whitepaper
getkudu.io/kudu.pdf

Demo
• Load a standard HDFS Parquet table
• Create a Kudu table via Impala
• INSERT/UPDATE/DELETE statement with Impala
• Spark insert into Kudu table sample job
• Reading from Parquet files as input
• Map task to insert each row into Kudu

Creating a Kudu table - Impala
Kudu Storage handler
Table name in Impala
does NOT match table
name in Kudu. Kudu is
its own storage layer.
Kudu Master hostname and port
A primary key is mandatory

Spark (Scala) code – Java API
DataFrame Row
Kudu Master hostname and port
Kudu table name
Create a client, session and table object
Extract values from the row, strong types
Create an insert object and row
Set the values by type, column name and column value
Perform the actual insert
Cleanup

Kudu with Spark DataFrames
Tie in Kudu with Spark
Map of the Kudu table name, and Kudu master
Table name to refer to within Spark
SparkSQL statement

Kudu code examples and docs
https://github.com/cloudera/kudu-examples
http://www.cloudera.com/documentation/betas/kudu/0-7-0/topics/kudu_development.html
http://getkudu.io/docs/developing.html

Thank you
mladen@cloudera.com
