Extending Spark for Qbeast's SQL Data Source with Paola Pardo and Cesare Cugnasco

BarcelonaSpark Meetup
24th of October 2019

From the research to the industry

At first it was Extraction, Transformation, Loading (ETL)
Hybrid Transactional Analytical Processing

Then the Lambda architecture tried to reduce latency
Hybrid Transactional Analytical Processing

A plot of the relative bandwidth of system components in the Titan supercomputer at the Oak Ridge Leadership Class Facility. Source: Bauer, Andrew C., et al. "In situ methods, infrastructures, and applications on high performance computing platforms."

Big Data HTAP: general design
● Fast consistent layer
○ Consistent and transactional (to varying degrees)
○ Storage: memory, local storage
● Cheap/throughput layer
○ Weak consistency, high-latency, immutable files
○ Storage: non-POSIX distributed file system, object stores
● Query execution
○ On-demand resources, decoupled storage/CPU
○ Temporary storage: local disk, object stores
● Data flow: data ingestion, periodical flushes

Examples
Google's Procella, Snowflake

Big Data HTAP: min-max pruning, zone maps, bloom filters...
[Diagram: primary key partitions A and B, each holding data blocks with their own metadata (min/max, Bloom filter, range), plus a metadata server. Block labels: June, May, March, June.]
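A minimal sketch of min-max pruning, assuming each file carries the minimum and maximum of one column in its metadata. The names `FileMeta` and `prune` are hypothetical and only illustrate the idea; they are not part of Qbeast or Spark.

```scala
object ZoneMapSketch {
  // Hypothetical per-file metadata: the min and max value of a single column.
  final case class FileMeta(name: String, min: Double, max: Double)

  // Keep only the files whose [min, max] interval can overlap the
  // half-open query range [lo, hi); every other file is skipped unread.
  def prune(files: Seq[FileMeta], lo: Double, hi: Double): Seq[FileMeta] =
    files.filter(f => f.max >= lo && f.min < hi)
}
```

A query such as `x >= 12 AND x < 22` would then only open the files whose metadata overlaps that range.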

Image credit: Nemo Jantzen, Lucky Me, 2015. Photography, acrylic, and glass spheres on wooden canvas
960 KB 7 KB

Priorities: High-priority, Medium-priority, Low-priority
Storage tiers: RAM, Persistent memory, Local disk, Object storage, Cold storage

QDB: file layout
Original data
OutlookTree
Metadata and buffer in fast storage
Data in columnar format in slower storage

Hybrid columnar row
Row data
Disk
S3
Optane
Columnar-to-row mapping based on the fact that the random priority = DHT token

Interactive Big Data Visualization

Outline
● Overview
○ Catalyst Optimizer
○ APIs
○ Spark-Cassandra
● Extensions
○ Sampling Pushdown
○ Multidimensional Filter Pushdown
● Future work

Overview
● Catalyst Optimizer
● DataSource APIs
○ Key Concepts
○ Examples
● Spark-Cassandra-Connector
○ CassandraSourceRelation

User Query
SELECT sum(v)
FROM (
  SELECT t1.id, t1.value+1+2 AS v
  FROM t1 JOIN t2
  WHERE
  (t1.id == t2.id AND t2.id > 50)
)
● Expressions
○ New value computed on input values
● Attributes
○ Column of a data collection
○ Dataset, data operation

Unresolved Plan
AGG
PROJECT
FILTER
JOIN
UnresolvedRelation t1 UnresolvedRelation t2

Analysis
JOIN
UnresolvedRelation t1 UnresolvedRelation t2
JOIN
MyCustomRelation t1 MyCustomRelation t2
Metadata

Logical Plan
● Tree
○ Abstraction of the user's program
○ Node objects
● Rules
○ Transform the tree
○ Logical Optimization
○ Heuristics
SELECT t1.value+1+2 AS v
ADD
  ADD
    Literal(1)
    Literal(2)
  t1.value

Optimized Logical Plan
Before:
ADD
  ADD
    Literal(1)
    Literal(2)
  t1.value
After:
ADD
  Literal(3)
  t1.value
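The rewrite from the original to the optimized tree can be sketched as a bottom-up rule over a tiny expression tree. The types below are stand-ins written for this example and are not Catalyst's own classes; they only assume integer literals and an `Add` node.

```scala
object ConstantFoldingSketch {
  // Stand-in expression tree, much smaller than Catalyst's.
  sealed trait Expr
  final case class Literal(value: Int) extends Expr
  final case class Attribute(name: String) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr

  // Bottom-up rule: fold the children first, then collapse an Add
  // whose two children are both literals into a single literal.
  def fold(e: Expr): Expr = e match {
    case Add(l, r) =>
      (fold(l), fold(r)) match {
        case (Literal(a), Literal(b)) => Literal(a + b)
        case (fl, fr)                 => Add(fl, fr)
      }
    case other => other
  }
}
```

Applied to `Add(Add(Literal(1), Literal(2)), Attribute("t1.value"))`, the rule yields `Add(Literal(3), Attribute("t1.value"))`.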

Physical Planning
● Strategies
○ Set of transformations
○ E.g., selecting the best join execution
● Rule executor
○ Ensure requirements
○ Apply physical optimization

DataSource API
● Key part to integrate data sources
○ How to read/write from/to storage
○ Statistics
○ Physical Planning
● Hadoop, Hive
● Presto and Cassandra connectors

DataSource API
trait RelationProvider {
  def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String]): BaseRelation
}

abstract class BaseRelation {
  def sqlContext: SQLContext
  def schema: StructType
  def unhandledFilters(filters: Array[Filter]): Array[Filter]
  def sizeInBytes: Long
  def needConversion: Boolean
}

trait TableScan {
  def buildScan(): RDD[Row]
}
org.apache.spark.sql.sources.interfaces

DataSource API
class DefaultSource extends RelationProvider with SchemaRelationProvider {

  // creates a relation with an undefined schema (null)
  override def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String]): BaseRelation = {
    createRelation(sqlContext, parameters, null)
  }

  // gets the schema of the table and produces a MyCustomRelation
  override def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String],
      schema: StructType): BaseRelation = {
    // implementation
    new MyCustomRelation(<>, schema)(sqlContext)
  }
}

class MyCustomRelation(location: String, userSchema: StructType)
    (@transient val sqlContext: SQLContext)
  extends BaseRelation
  with Serializable {

  override def schema: StructType = {
    // implementation which returns a StructType
    // (or a sequence of StructFields)
    ???
  }
}

DataSource API
● Limited extension
● Lack of partitioning information
● Lack of columnar and streaming support
trait LimitedScan {
  def buildScan(limit: Int): RDD[Row]
}
trait PrunedLimitedScan {
  def buildScan(requiredColumns: Array[String],
      limit: Int): RDD[Row]
}
trait PrunedFilteredLimitedScan {
  def buildScan(requiredColumns: Array[String],
      filters: Array[Filter], limit: Int): RDD[Row]
}

DataSourceV2 API
● Written in Java, since Spark 2.3
● ReadSupport or WriteSupport
● Own partitioner
● Mix in some Supports* interfaces
DataSourceV2
  with ReadSupport
  with ReadSupportWithSchema
DataSourceReader
  with SupportsPushDownFilters
  with SupportsPushDownRequiredColumns
  ....
InputPartition
InputPartitionReader

DataSourceV2 API
public interface ReadSupport extends DataSourceV2 {
  DataSourceReader createReader(DataSourceOptions options);
}

public interface DataSourceReader {
  StructType readSchema();
  List<InputPartition<InternalRow>> planInputPartitions();
}

public interface SupportsPushDownRequiredColumns extends DataSourceReader {
  void pruneColumns(StructType requiredSchema);
}

public interface InputPartition<T> {
  InputPartitionReader<T> createPartitionReader();
}

public interface InputPartitionReader<T> extends Closeable {
  boolean next();
  T get();
}
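How a reader of this shape is consumed can be sketched with stand-in traits that mirror `InputPartition` and `InputPartitionReader`. The Scala traits, `SeqPartition` and `readAll` below are written for this example only and are not the Spark types.

```scala
object PartitionReaderSketch {
  // Stand-ins that mirror the shape of the V2 reader interfaces.
  trait InputPartitionReader[T] extends java.io.Closeable {
    def next(): Boolean
    def get(): T
  }
  trait InputPartition[T] {
    def createPartitionReader(): InputPartitionReader[T]
  }

  // Hypothetical partition backed by an in-memory sequence.
  final class SeqPartition[T](rows: Seq[T]) extends InputPartition[T] {
    def createPartitionReader(): InputPartitionReader[T] =
      new InputPartitionReader[T] {
        private val it = rows.iterator
        private var current: Option[T] = None
        def next(): Boolean = {
          current = if (it.hasNext) Some(it.next()) else None
          current.isDefined
        }
        def get(): T = current.get
        def close(): Unit = ()
      }
  }

  // Drain every partition with the next()/get() protocol and always close the reader.
  def readAll[T](partitions: Seq[InputPartition[T]]): Seq[T] =
    partitions.flatMap { p =>
      val reader = p.createPartitionReader()
      try {
        val buf = Seq.newBuilder[T]
        while (reader.next()) buf += reader.get()
        buf.result()
      } finally reader.close()
    }
}
```

In Spark each partition would be read by a separate task; the sketch reads them sequentially to keep the protocol visible.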

Spark-Cassandra-Connector
● DataStax open-source
● RDDs, DataFrames and CQL

CassandraSourceRelation
PrunedFilteredScan InsertableRelation
BaseRelation:
● schema
● sizeInBytes
● unhandledFilters
private[cassandra] class CassandraSourceRelation(
    tableRef: TableRef,
    userSpecifiedSchema: Option[StructType],
    filterPushdown: Boolean,
    confirmTruncate: Boolean,
    tableSizeInBytes: Option[Long],
    connector: CassandraConnector,
    readConf: ReadConf,
    writeConf: WriteConf,
    sparkConf: SparkConf,
    override val sqlContext: SQLContext)
  extends BaseRelation
  with InsertableRelation
  with PrunedFilteredScan
  with Logging
org.apache.spark.sql.cassandra.CassandraConn...

PrunedFilteredScan
● Column Pruning
○ Discard columns
● Filter Pushdown
○ Discard rows
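The two ideas can be sketched over in-memory rows. The `Filter` types below are simplified stand-ins for Spark's source filters, and `buildScan` takes the data as an extra argument so the example is self-contained; both are assumptions made for illustration.

```scala
object PrunedFilteredScanSketch {
  // A row is modelled as column name -> value for brevity.
  type Row = Map[String, Double]

  // Simplified stand-ins for Spark's source filters.
  sealed trait Filter
  final case class GreaterThan(attribute: String, value: Double) extends Filter
  final case class LessThan(attribute: String, value: Double) extends Filter

  private def matches(row: Row, f: Filter): Boolean = f match {
    case GreaterThan(a, v) => row.get(a).exists(_ > v)
    case LessThan(a, v)    => row.get(a).exists(_ < v)
  }

  def buildScan(
      data: Seq[Row],
      requiredColumns: Array[String],
      filters: Array[Filter]): Seq[List[Double]] =
    data
      .filter(row => filters.forall(f => matches(row, f))) // filter pushdown: discard rows
      .map(row => requiredColumns.toList.map(c => row(c)))  // column pruning: discard columns
}
```

With both applied at the source, only the matching rows and the requested columns travel back to the engine.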

Limitations
● DataSource API
● Pushdown restrictions
○ Filtering on only one column
○ No custom index support

Extensions
● Scenario
● Sampling Pushdown
○ Sample Operator
○ Changes
● Multidimensional Filter Pushdown
○ Filter Pushdown
○ Changes

Scenario
A Qbeast-indexed table and query examples:
CREATE TABLE keyspace.table (
  id double PRIMARY KEY,
  x double,
  y double,
  z double
);
CREATE CUSTOM INDEX IF NOT EXISTS table_idx
ON keyspace.table (x, y, z)
SELECT * FROM keyspace.table
FILTER PUSHDOWN
WHERE x >= 0.1826763 AND x < 0.5555
AND y >= 1.9 AND y < 2.863653
AND z >= 0.1 AND z < 10.78645
SAMPLING PUSHDOWN
WHERE expr(table_idx, 'precision=0.1')

Sample Operator on Spark
● Sample
○ lower/upper bound
○ with/without replacement
○ seed
TABLESAMPLE(5 ROWS)
TABLESAMPLE(10 PERCENT)
df.sample(...)
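What the lower/upper bound and the seed mean for sampling without replacement can be sketched in plain Scala: each row gets a uniform draw and is kept when the draw falls in `[lowerBound, upperBound)`. This is a simplified illustration of Bernoulli-style sampling and not Spark's implementation; `SampleSketch.sample` is a hypothetical name.

```scala
import scala.util.Random

object SampleSketch {
  // Keep a row when its uniform draw lands inside [lowerBound, upperBound).
  // The same seed always selects the same rows.
  def sample[T](rows: Seq[T], lowerBound: Double, upperBound: Double, seed: Long): Seq[T] = {
    val rng = new Random(seed)
    rows.filter { _ =>
      val x = rng.nextDouble()
      x >= lowerBound && x < upperBound
    }
  }
}
```

`TABLESAMPLE(10 PERCENT)` corresponds to bounds `0.0` and `0.1` in this picture; every row still has to be visited, which is the cost that sampling pushdown tries to avoid.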

Sampling Pushdown
Catalyst Optimizer
DataSource API
● Filter Pushdown
● Column Pruning
● Sampling with Qbeast?
● Filter Pushdown
● Column Pruning
● Sampling Pushdown?

Sampling Pushdown
● New interfaces for the scan
● New method to detect the sampling operator and the data source
DataSourceAPI
● SampledScan
● SampledFilteredScan
● PrunedSampledScan
● PrunedSampledFilteredScan
Sampling Pushdown
@InterfaceStability.Stable
trait SampledScan {
  def buildScan(sample: Sample): RDD[Row]
}
trait SampledFilteredScan {
  def buildScan(filters: Array[Filter], sample: Sample): RDD[Row]
}
trait PrunedSampledScan {
  def buildScan(requiredColumns: Array[String],
      sample: Sample): RDD[Row]
}
trait PrunedSampledFilteredScan {
  def pushSampling(sample: Sample): Boolean
  def buildScan(requiredColumns: Array[String],
      filters: Array[Filter], sample: Sample): RDD[Row]
}

Sampling Pushdown
case s @ Sample(_, _, _, _, physical_op @ PhysicalOperation(p, f, l: LogicalRelation)) =>
  l.relation match {
    case scan: PrunedSampledFilteredScan if scan.pushSampling(s) =>
      pruneFilterProject(
        l,
        p,
        f,
        (a, f) => toCatalystRDD(l, a, scan.buildScan(a.map(_.name).toArray, f, s))) :: Nil
    case _ => Nil
  }
org.apache.spark.sql.execution.datasources.DataSourceStrategy

Sampling Pushdown
Processing the pushdown:
1. User-level option to push down sampling
2. Detection of Sample
3. Analysis
4. Write CQL expression to query the index
5. Let Qbeast handle it again!

Sampling Pushdown
private[cassandra] class CassandraSourceRelation(
    // other stuff
    sampling: Boolean,
    override val sqlContext: SQLContext)
  extends BaseRelation
  with InsertableRelation
  with PrunedFilteredScan
  with PrunedSampledFilteredScan
  with Logging {

  override def pushSampling(sample: Sample): Boolean = {
    // check if the table is indexed and the user wants to push down the operator
    ???
  }

  override def buildScan(
      requiredColumns: Array[String],
      filters: Array[Filter],
      sample: Sample): RDD[Row] = {
    // construct the index CQL code and push it through the scan
    ???
  }
}
org.apache.spark.sql.cassandra.CassandraConn...

Sampling Pushdown
TABLESAMPLE (5 PERCENT)
Simple Lookup
Sample(0.0, 0.05, false, 983653)
Full Table Scan
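A rough sketch of turning a detected sample into an index expression. It assumes, purely for illustration, that the `precision` value equals the sampled fraction and that only samples without replacement starting at 0 are pushed down. The `Sample` case class is a stand-in for Catalyst's node, and `pushSampling` and `samplingClause` are hypothetical helpers.

```scala
object SamplingPushdownSketch {
  // Stand-in for Catalyst's Sample node, without the child plan.
  final case class Sample(
      lowerBound: Double,
      upperBound: Double,
      withReplacement: Boolean,
      seed: Long)

  // Assumed policy: push down only when the user enabled it, the table
  // has an index, and the sample is a plain fraction without replacement.
  def pushSampling(sample: Sample, indexName: Option[String], userEnabled: Boolean): Boolean =
    userEnabled && indexName.isDefined && !sample.withReplacement && sample.lowerBound == 0.0

  // Assumed mapping: precision = sampled fraction.
  def samplingClause(sample: Sample, indexName: String): String = {
    val precision = sample.upperBound - sample.lowerBound
    s"expr($indexName, 'precision=$precision')"
  }
}
```

Under these assumptions, `Sample(0.0, 0.05, false, 983653)` on `table_idx` becomes `expr(table_idx, 'precision=0.05')`, so the index answers the query instead of a full table scan.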

Multidimensional Pruning
Catalyst Optimizer
DataSource API
● Filter Pushdown
● Column Pruning
● Sampling with Qbeast
● Multidimensional pushdown?
● Filter Pushdown
● Column Pruning
● Sampling Pushdown

Processing the pushdown:
1. Detect the index
2. Analyze the predicate
3. Push down the filters to Cassandra
4. Let Qbeast handle it!

private val qbeast = table.qbeastColumns.map(_.columnName)

/** Returns the set of predicates that contain double ranges
  * for the Qbeast index. */
val qbeastPredicatesToPushdown: Set[Predicate] = {
  val doubleRange = rangePredicatesByName.filter(p =>
    p._2.exists(Predicates.isLessThanPredicate) &&
    p._2.exists(Predicates.isGreaterThanOrEqualPredicate))
  if (qbeast.toSet subsetOf doubleRange.keySet) {
    val eqQbeast = qbeast.flatMap(rangePredicatesByName)
    eqQbeast.toSet
  } else {
    Set.empty
  }
}

val predicatesToPushDown: Set[Predicate] =
  partitionKeyPredicatesToPushDown ++
  clusteringColumnPredicatesToPushDown ++
  indexedColumnPredicatesToPushDown ++
  qbeastPredicatesToPushdown
org.apache.spark.sql.cassandra.BasicCassandraPredicatePushDown
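The same check can be tried outside the connector with stand-in predicate types. The sketch below assumes that a column qualifies when it has both a `>=` and a `<` predicate, and that nothing is pushed down unless every indexed column qualifies; the case classes are simplified stand-ins, not the connector's `Predicate` types.

```scala
object QbeastPredicateSketch {
  // Simplified stand-ins for pushed-down predicates.
  sealed trait Predicate { def column: String }
  final case class GreaterThanOrEqual(column: String, value: Double) extends Predicate
  final case class LessThan(column: String, value: Double) extends Predicate
  final case class EqualTo(column: String, value: Double) extends Predicate

  def qbeastPredicatesToPushdown(
      qbeastColumns: Seq[String],
      predicates: Set[Predicate]): Set[Predicate] = {
    // Range predicates grouped by the column they constrain.
    val rangePredicatesByName: Map[String, Set[Predicate]] =
      predicates.filter {
        case _: GreaterThanOrEqual | _: LessThan => true
        case _                                   => false
      }.groupBy(_.column)

    // Columns bounded on both sides: lower <= column < upper.
    val doubleRange = rangePredicatesByName.filter { case (_, ps) =>
      ps.exists(_.isInstanceOf[LessThan]) &&
      ps.exists(_.isInstanceOf[GreaterThanOrEqual])
    }

    // All or nothing: every indexed column must carry a double range.
    if (qbeastColumns.toSet.subsetOf(doubleRange.keySet))
      qbeastColumns.flatMap(c => rangePredicatesByName(c)).toSet
    else
      Set.empty
  }
}
```

For an index on `(x, y, z)`, a query that bounds all three columns pushes six predicates down, while a query that leaves `z` without an upper bound pushes none.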

Multidimensional Pushdown
WHERE x >= 0.1826763 AND x < 0.5555
AND y >= 1.9 AND y < 2.863653
AND z >= 0.1 AND z < 10.78645
FILTER(isNotNull)
PrunedFilteredScan
FILTER(x, y, z, isNotNull)
Full Table Scan

Future Work
● Dimensional Aware
● Join Strategy
● Storage

Dimensional Aware
● Useful for Data Locality Strategies
● Physical Planning

Join Strategy in Spark
● Shuffle-Hash-Join
● Broadcast-Join
● Sort-Merge-Join

Join on Qbeast
● Dimensional Aware Data Partition
● Speculative optimization on Sampling

Integration with Arrow
● Save Qbeast data in Arrow
● Static column with file information
● Make Analytics Faster
● Spark support since 2.3

Future Work
● Dimensional Aware
● Join Strategy
● Storage
● DataSource V2

DataSourceV2
● New Java class
● New method to detect the sampling operator and the data source
DataSourceAPIv2: SupportsPushDownSampling

DataSourceV2
package org.apache.spark.sql.sources.v2.reader;

@InterfaceStability.Evolving
public interface SupportsPushDownSampling extends DataSourceReader {
  boolean pushSampling(Sample sample);
}

case s @ Sample(_, _, _, _, l @ PhysicalOperation(p, f, e: DataSourceV2Relation)) =>
  // implementation of pruning and filter pushdown
  ProjectExec(p, withFilter) :: Nil
case _ => Nil
