SlideShare a Scribd company logo
1 of 38
Download to read offline
Introduction to 
SparkSQL & Catalyst 
Takuya UESHIN 
! 
ScalaMatsuri2014 2014/09/06(Sat)
Who am I? 
Takuya UESHIN 
@ueshin 
github.com/ueshin 
Nautilus Technologies, Inc. 
A Spark contributor 
2
Agenda 
What is SparkSQL? 
Catalyst in depth 
SparkSQL core in depth 
Interesting issues 
How to contribute 
3
What is SparkSQL?
What is SparkSQL? 
SparkSQL is one of 
Spark components. 
Executes SQL on Spark 
Builds SchemaRDD like LINQ 
Optimizes execution plan. 
5
What is SparkSQL? 
Catalyst provides a execution planning 
framework for relational operations. 
Including: 
SQL parser & analyzer 
Logical operators & general expressions 
Logical optimizer 
A framework to transform operator tree. 
6
What is SparkSQL? 
To execute query needs some steps. 
Parse 
Analyze 
Logical Plan 
Optimize 
Physical Plan 
Execute 
7
What is SparkSQL? 
To execute query needs some steps. 
Parse 
Analyze 
Logical Plan 
Optimize 
Physical Plan 
Execute 
8 
Catalyst
What is SparkSQL? 
To execute query needs some steps. 
Parse 
Analyze 
Logical Plan 
Optimize 
Physical Plan 
Execute 
9 
SQL core
Catalyst in depth
Catalyst in depth 
Provides a execution planning framework for 
relational operations. 
Row & DataType’s 
Trees & Rules 
Logical Operators 
Expressions 
Optimizations 
11
Row & DataType’s 
o.a.s.sql.catalyst.types.DataType 
Long, Int, Short, Byte, Float, Double, Decimal 
String, Binary, Boolean, Timestamp 
Array, Map, Struct 
o.a.s.sql.catalyst.expressions.Row 
Represents a single row. 
Can contain complex types. 
12
Trees & Rules 
o.a.s.sql.catalyst.trees.TreeNode 
Provides transformations of tree. 
foreach, map, flatMap, collect 
transform, transformUp, 
transformDown 
Used for operator tree, expression tree. 
13
Trees & Rules 
o.a.s.sql.catalyst.rules.Rule 
Represents a tree transform rule. 
o.a.s.sql.catalyst.rules.RuleExecutor 
A framework to transform trees 
based on rules. 
14
Logical Operators 
Basic Operators 
Project, Filter, … 
Binary Operators 
Join, Except, Intersect, Union, … 
Aggregate 
Generate, Distinct 
Sort, Limit 
InsertInto, WriteToFile 
15 
Project 
Filter 
Join 
Table Table
Expressions 
Literal 
Arithmetics 
UnaryMinus, Sqrt, MaxOf 
Add, Subtract, Multiply, … 
Predicates 
EqualTo, LessThan, LessThanOrEqual, GreaterThan, 
GreaterThanOrEqual 
Not, And, Or, In, If, CaseWhen 
16 
+ 
1 2
Expressions 
Cast 
GetItem, GetField 
Coalesce, IsNull, IsNotNull 
StringOperations 
Like, Upper, Lower, Contains, 
StartsWith, EndsWith, Substring, … 
17
Optimizations 
ConstantFolding 
NullPropagation 
ConstantFolding 
BooleanSimplification 
SimplifyFilters 
FilterPushdown 
CombineFilters 
PushPredicateThroughProject 
PushPredicateThroughJoin 
ColumnPruning 
18
Optimizations 
NullPropagation, ConstantFolding 
Replace expressions that can be evaluated 
with some literal value to the value. 
ex) 
1 + null => null 
1 + 2 => 3 
Count(null) => 0 
19
Optimizations 
BooleanSimplification 
Simplifies boolean expressions that can be determined. 
ex) 
false AND $right => false 
true AND $right => $right 
true OR $right => true 
false OR $right => $right 
If(true, $then, $else) => $then 
20
Optimizations 
SimplifyFilters 
Removes filters that can be evaluated 
trivially. 
ex) 
Filter(true, child) => child 
Filter(false, child) => empty 
21
Optimizations 
CombineFilters 
Merges two filters. 
ex) 
Filter($fc, Filter($nc, child)) => 
Filter(AND($fc, $nc), child) 
22
Optimizations 
PushPredicateThroughProject 
Pushes Filter operators through 
Project operator. 
ex) 
Filter(‘i === 1, Project(‘i, ‘j, child)) 
=> 
Project(‘i, ‘j, Filter(‘i === 1, child)) 
23
Optimizations 
PushPredicateThroughJoin 
Pushes Filter operators through Join 
operator. 
ex) 
Project(“left.i”.attr === 1, Join(left, 
right) => 
Join(Project(‘i === 1, left), right) 
24
Optimizations 
ColumnPruning 
Eliminates the reading of unused 
columns. 
ex) 
Join(left, right, LeftSemi, 
“left.id”.attr === “right.id”.attr) => 
Join(Project(‘id, left), right, LeftSemi) 
25
SQL core in depth
SQL core in depth 
Provides: 
Physical operators to build RDD 
Conversion from Existing RDD of Product to 
SchemaRDD support 
Parquet file read/write support 
JSON file read support 
Columnar in-memory table support 
27
SQL core in depth 
o.a.s.sql.SchemaRDD 
Extends RDD[Row]. 
Has logical plan tree. 
Provides LINQ-like interfaces to construct 
logical plan. 
select, where, join, orderBy, … 
Executes the plan. 
28
SQL core in depth 
o.a.s.sql.execution.SparkStrategies 
Converts logical plan to physical. 
Some rules are based on statistics 
of the operators. 
29
SQL core in depth 
Parquet read/write support 
Parquet: Columnar storage format for Hadoop 
Reads existing Parquet files. 
Converts Parquet schema to row schema. 
Writes new Parquet files. 
Currently DecimalType and TimestampType 
are not supported. 
30
SQL core in depth 
JSON read support 
Loads a JSON file (one object per line) 
Infers row schema from the entire 
dataset. 
Giving the schema is experimental. 
Inferring the schema by sampling is 
also experimental. 
31
SQL core in depth 
Columnar in-memory table support 
Caches table like RDD.cache, but as 
columnar style. 
Can prune unnecessary columns 
when read data. 
32
Interesting issues
Interesting issues 
Support the GroupingSet/ROLLUP/CUBE 
https://issues.apache.org/jira/browse/SPARK-2663 
Use statistics to skip partitions when 
reading from in-memory columnar 
data 
https://issues.apache.org/jira/browse/SPARK-2961 
34
Interesting issues 
Pluggable interface for shuffles 
https://issues.apache.org/jira/browse/SPARK-2044 
Sort-merge join 
https://issues.apache.org/jira/browse/SPARK-2213 
Cost-based join reordering 
https://issues.apache.org/jira/browse/SPARK-2216 
35
How to contribute
How to contribute 
See: Contributing to Spark 
Open an issue on JIRA 
Send pull-request at GitHub 
Communicate with committers and 
reviewers 
Congratulations! 
37
Conclusion 
Introduced SparkSQL & Catalyst 
Now you know them very well! 
And you know how to contribute. 
! 
Let’s contribute to Spark & SparkSQL!! 
38

More Related Content

What's hot

The Why and How of Scala at Twitter
The Why and How of Scala at TwitterThe Why and How of Scala at Twitter
The Why and How of Scala at TwitterAlex Payne
 
Scala Matsuri 2016: Japanese Text Mining with Scala and Spark
Scala Matsuri 2016: Japanese Text Mining with Scala and SparkScala Matsuri 2016: Japanese Text Mining with Scala and Spark
Scala Matsuri 2016: Japanese Text Mining with Scala and SparkEduardo Gonzalez
 
A Brief, but Dense, Intro to Scala
A Brief, but Dense, Intro to ScalaA Brief, but Dense, Intro to Scala
A Brief, but Dense, Intro to ScalaDerek Chen-Becker
 
Scala Refactoring for Fun and Profit (Japanese subtitles)
Scala Refactoring for Fun and Profit (Japanese subtitles)Scala Refactoring for Fun and Profit (Japanese subtitles)
Scala Refactoring for Fun and Profit (Japanese subtitles)Tomer Gabel
 
A Brief Intro to Scala
A Brief Intro to ScalaA Brief Intro to Scala
A Brief Intro to ScalaTim Underwood
 
Martin Odersky: What's next for Scala
Martin Odersky: What's next for ScalaMartin Odersky: What's next for Scala
Martin Odersky: What's next for ScalaMarakana Inc.
 
Scala Days San Francisco
Scala Days San FranciscoScala Days San Francisco
Scala Days San FranciscoMartin Odersky
 
Build Cloud Applications with Akka and Heroku
Build Cloud Applications with Akka and HerokuBuild Cloud Applications with Akka and Heroku
Build Cloud Applications with Akka and HerokuSalesforce Developers
 
From Ruby to Scala
From Ruby to ScalaFrom Ruby to Scala
From Ruby to Scalatod esking
 
Scala : language of the future
Scala : language of the futureScala : language of the future
Scala : language of the futureAnsviaLab
 
Martin Odersky - Evolution of Scala
Martin Odersky - Evolution of ScalaMartin Odersky - Evolution of Scala
Martin Odersky - Evolution of ScalaScala Italy
 
Introduction to Scala
Introduction to ScalaIntroduction to Scala
Introduction to ScalaSaleem Ansari
 
What's a macro?: Learning by Examples / Scalaのマクロに実用例から触れてみよう!
What's a macro?: Learning by Examples / Scalaのマクロに実用例から触れてみよう!What's a macro?: Learning by Examples / Scalaのマクロに実用例から触れてみよう!
What's a macro?: Learning by Examples / Scalaのマクロに実用例から触れてみよう!scalaconfjp
 
Scala - The Simple Parts, SFScala presentation
Scala - The Simple Parts, SFScala presentationScala - The Simple Parts, SFScala presentation
Scala - The Simple Parts, SFScala presentationMartin Odersky
 

What's hot (20)

The Why and How of Scala at Twitter
The Why and How of Scala at TwitterThe Why and How of Scala at Twitter
The Why and How of Scala at Twitter
 
Scala Matsuri 2016: Japanese Text Mining with Scala and Spark
Scala Matsuri 2016: Japanese Text Mining with Scala and SparkScala Matsuri 2016: Japanese Text Mining with Scala and Spark
Scala Matsuri 2016: Japanese Text Mining with Scala and Spark
 
A Brief, but Dense, Intro to Scala
A Brief, but Dense, Intro to ScalaA Brief, but Dense, Intro to Scala
A Brief, but Dense, Intro to Scala
 
Scala Refactoring for Fun and Profit (Japanese subtitles)
Scala Refactoring for Fun and Profit (Japanese subtitles)Scala Refactoring for Fun and Profit (Japanese subtitles)
Scala Refactoring for Fun and Profit (Japanese subtitles)
 
A Brief Intro to Scala
A Brief Intro to ScalaA Brief Intro to Scala
A Brief Intro to Scala
 
Martin Odersky: What's next for Scala
Martin Odersky: What's next for ScalaMartin Odersky: What's next for Scala
Martin Odersky: What's next for Scala
 
Scala Days San Francisco
Scala Days San FranciscoScala Days San Francisco
Scala Days San Francisco
 
Build Cloud Applications with Akka and Heroku
Build Cloud Applications with Akka and HerokuBuild Cloud Applications with Akka and Heroku
Build Cloud Applications with Akka and Heroku
 
From Ruby to Scala
From Ruby to ScalaFrom Ruby to Scala
From Ruby to Scala
 
Scala : language of the future
Scala : language of the futureScala : language of the future
Scala : language of the future
 
Quick introduction to scala
Quick introduction to scalaQuick introduction to scala
Quick introduction to scala
 
camel-scala.pdf
camel-scala.pdfcamel-scala.pdf
camel-scala.pdf
 
Scala Days NYC 2016
Scala Days NYC 2016Scala Days NYC 2016
Scala Days NYC 2016
 
Martin Odersky - Evolution of Scala
Martin Odersky - Evolution of ScalaMartin Odersky - Evolution of Scala
Martin Odersky - Evolution of Scala
 
Introduction to Scala
Introduction to ScalaIntroduction to Scala
Introduction to Scala
 
Introduction to Scala
Introduction to ScalaIntroduction to Scala
Introduction to Scala
 
What's a macro?: Learning by Examples / Scalaのマクロに実用例から触れてみよう!
What's a macro?: Learning by Examples / Scalaのマクロに実用例から触れてみよう!What's a macro?: Learning by Examples / Scalaのマクロに実用例から触れてみよう!
What's a macro?: Learning by Examples / Scalaのマクロに実用例から触れてみよう!
 
Scalax
ScalaxScalax
Scalax
 
Scala profiling
Scala profilingScala profiling
Scala profiling
 
Scala - The Simple Parts, SFScala presentation
Scala - The Simple Parts, SFScala presentationScala - The Simple Parts, SFScala presentation
Scala - The Simple Parts, SFScala presentation
 

Viewers also liked

Xitrum Web Framework Live Coding Demos / Xitrum Web Framework ライブコーディング
Xitrum Web Framework Live Coding Demos / Xitrum Web Framework ライブコーディングXitrum Web Framework Live Coding Demos / Xitrum Web Framework ライブコーディング
Xitrum Web Framework Live Coding Demos / Xitrum Web Framework ライブコーディングscalaconfjp
 
Solid And Sustainable Development in Scala
Solid And Sustainable Development in ScalaSolid And Sustainable Development in Scala
Solid And Sustainable Development in ScalaKazuhiro Sera
 
Scalable Generator: Using Scala in SIer Business (ScalaMatsuri)
Scalable Generator: Using Scala in SIer Business (ScalaMatsuri)Scalable Generator: Using Scala in SIer Business (ScalaMatsuri)
Scalable Generator: Using Scala in SIer Business (ScalaMatsuri)TIS Inc.
 
[ScalaMatsuri] グリー初のscalaプロダクト!チャットサービス公開までの苦労と工夫
[ScalaMatsuri] グリー初のscalaプロダクト!チャットサービス公開までの苦労と工夫[ScalaMatsuri] グリー初のscalaプロダクト!チャットサービス公開までの苦労と工夫
[ScalaMatsuri] グリー初のscalaプロダクト!チャットサービス公開までの苦労と工夫gree_tech
 
GitBucket: The perfect Github clone by Scala
GitBucket: The perfect Github clone by ScalaGitBucket: The perfect Github clone by Scala
GitBucket: The perfect Github clone by Scalatakezoe
 
sbt, past and future / sbt, 傾向と対策
sbt, past and future / sbt, 傾向と対策sbt, past and future / sbt, 傾向と対策
sbt, past and future / sbt, 傾向と対策scalaconfjp
 
Node.js vs Play Framework (with Japanese subtitles)
Node.js vs Play Framework (with Japanese subtitles)Node.js vs Play Framework (with Japanese subtitles)
Node.js vs Play Framework (with Japanese subtitles)Yevgeniy Brikman
 
芸者東京とScala〜おみせやさんから脳トレクエストまでの軌跡〜
芸者東京とScala〜おみせやさんから脳トレクエストまでの軌跡〜芸者東京とScala〜おみせやさんから脳トレクエストまでの軌跡〜
芸者東京とScala〜おみせやさんから脳トレクエストまでの軌跡〜scalaconfjp
 
Scala が支える医療系ウェブサービス #jissenscala
Scala が支える医療系ウェブサービス #jissenscalaScala が支える医療系ウェブサービス #jissenscala
Scala が支える医療系ウェブサービス #jissenscalaKazuhiro Sera
 
Scala@SmartNews_20150221
Scala@SmartNews_20150221Scala@SmartNews_20150221
Scala@SmartNews_20150221Shigekazu Takei
 
Scala@SmartNews AdFrontend を Scala で書いた話
Scala@SmartNews AdFrontend を Scala で書いた話Scala@SmartNews AdFrontend を Scala で書いた話
Scala@SmartNews AdFrontend を Scala で書いた話Keiji Muraishi
 
ビズリーチの新サービスをScalaで作ってみた 〜マイクロサービスの裏側 #jissenscala
ビズリーチの新サービスをScalaで作ってみた 〜マイクロサービスの裏側 #jissenscalaビズリーチの新サービスをScalaで作ってみた 〜マイクロサービスの裏側 #jissenscala
ビズリーチの新サービスをScalaで作ってみた 〜マイクロサービスの裏側 #jissenscalatakezoe
 
あなたのScalaを爆速にする7つの方法
あなたのScalaを爆速にする7つの方法あなたのScalaを爆速にする7つの方法
あなたのScalaを爆速にする7つの方法x1 ichi
 
Scala Warrior and type-safe front-end development with Scala.js
Scala Warrior and type-safe front-end development with Scala.jsScala Warrior and type-safe front-end development with Scala.js
Scala Warrior and type-safe front-end development with Scala.jstakezoe
 

Viewers also liked (14)

Xitrum Web Framework Live Coding Demos / Xitrum Web Framework ライブコーディング
Xitrum Web Framework Live Coding Demos / Xitrum Web Framework ライブコーディングXitrum Web Framework Live Coding Demos / Xitrum Web Framework ライブコーディング
Xitrum Web Framework Live Coding Demos / Xitrum Web Framework ライブコーディング
 
Solid And Sustainable Development in Scala
Solid And Sustainable Development in ScalaSolid And Sustainable Development in Scala
Solid And Sustainable Development in Scala
 
Scalable Generator: Using Scala in SIer Business (ScalaMatsuri)
Scalable Generator: Using Scala in SIer Business (ScalaMatsuri)Scalable Generator: Using Scala in SIer Business (ScalaMatsuri)
Scalable Generator: Using Scala in SIer Business (ScalaMatsuri)
 
[ScalaMatsuri] グリー初のscalaプロダクト!チャットサービス公開までの苦労と工夫
[ScalaMatsuri] グリー初のscalaプロダクト!チャットサービス公開までの苦労と工夫[ScalaMatsuri] グリー初のscalaプロダクト!チャットサービス公開までの苦労と工夫
[ScalaMatsuri] グリー初のscalaプロダクト!チャットサービス公開までの苦労と工夫
 
GitBucket: The perfect Github clone by Scala
GitBucket: The perfect Github clone by ScalaGitBucket: The perfect Github clone by Scala
GitBucket: The perfect Github clone by Scala
 
sbt, past and future / sbt, 傾向と対策
sbt, past and future / sbt, 傾向と対策sbt, past and future / sbt, 傾向と対策
sbt, past and future / sbt, 傾向と対策
 
Node.js vs Play Framework (with Japanese subtitles)
Node.js vs Play Framework (with Japanese subtitles)Node.js vs Play Framework (with Japanese subtitles)
Node.js vs Play Framework (with Japanese subtitles)
 
芸者東京とScala〜おみせやさんから脳トレクエストまでの軌跡〜
芸者東京とScala〜おみせやさんから脳トレクエストまでの軌跡〜芸者東京とScala〜おみせやさんから脳トレクエストまでの軌跡〜
芸者東京とScala〜おみせやさんから脳トレクエストまでの軌跡〜
 
Scala が支える医療系ウェブサービス #jissenscala
Scala が支える医療系ウェブサービス #jissenscalaScala が支える医療系ウェブサービス #jissenscala
Scala が支える医療系ウェブサービス #jissenscala
 
Scala@SmartNews_20150221
Scala@SmartNews_20150221Scala@SmartNews_20150221
Scala@SmartNews_20150221
 
Scala@SmartNews AdFrontend を Scala で書いた話
Scala@SmartNews AdFrontend を Scala で書いた話Scala@SmartNews AdFrontend を Scala で書いた話
Scala@SmartNews AdFrontend を Scala で書いた話
 
ビズリーチの新サービスをScalaで作ってみた 〜マイクロサービスの裏側 #jissenscala
ビズリーチの新サービスをScalaで作ってみた 〜マイクロサービスの裏側 #jissenscalaビズリーチの新サービスをScalaで作ってみた 〜マイクロサービスの裏側 #jissenscala
ビズリーチの新サービスをScalaで作ってみた 〜マイクロサービスの裏側 #jissenscala
 
あなたのScalaを爆速にする7つの方法
あなたのScalaを爆速にする7つの方法あなたのScalaを爆速にする7つの方法
あなたのScalaを爆速にする7つの方法
 
Scala Warrior and type-safe front-end development with Scala.js
Scala Warrior and type-safe front-end development with Scala.jsScala Warrior and type-safe front-end development with Scala.js
Scala Warrior and type-safe front-end development with Scala.js
 

Similar to Introduction to Spark SQL and Catalyst / Spark SQLおよびCalalystの紹介

20140908 spark sql & catalyst
20140908 spark sql & catalyst20140908 spark sql & catalyst
20140908 spark sql & catalystTakuya UESHIN
 
Spark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van HovellSpark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van HovellSpark Summit
 
Introduction to Spark - DataFactZ
Introduction to Spark - DataFactZIntroduction to Spark - DataFactZ
Introduction to Spark - DataFactZDataFactZ
 
Spark SQL In Depth www.syedacademy.com
Spark SQL In Depth www.syedacademy.comSpark SQL In Depth www.syedacademy.com
Spark SQL In Depth www.syedacademy.comSyed Hadoop
 
Spark Sql and DataFrame
Spark Sql and DataFrameSpark Sql and DataFrame
Spark Sql and DataFramePrashant Gupta
 
Apache Spark - Intro to Large-scale recommendations with Apache Spark and Python
Apache Spark - Intro to Large-scale recommendations with Apache Spark and PythonApache Spark - Intro to Large-scale recommendations with Apache Spark and Python
Apache Spark - Intro to Large-scale recommendations with Apache Spark and PythonChristian Perone
 
Hyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache SparkHyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache SparkDatabricks
 
Apache spark sneha challa- google pittsburgh-aug 25th
Apache spark  sneha challa- google pittsburgh-aug 25thApache spark  sneha challa- google pittsburgh-aug 25th
Apache spark sneha challa- google pittsburgh-aug 25thSneha Challa
 
Spark - The Ultimate Scala Collections by Martin Odersky
Spark - The Ultimate Scala Collections by Martin OderskySpark - The Ultimate Scala Collections by Martin Odersky
Spark - The Ultimate Scala Collections by Martin OderskySpark Summit
 
ScalaTo July 2019 - No more struggles with Apache Spark workloads in production
ScalaTo July 2019 - No more struggles with Apache Spark workloads in productionScalaTo July 2019 - No more struggles with Apache Spark workloads in production
ScalaTo July 2019 - No more struggles with Apache Spark workloads in productionChetan Khatri
 
Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015Stratio
 
Spark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupSpark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupDatabricks
 
Jump Start on Apache Spark 2.2 with Databricks
Jump Start on Apache Spark 2.2 with DatabricksJump Start on Apache Spark 2.2 with Databricks
Jump Start on Apache Spark 2.2 with DatabricksAnyscale
 
Introduction to Spark Datasets - Functional and relational together at last
Introduction to Spark Datasets - Functional and relational together at lastIntroduction to Spark Datasets - Functional and relational together at last
Introduction to Spark Datasets - Functional and relational together at lastHolden Karau
 
Fast federated SQL with Apache Calcite
Fast federated SQL with Apache CalciteFast federated SQL with Apache Calcite
Fast federated SQL with Apache CalciteChris Baynes
 

Similar to Introduction to Spark SQL and Catalyst / Spark SQLおよびCalalystの紹介 (20)

20140908 spark sql & catalyst
20140908 spark sql & catalyst20140908 spark sql & catalyst
20140908 spark sql & catalyst
 
Spark sql meetup
Spark sql meetupSpark sql meetup
Spark sql meetup
 
Spark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van HovellSpark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van Hovell
 
Introduction to Spark - DataFactZ
Introduction to Spark - DataFactZIntroduction to Spark - DataFactZ
Introduction to Spark - DataFactZ
 
Meetup ml spark_ppt
Meetup ml spark_pptMeetup ml spark_ppt
Meetup ml spark_ppt
 
Spark SQL In Depth www.syedacademy.com
Spark SQL In Depth www.syedacademy.comSpark SQL In Depth www.syedacademy.com
Spark SQL In Depth www.syedacademy.com
 
Spark Sql and DataFrame
Spark Sql and DataFrameSpark Sql and DataFrame
Spark Sql and DataFrame
 
Catalyst optimizer
Catalyst optimizerCatalyst optimizer
Catalyst optimizer
 
Apache Spark - Intro to Large-scale recommendations with Apache Spark and Python
Apache Spark - Intro to Large-scale recommendations with Apache Spark and PythonApache Spark - Intro to Large-scale recommendations with Apache Spark and Python
Apache Spark - Intro to Large-scale recommendations with Apache Spark and Python
 
Hyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache SparkHyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache Spark
 
Apache spark sneha challa- google pittsburgh-aug 25th
Apache spark  sneha challa- google pittsburgh-aug 25thApache spark  sneha challa- google pittsburgh-aug 25th
Apache spark sneha challa- google pittsburgh-aug 25th
 
Dive into PySpark
Dive into PySparkDive into PySpark
Dive into PySpark
 
Spark - The Ultimate Scala Collections by Martin Odersky
Spark - The Ultimate Scala Collections by Martin OderskySpark - The Ultimate Scala Collections by Martin Odersky
Spark - The Ultimate Scala Collections by Martin Odersky
 
ScalaTo July 2019 - No more struggles with Apache Spark workloads in production
ScalaTo July 2019 - No more struggles with Apache Spark workloads in productionScalaTo July 2019 - No more struggles with Apache Spark workloads in production
ScalaTo July 2019 - No more struggles with Apache Spark workloads in production
 
Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015
 
Spark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupSpark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark Meetup
 
Jump Start on Apache Spark 2.2 with Databricks
Jump Start on Apache Spark 2.2 with DatabricksJump Start on Apache Spark 2.2 with Databricks
Jump Start on Apache Spark 2.2 with Databricks
 
Introduction to Spark Datasets - Functional and relational together at last
Introduction to Spark Datasets - Functional and relational together at lastIntroduction to Spark Datasets - Functional and relational together at last
Introduction to Spark Datasets - Functional and relational together at last
 
Spark core
Spark coreSpark core
Spark core
 
Fast federated SQL with Apache Calcite
Fast federated SQL with Apache CalciteFast federated SQL with Apache Calcite
Fast federated SQL with Apache Calcite
 

More from scalaconfjp

脆弱性対策のためのClean Architecture ~脆弱性に対するレジリエンスを確保せよ~
脆弱性対策のためのClean Architecture ~脆弱性に対するレジリエンスを確保せよ~脆弱性対策のためのClean Architecture ~脆弱性に対するレジリエンスを確保せよ~
脆弱性対策のためのClean Architecture ~脆弱性に対するレジリエンスを確保せよ~scalaconfjp
 
Alp x BizReach SaaS事業を営む2社がお互い気になることをゆるゆる聞いてみる会
Alp x BizReach SaaS事業を営む2社がお互い気になることをゆるゆる聞いてみる会Alp x BizReach SaaS事業を営む2社がお互い気になることをゆるゆる聞いてみる会
Alp x BizReach SaaS事業を営む2社がお互い気になることをゆるゆる聞いてみる会scalaconfjp
 
GraalVM Overview Compact version
GraalVM Overview Compact versionGraalVM Overview Compact version
GraalVM Overview Compact versionscalaconfjp
 
Run Scala Faster with GraalVM on any Platform / GraalVMで、どこでもScalaを高速実行しよう by...
Run Scala Faster with GraalVM on any Platform / GraalVMで、どこでもScalaを高速実行しよう by...Run Scala Faster with GraalVM on any Platform / GraalVMで、どこでもScalaを高速実行しよう by...
Run Scala Faster with GraalVM on any Platform / GraalVMで、どこでもScalaを高速実行しよう by...scalaconfjp
 
Monitoring Reactive Architecture Like Never Before / 今までになかったリアクティブアーキテクチャの監視...
Monitoring Reactive Architecture Like Never Before / 今までになかったリアクティブアーキテクチャの監視...Monitoring Reactive Architecture Like Never Before / 今までになかったリアクティブアーキテクチャの監視...
Monitoring Reactive Architecture Like Never Before / 今までになかったリアクティブアーキテクチャの監視...scalaconfjp
 
Scala 3, what does it means for me? / Scala 3って、私にはどんな影響があるの? by Joan Goyeau
Scala 3, what does it means for me? / Scala 3って、私にはどんな影響があるの? by Joan GoyeauScala 3, what does it means for me? / Scala 3って、私にはどんな影響があるの? by Joan Goyeau
Scala 3, what does it means for me? / Scala 3って、私にはどんな影響があるの? by Joan Goyeauscalaconfjp
 
Functional Object-Oriented Imperative Scala / 関数型オブジェクト指向命令型 Scala by Sébasti...
Functional Object-Oriented Imperative Scala / 関数型オブジェクト指向命令型 Scala by Sébasti...Functional Object-Oriented Imperative Scala / 関数型オブジェクト指向命令型 Scala by Sébasti...
Functional Object-Oriented Imperative Scala / 関数型オブジェクト指向命令型 Scala by Sébasti...scalaconfjp
 
Scala ♥ Graal by Flavio Brasil
Scala ♥ Graal by Flavio BrasilScala ♥ Graal by Flavio Brasil
Scala ♥ Graal by Flavio Brasilscalaconfjp
 
Introduction to GraphQL in Scala
Introduction to GraphQL in ScalaIntroduction to GraphQL in Scala
Introduction to GraphQL in Scalascalaconfjp
 
Safety Beyond Types
Safety Beyond TypesSafety Beyond Types
Safety Beyond Typesscalaconfjp
 
Reactive Kafka with Akka Streams
Reactive Kafka with Akka StreamsReactive Kafka with Akka Streams
Reactive Kafka with Akka Streamsscalaconfjp
 
Reactive microservices with play and akka
Reactive microservices with play and akkaReactive microservices with play and akka
Reactive microservices with play and akkascalaconfjp
 
Scalaに対して意識の低いエンジニアがScalaで何したかの話, by 芸者東京エンターテインメント
Scalaに対して意識の低いエンジニアがScalaで何したかの話, by 芸者東京エンターテインメントScalaに対して意識の低いエンジニアがScalaで何したかの話, by 芸者東京エンターテインメント
Scalaに対して意識の低いエンジニアがScalaで何したかの話, by 芸者東京エンターテインメントscalaconfjp
 
DWANGO by ドワンゴ
DWANGO by ドワンゴDWANGO by ドワンゴ
DWANGO by ドワンゴscalaconfjp
 
OCTOPARTS by M3, Inc.
OCTOPARTS by M3, Inc.OCTOPARTS by M3, Inc.
OCTOPARTS by M3, Inc.scalaconfjp
 
Try using Aeromock by Marverick, Inc.
Try using Aeromock by Marverick, Inc.Try using Aeromock by Marverick, Inc.
Try using Aeromock by Marverick, Inc.scalaconfjp
 
統計をとって高速化する
Scala開発 by CyberZ,Inc.
統計をとって高速化する
Scala開発 by CyberZ,Inc.統計をとって高速化する
Scala開発 by CyberZ,Inc.
統計をとって高速化する
Scala開発 by CyberZ,Inc.scalaconfjp
 
Short Introduction of Implicit Conversion by TIS, Inc.
Short Introduction of Implicit Conversion by TIS, Inc.Short Introduction of Implicit Conversion by TIS, Inc.
Short Introduction of Implicit Conversion by TIS, Inc.scalaconfjp
 
ビズリーチ x ScalaMatsuri by BIZREACH, Inc.
ビズリーチ x ScalaMatsuri  by BIZREACH, Inc.ビズリーチ x ScalaMatsuri  by BIZREACH, Inc.
ビズリーチ x ScalaMatsuri by BIZREACH, Inc.scalaconfjp
 
Solid and Sustainable Development in Scala
Solid and Sustainable Development in ScalaSolid and Sustainable Development in Scala
Solid and Sustainable Development in Scalascalaconfjp
 

More from scalaconfjp (20)

脆弱性対策のためのClean Architecture ~脆弱性に対するレジリエンスを確保せよ~
脆弱性対策のためのClean Architecture ~脆弱性に対するレジリエンスを確保せよ~脆弱性対策のためのClean Architecture ~脆弱性に対するレジリエンスを確保せよ~
脆弱性対策のためのClean Architecture ~脆弱性に対するレジリエンスを確保せよ~
 
Alp x BizReach SaaS事業を営む2社がお互い気になることをゆるゆる聞いてみる会
Alp x BizReach SaaS事業を営む2社がお互い気になることをゆるゆる聞いてみる会Alp x BizReach SaaS事業を営む2社がお互い気になることをゆるゆる聞いてみる会
Alp x BizReach SaaS事業を営む2社がお互い気になることをゆるゆる聞いてみる会
 
GraalVM Overview Compact version
GraalVM Overview Compact versionGraalVM Overview Compact version
GraalVM Overview Compact version
 
Run Scala Faster with GraalVM on any Platform / GraalVMで、どこでもScalaを高速実行しよう by...
Run Scala Faster with GraalVM on any Platform / GraalVMで、どこでもScalaを高速実行しよう by...Run Scala Faster with GraalVM on any Platform / GraalVMで、どこでもScalaを高速実行しよう by...
Run Scala Faster with GraalVM on any Platform / GraalVMで、どこでもScalaを高速実行しよう by...
 
Monitoring Reactive Architecture Like Never Before / 今までになかったリアクティブアーキテクチャの監視...
Monitoring Reactive Architecture Like Never Before / 今までになかったリアクティブアーキテクチャの監視...Monitoring Reactive Architecture Like Never Before / 今までになかったリアクティブアーキテクチャの監視...
Monitoring Reactive Architecture Like Never Before / 今までになかったリアクティブアーキテクチャの監視...
 
Scala 3, what does it means for me? / Scala 3って、私にはどんな影響があるの? by Joan Goyeau
Scala 3, what does it means for me? / Scala 3って、私にはどんな影響があるの? by Joan GoyeauScala 3, what does it means for me? / Scala 3って、私にはどんな影響があるの? by Joan Goyeau
Scala 3, what does it means for me? / Scala 3って、私にはどんな影響があるの? by Joan Goyeau
 
Functional Object-Oriented Imperative Scala / 関数型オブジェクト指向命令型 Scala by Sébasti...
Functional Object-Oriented Imperative Scala / 関数型オブジェクト指向命令型 Scala by Sébasti...Functional Object-Oriented Imperative Scala / 関数型オブジェクト指向命令型 Scala by Sébasti...
Functional Object-Oriented Imperative Scala / 関数型オブジェクト指向命令型 Scala by Sébasti...
 
Scala ♥ Graal by Flavio Brasil
Scala ♥ Graal by Flavio BrasilScala ♥ Graal by Flavio Brasil
Scala ♥ Graal by Flavio Brasil
 
Introduction to GraphQL in Scala
Introduction to GraphQL in ScalaIntroduction to GraphQL in Scala
Introduction to GraphQL in Scala
 
Safety Beyond Types
Safety Beyond TypesSafety Beyond Types
Safety Beyond Types
 
Reactive Kafka with Akka Streams
Reactive Kafka with Akka StreamsReactive Kafka with Akka Streams
Reactive Kafka with Akka Streams
 
Reactive microservices with play and akka
Reactive microservices with play and akkaReactive microservices with play and akka
Reactive microservices with play and akka
 
Scalaに対して意識の低いエンジニアがScalaで何したかの話, by 芸者東京エンターテインメント
Scalaに対して意識の低いエンジニアがScalaで何したかの話, by 芸者東京エンターテインメントScalaに対して意識の低いエンジニアがScalaで何したかの話, by 芸者東京エンターテインメント
Scalaに対して意識の低いエンジニアがScalaで何したかの話, by 芸者東京エンターテインメント
 
DWANGO by ドワンゴ
DWANGO by ドワンゴDWANGO by ドワンゴ
DWANGO by ドワンゴ
 
OCTOPARTS by M3, Inc.
OCTOPARTS by M3, Inc.OCTOPARTS by M3, Inc.
OCTOPARTS by M3, Inc.
 
Try using Aeromock by Marverick, Inc.
Try using Aeromock by Marverick, Inc.Try using Aeromock by Marverick, Inc.
Try using Aeromock by Marverick, Inc.
 
統計をとって高速化する
Scala開発 by CyberZ,Inc.
統計をとって高速化する
Scala開発 by CyberZ,Inc.統計をとって高速化する
Scala開発 by CyberZ,Inc.
統計をとって高速化する
Scala開発 by CyberZ,Inc.
 
Short Introduction of Implicit Conversion by TIS, Inc.
Short Introduction of Implicit Conversion by TIS, Inc.Short Introduction of Implicit Conversion by TIS, Inc.
Short Introduction of Implicit Conversion by TIS, Inc.
 
ビズリーチ x ScalaMatsuri by BIZREACH, Inc.
ビズリーチ x ScalaMatsuri  by BIZREACH, Inc.ビズリーチ x ScalaMatsuri  by BIZREACH, Inc.
ビズリーチ x ScalaMatsuri by BIZREACH, Inc.
 
Solid and Sustainable Development in Scala
Solid and Sustainable Development in ScalaSolid and Sustainable Development in Scala
Solid and Sustainable Development in Scala
 

Recently uploaded

why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....ShaimaaMohamedGalal
 

Recently uploaded (20)

why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 

Introduction to Spark SQL and Catalyst / Spark SQLおよびCalalystの紹介

  • 1. Introduction to SparkSQL & Catalyst Takuya UESHIN ! ScalaMatsuri2014 2014/09/06(Sat)
  • 2. Who am I? Takuya UESHIN @ueshin github.com/ueshin Nautilus Technologies, Inc. A Spark contributor 2
  • 3. Agenda What is SparkSQL? Catalyst in depth SparkSQL core in depth Interesting issues How to contribute 3
  • 5. What is SparkSQL? SparkSQL is one of Spark components. Executes SQL on Spark Builds SchemaRDD like LINQ Optimizes execution plan. 5
  • 6. What is SparkSQL? Catalyst provides a execution planning framework for relational operations. Including: SQL parser & analyzer Logical operators & general expressions Logical optimizer A framework to transform operator tree. 6
  • 7. What is SparkSQL? To execute query needs some steps. Parse Analyze Logical Plan Optimize Physical Plan Execute 7
  • 8. What is SparkSQL? To execute query needs some steps. Parse Analyze Logical Plan Optimize Physical Plan Execute 8 Catalyst
  • 9. What is SparkSQL? To execute query needs some steps. Parse Analyze Logical Plan Optimize Physical Plan Execute 9 SQL core
  • 11. Catalyst in depth Provides a execution planning framework for relational operations. Row & DataType’s Trees & Rules Logical Operators Expressions Optimizations 11
  • 12. Row & DataType’s o.a.s.sql.catalyst.types.DataType Long, Int, Short, Byte, Float, Double, Decimal String, Binary, Boolean, Timestamp Array, Map, Struct o.a.s.sql.catalyst.expressions.Row Represents a single row. Can contain complex types. 12
  • 13. Trees & Rules o.a.s.sql.catalyst.trees.TreeNode Provides transformations of tree. foreach, map, flatMap, collect transform, transformUp, transformDown Used for operator tree, expression tree. 13
  • 14. Trees & Rules o.a.s.sql.catalyst.rules.Rule Represents a tree transform rule. o.a.s.sql.catalyst.rules.RuleExecutor A framework to transform trees based on rules. 14
  • 15. Logical Operators Basic Operators Project, Filter, … Binary Operators Join, Except, Intersect, Union, … Aggregate Generate, Distinct Sort, Limit InsertInto, WriteToFile 15 Project Filter Join Table Table
  • 16. Expressions Literal Arithmetics UnaryMinus, Sqrt, MaxOf Add, Subtract, Multiply, … Predicates EqualTo, LessThan, LessThanOrEqual, GreaterThan, GreaterThanOrEqual Not, And, Or, In, If, CaseWhen 16 + 1 2
  • 17. Expressions Cast GetItem, GetField Coalesce, IsNull, IsNotNull StringOperations Like, Upper, Lower, Contains, StartsWith, EndsWith, Substring, … 17
  • 18. Optimizations ConstantFolding NullPropagation ConstantFolding BooleanSimplification SimplifyFilters FilterPushdown CombineFilters PushPredicateThroughProject PushPredicateThroughJoin ColumnPruning 18
  • 19. Optimizations NullPropagation, ConstantFolding Replace expressions that can be evaluated with some literal value to the value. ex) 1 + null => null 1 + 2 => 3 Count(null) => 0 19
  • 20. Optimizations BooleanSimplification Simplifies boolean expressions that can be determined. ex) false AND $right => false true AND $right => $right true OR $right => true false OR $right => $right If(true, $then, $else) => $then 20
  • 21. Optimizations SimplifyFilters Removes filters that can be evaluated trivially. ex) Filter(true, child) => child Filter(false, child) => empty 21
  • 22. Optimizations CombineFilters Merges two filters. ex) Filter($fc, Filter($nc, child)) => Filter(AND($fc, $nc), child) 22
  • 23. Optimizations PushPredicateThroughProject Pushes Filter operators through Project operator. ex) Filter(‘i === 1, Project(‘i, ‘j, child)) => Project(‘i, ‘j, Filter(‘i === 1, child)) 23
  • 24. Optimizations PushPredicateThroughJoin Pushes Filter operators through Join operator. ex) Project(“left.i”.attr === 1, Join(left, right) => Join(Project(‘i === 1, left), right) 24
  • 25. Optimizations ColumnPruning Eliminates the reading of unused columns. ex) Join(left, right, LeftSemi, “left.id”.attr === “right.id”.attr) => Join(Project(‘id, left), right, LeftSemi) 25
  • 26. SQL core in depth
  • 27. SQL core in depth Provides: Physical operators to build RDD Conversion from Existing RDD of Product to SchemaRDD support Parquet file read/write support JSON file read support Columnar in-memory table support 27
  • 28. SQL core in depth o.a.s.sql.SchemaRDD Extends RDD[Row]. Has logical plan tree. Provides LINQ-like interfaces to construct logical plan. select, where, join, orderBy, … Executes the plan. 28
  • 29. SQL core in depth o.a.s.sql.execution.SparkStrategies Converts logical plan to physical. Some rules are based on statistics of the operators. 29
  • 30. SQL core in depth Parquet read/write support Parquet: Columnar storage format for Hadoop Reads existing Parquet files. Converts Parquet schema to row schema. Writes new Parquet files. Currently DecimalType and TimestampType are not supported. 30
  • 31. SQL core in depth JSON read support Loads a JSON file (one object per line) Infers row schema from the entire dataset. Giving the schema is experimental. Inferring the schema by sampling is also experimental. 31
  • 32. SQL core in depth Columnar in-memory table support Caches table like RDD.cache, but as columnar style. Can prune unnecessary columns when read data. 32
  • 34. Interesting issues Support the GroupingSet/ROLLUP/CUBE https://issues.apache.org/jira/browse/SPARK-2663 Use statistics to skip partitions when reading from in-memory columnar data https://issues.apache.org/jira/browse/SPARK-2961 34
  • 35. Interesting issues Pluggable interface for shuffles https://issues.apache.org/jira/browse/SPARK-2044 Sort-merge join https://issues.apache.org/jira/browse/SPARK-2213 Cost-based join reordering https://issues.apache.org/jira/browse/SPARK-2216 35
  • 37. How to contribute See: Contributing to Spark Open an issue on JIRA Send pull-request at GitHub Communicate with committers and reviewers Congratulations! 37
  • 38. Conclusion Introduced SparkSQL & Catalyst Now you know them very well! And you know how to contribute. ! Let’s contribute to Spark & SparkSQL!! 38