Apache kylin - Big Data Technology Conference 2014 Beijing

Luke Han
Luke HanCo-Founder & CEO at Kyligence Inc. em Kyligence
http://kylin.io
Apache Kylin Introduction
Dec 14, 2014
韩卿|Luke Han
Co-creator of Apache Kylin | lukehan@apache.org
Sr. Product Manager, eBay CCOE
http://kylin.io
Agenda
 What’s Apache Kylin?
 Feature & Tech Highlights
 Performance
 Open Source & Roadmap
 Q & A
http://kylin.io
Extreme OLAP Engine for Big Data
Kylin is an open source Distributed Analytics Engine from eBay
that provides SQL interface and multi-dimensional analysis
(OLAP) on Hadoop supporting extremely large datasets
What’s Kylin
kylin / ˈkiːˈlɪn / 麒麟
--n. (in Chinese art) a mythical animal of composite form
• Open Sourced on Oct 1st, 2014
• Be accepted as Apache Incubator Project on Nov 25th, 2014
http://kylin.io
Big Data Era
 More and more data becoming available on Hadoop
 Limitations in existing Business Intelligence (BI) Tools
 Limited support for Hadoop
 Data size growing exponentially
 High latency of interactive queries
 Scale-Up architecture
 Challenges to adopt Hadoop as interactive analysis system
 Majority of analyst groups are SQL savvy
 No mature SQL interface on Hadoop
 OLAP capability on Hadoop ecosystem not ready yet
http://kylin.io
Business Needs for Big Data Analysis
 Sub-second query latency on billions of rows
 ANSI SQL for both analysts and engineers
 Full OLAP capability to offer advanced functionality
 Seamless Integration with BI Tools
 Support of high cardinality and high dimensions
 High concurrency – thousands of end users
 Distributed and scale out architecture for large data volume
http://kylin.io6
Why not
Build an engine from scratch?
http://kylin.io
Transaction
Operation
Strategy
High Level
Aggregation
•Very High Level, e.g GMV by
site by vertical by weeks
Analysis
Query
•Middle level, e.g GMV by site by vertical,
by category (level x) past 12 weeks
Drill Down
to Detail
•Detail Level (Summary Table)
Low Level
Aggregation
•First Level
Aggragation
Transaction
Level
•Transaction Data
Analytics Query Taxonomy
OLAP
Kylin is designed to accelerate 80+% analytics queries performance on Hadoop
OLTP
http://kylin.io
 Huge volume data
 Table scan
 Big table joins
 Data shuffling
 Analysis on different granularity
 Runtime aggregation expensive
 Map Reduce job
 Batch processing
Technical Challenges
http://kylin.io
OLAP Cube – Balance between Space and Time
time, item
time, item, location
time, item, location, supplier
time item location supplier
time, location
Time, supplier
item, location
item, supplier
location, supplier
time, item, supplier
time, location, supplier
item, location, supplier
0-D(apex) cuboid
1-D cuboids
2-D cuboids
3-D cuboids
4-D(base) cuboid
• Base vs. aggregate cells; ancestor vs. descendant cells; parent vs. child cells
1. (9/15, milk, Urbana, Dairy_land) - <time, item, location, supplier>
2. (9/15, milk, Urbana, *) - <time, item, location>
3. (*, milk, Urbana, *) - <item, location>
4. (*, milk, Chicago, *) - <item, location>
5. (*, milk, *, *) - <item>
• Cuboid = one combination of dimensions
• Cube = all combination of dimensions (all cuboids)
http://kylin.io
From Relational to Key-Value
http://kylin.io
Kylin Architecture Overview
11
Cube Build Engine
(MapReduce…)
SQL
Low Latency -
Seconds
Mid Latency - Minutes
Routing
3rd Party App
(Web App, Mobile…)
Metadata
SQL-Based Tool
(BI Tools: Tableau…)
Query Engine
Hadoop
Hive
REST API JDBC/ODBC
 Online Analysis Data Flow
 Offline Data Flow
 Clients/Users interactive with
Kylin via SQL
 OLAP Cube is transparent to
users
Star Schema Data Key Value Data
Data
Cube
OLAP
Cube
(HBase)
SQL
REST Server
http://kylin.io
 Hive
 Input source
 Pre-join star schema during cube building
 MapReduce
 Pre-aggregation metrics during cube building
 HDFS
 Store intermediated files during cube building.
 HBase
 Store data cube.
 Serve query on data cube.
 Coprocessor is used for query processing.
How Does Kylin Utilize Hadoop Components?
http://kylin.io
Agenda
 What’s Apache Kylin?
 Feature & Tech Highlights
 Performance
 Open Source & Roadmap
 Q & A
http://kylin.io
 Extremely Fast OLAP Engine at Scale
Kylin is designed to reduce query latency on Hadoop for 10+ billions of rows of data
 ANSI SQL Interface on Hadoop
Kylin offers ANSI SQL on Hadoop and supports most ANSI SQL query functions
 Seamless Integration with BI Tools
Kylin currently offers integration capability with BI Tools like Tableau.
 Interactive Query Capability
Users can interact with Hadoop data via Kylin at sub-second latency, better than Hive
queries for the same dataset
 MOLAP Cube
User can define a data model and pre-build in Kylin with more than 10+ billions of raw
data records
Features Highlights
http://kylin.io
 Compression and Encoding Support
 Incremental Refresh of Cubes
 Approximate Query Capability for distinct Count (HyperLogLog)
 Leverage HBase Coprocessor for query latency
 Job Management and Monitoring
 Easy Web interface to manage, build, monitor and query cubes
 Security capability to set ACL at Cube/Project Level
 Support LDAP Integration
Features Highlights…
http://kylin.io
Cube Designer
http://kylin.io
Job Management
http://kylin.io
Query and Visualization
http://kylin.io
Tableau Integration
http://kylin.io
Data Modeling
Cube: …
Fact Table: …
Dimensions: …
Measures: …
Storage(HBase): …Fact
Dim Dim
Dim
Source
Star Schema
row A
row B
row C
Column Family
Val 1
Val 2
Val 3
Row Key Column
Target
HBase Storage
Mapping
Cube Metadata
End User Cube Modeler Admin
http://kylin.io
Cube Build Job Flow
http://kylin.io
How To Store Cube? – HBase Schema
http://kylin.io
Query Engine – Calcite (Optiq)
 Dynamic data management framework.
 Formerly known as Optiq, Calcite is an Apache incubator project, used by
Apache Drill and Apache Hive, among others.
 http://optiq.incubator.apache.org
http://kylin.io
Query Engine – Kylin Explain Plan
SELECT test_cal_dt.week_beg_dt, test_category.category_name, test_category.lvl2_name, test_category.lvl3_name,
test_kylin_fact.lstg_format_name, test_sites.site_name, SUM(test_kylin_fact.price) AS GMV, COUNT(*) AS TRANS_CNT
FROM test_kylin_fact
LEFT JOIN test_cal_dt ON test_kylin_fact.cal_dt = test_cal_dt.cal_dt
LEFT JOIN test_category ON test_kylin_fact.leaf_categ_id = test_category.leaf_categ_id AND test_kylin_fact.lstg_site_id =
test_category.site_id
LEFT JOIN test_sites ON test_kylin_fact.lstg_site_id = test_sites.site_id
WHERE test_kylin_fact.seller_id = 123456OR test_kylin_fact.lstg_format_name = ’New'
GROUP BY test_cal_dt.week_beg_dt, test_category.category_name, test_category.lvl2_name, test_category.lvl3_name,
test_kylin_fact.lstg_format_name,test_sites.site_name
OLAPToEnumerableConverter
OLAPProjectRel(WEEK_BEG_DT=[$0], category_name=[$1], CATEG_LVL2_NAME=[$2], CATEG_LVL3_NAME=[$3],
LSTG_FORMAT_NAME=[$4], SITE_NAME=[$5], GMV=[CASE(=($7, 0), null, $6)], TRANS_CNT=[$8])
OLAPAggregateRel(group=[{0, 1, 2, 3, 4, 5}], agg#0=[$SUM0($6)], agg#1=[COUNT($6)], TRANS_CNT=[COUNT()])
OLAPProjectRel(WEEK_BEG_DT=[$13], category_name=[$21], CATEG_LVL2_NAME=[$15], CATEG_LVL3_NAME=[$14],
LSTG_FORMAT_NAME=[$5], SITE_NAME=[$23], PRICE=[$0])
OLAPFilterRel(condition=[OR(=($3, 123456), =($5, ’New'))])
OLAPJoinRel(condition=[=($2, $25)], joinType=[left])
OLAPJoinRel(condition=[AND(=($6, $22), =($2, $17))], joinType=[left])
OLAPJoinRel(condition=[=($4, $12)], joinType=[left])
OLAPTableScan(table=[[DEFAULT, TEST_KYLIN_FACT]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]])
OLAPTableScan(table=[[DEFAULT, TEST_CAL_DT]], fields=[[0, 1]])
OLAPTableScan(table=[[DEFAULT, test_category]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]])
OLAPTableScan(table=[[DEFAULT, TEST_SITES]], fields=[[0, 1, 2]])
http://kylin.io
 Full Cube
 Pre-aggregate all dimension combinations
 “Curse of dimensionality”: N dimension cube has 2N cuboid.
 Partial Cube
 To avoid dimension explosion, we divide the dimensions into
different aggregation groups
 2N+M+L  2N + 2M + 2L
 For cube with 30 dimensions, if we divide these dimensions into 3
group, the cuboid number will reduce from 1 Billion to 3 Thousands
 230  210 + 210 + 210
 Tradeoff between online aggregation and offline pre-aggregation
How To Optimize Cube? – Full Cube vs. Partial
Cube
http://kylin.io
How To Optimize Cube? – Partial Cube
http://kylin.io
How To Optimize Cube? – Incremental Building
http://kylin.io
Inverted Index
 Challenge
 Has no raw data records
 Slow table scan on high cardinality dimensions
 Inverted Index Storage (an ongoing effort)
 Persist the raw table
 Bitmap inverted index
 Time range partition
 In-memory (block cache)
 Parallel scan (endpoint coprocessor)
http://kylin.io
Agenda
 What’s Apache Kylin?
 Feature & Tech Highlights
 Performance
 Open Source & Roadmap
 Q & A
http://kylin.io
Kylin vs. Hive
# Query
Type
Return Dataset Query
On Kylin (s)
Query
On Hive (s)
Comments
1 High Level
Aggregation
4 0.129 157.437 1,217 times
2 Analysis Query 22,669 1.615 109.206 68 times
3 Drill Down to
Detail
325,029 12.058 113.123 9 times
4 Drill Down to
Detail
524,780 22.42 6383.21 278 times
5 Data Dump 972,002 49.054 N/A
0
50
100
150
200
SQL #1 SQL #2 SQL #3
Hive
Kylin
High Level
Aggregatio
n
Analysis
Query
Drill Down
to Detail
Low Level
Aggregatio
n
Transactio
n Level
Based on 12+B records case
http://kylin.io
Performance -- Concurrency
Linear scale out with more nodes
http://kylin.io
Performance - Query Latency
90%tile queries <5s
Green Line: 90%tile queries
Gray Line: 95%tile queries
http://kylin.io
Agenda
 What’s Apache Kylin?
 Feature & Tech Highlights
 Performance
 Open Source & Roadmap
 Q & A
http://kylin.io
 Kylin Core
 Fundamental framework of
Kylin OLAP Engine
 Extension
 Plugins to support for
additional functions and
features
 Integration
 Lifecycle Management
Support to integrate with
other applications
 Interface
 Allows for third party users to
build more features via user-
interface atop Kylin core
 Driver
 ODBC and JDBC Drivers
Kylin OLAP
Core
Extension
 Security
 Redis Storage
 Spark Engine
 Docker
Interface
 Web Console
 Customized BI
 Ambari/Hue Plugin
Integration
 ODBC Driver
 ETL
 Drill
Kylin Ecosystem
http://kylin.io
Kylin Evolution Roadmap
201520142013
Initial
Prototype
for MOLAP
• Basic end to end
POC
MOLAP
• Incremental
Refresh
• ANSI SQL
• ODBC Driver
• Web GUI
• ACL
• Open Source
HOLAP
• InvertedIndex
• JDBC Driver
• Automation
• Capacity
Management
• Support Spark
• … more
Next Gen
• In-Memory
Analysis
• Real Time &
Streaming (TBD)
• … more
TBD
Future…
Sep, 2013
Jan, 2014
Sep, 2014
Q1, 2015
http://kylin.io
 Kylin Site:
 http://kylin.io
 Twitter:
 @ApacheKylin
 Source Code Repo:
 https://github.com/KylinOLAP
 Google Group:
 Kylin OLAP
Open Source
http://kylin.io
 时间:
 2014-12-14 6:30 PM – 9:00 PM
 地点:
 3W咖啡
Apache Kylin 北京线下交流会
http://kylin.io
Thanks
http://kylin.io
lukehan@apache.org
1 de 38

Recomendados

Apache Kylin Open Source Journey for QCon2015 Beijing por
Apache Kylin Open Source Journey for QCon2015 BeijingApache Kylin Open Source Journey for QCon2015 Beijing
Apache Kylin Open Source Journey for QCon2015 BeijingLuke Han
1.2K visualizações42 slides
Kylin OLAP Engine Tour por
Kylin OLAP Engine TourKylin OLAP Engine Tour
Kylin OLAP Engine TourLuke Han
5.8K visualizações28 slides
Kylin olap part 1- getting started por
Kylin olap   part 1- getting startedKylin olap   part 1- getting started
Kylin olap part 1- getting startedShubham Shirude
687 visualizações21 slides
Apache Kylin Introduction por
Apache Kylin IntroductionApache Kylin Introduction
Apache Kylin IntroductionLuke Han
1.8K visualizações35 slides
6. Apache Kylin Roadmap and Community - Apache Kylin Meetup @Shanghai por
6. Apache Kylin Roadmap and Community - Apache Kylin Meetup @Shanghai6. Apache Kylin Roadmap and Community - Apache Kylin Meetup @Shanghai
6. Apache Kylin Roadmap and Community - Apache Kylin Meetup @ShanghaiLuke Han
1.6K visualizações10 slides
Apache Kylin Extreme OLAP Engine for Big Data por
Apache Kylin Extreme OLAP Engine for Big DataApache Kylin Extreme OLAP Engine for Big Data
Apache Kylin Extreme OLAP Engine for Big DataLuke Han
2.9K visualizações34 slides

Mais conteúdo relacionado

Mais procurados

Adding Spark support to Kylin at Bay Area Spark Meetup por
Adding Spark support to Kylin at Bay Area Spark MeetupAdding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark MeetupLuke Han
1.4K visualizações8 slides
Apache Kylin: Hadoop OLAP Engine, 2014 Dec por
Apache Kylin: Hadoop OLAP Engine, 2014 DecApache Kylin: Hadoop OLAP Engine, 2014 Dec
Apache Kylin: Hadoop OLAP Engine, 2014 DecYang Li
4.4K visualizações42 slides
Apache kylin (china hadoop summit 2015 shanghai) por
Apache kylin (china hadoop summit 2015 shanghai)Apache kylin (china hadoop summit 2015 shanghai)
Apache kylin (china hadoop summit 2015 shanghai)qhzhou
1.1K visualizações38 slides
Big Data MDX with Mondrian and Apache Kylin por
Big Data MDX with Mondrian and Apache KylinBig Data MDX with Mondrian and Apache Kylin
Big Data MDX with Mondrian and Apache Kylininovex GmbH
3.6K visualizações30 slides
The Apache Way - Building Open Source Community in China - Luke Han por
The Apache Way - Building Open Source Community in China - Luke HanThe Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke HanLuke Han
794 visualizações36 slides
Apache Kylin Streaming por
Apache Kylin Streaming Apache Kylin Streaming
Apache Kylin Streaming hongbin ma
1.6K visualizações41 slides

Mais procurados(20)

Adding Spark support to Kylin at Bay Area Spark Meetup por Luke Han
Adding Spark support to Kylin at Bay Area Spark MeetupAdding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark Meetup
Luke Han1.4K visualizações
Apache Kylin: Hadoop OLAP Engine, 2014 Dec por Yang Li
Apache Kylin: Hadoop OLAP Engine, 2014 DecApache Kylin: Hadoop OLAP Engine, 2014 Dec
Apache Kylin: Hadoop OLAP Engine, 2014 Dec
Yang Li4.4K visualizações
Apache kylin (china hadoop summit 2015 shanghai) por qhzhou
Apache kylin (china hadoop summit 2015 shanghai)Apache kylin (china hadoop summit 2015 shanghai)
Apache kylin (china hadoop summit 2015 shanghai)
qhzhou1.1K visualizações
Big Data MDX with Mondrian and Apache Kylin por inovex GmbH
Big Data MDX with Mondrian and Apache KylinBig Data MDX with Mondrian and Apache Kylin
Big Data MDX with Mondrian and Apache Kylin
inovex GmbH3.6K visualizações
The Apache Way - Building Open Source Community in China - Luke Han por Luke Han
The Apache Way - Building Open Source Community in China - Luke HanThe Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke Han
Luke Han794 visualizações
Apache Kylin Streaming por hongbin ma
Apache Kylin Streaming Apache Kylin Streaming
Apache Kylin Streaming
hongbin ma1.6K visualizações
Kylin Engineering Principles por Xu Jiang
Kylin Engineering PrinciplesKylin Engineering Principles
Kylin Engineering Principles
Xu Jiang1.3K visualizações
Apache Kylin on HBase: Extreme OLAP engine for big data por Shi Shao Feng
Apache Kylin on HBase: Extreme OLAP engine for big dataApache Kylin on HBase: Extreme OLAP engine for big data
Apache Kylin on HBase: Extreme OLAP engine for big data
Shi Shao Feng1.6K visualizações
Apache Kylin’s Performance Boost from Apache HBase por HBaseCon
Apache Kylin’s Performance Boost from Apache HBaseApache Kylin’s Performance Boost from Apache HBase
Apache Kylin’s Performance Boost from Apache HBase
HBaseCon3.5K visualizações
Apache Kylin Use Cases in China and Japan por Luke Han
Apache Kylin Use Cases in China and JapanApache Kylin Use Cases in China and Japan
Apache Kylin Use Cases in China and Japan
Luke Han1.2K visualizações
Apache Kylin – Cubes on Hadoop por DataWorks Summit
Apache Kylin – Cubes on HadoopApache Kylin – Cubes on Hadoop
Apache Kylin – Cubes on Hadoop
DataWorks Summit8.5K visualizações
Apache Kylin 1.5 Updates por Yang Li
Apache Kylin 1.5 UpdatesApache Kylin 1.5 Updates
Apache Kylin 1.5 Updates
Yang Li1.1K visualizações
Apache Kylin - Balance between space and time - Hadoop Summit 2015 por Debashis Saha
Apache Kylin -  Balance between space and time - Hadoop Summit 2015Apache Kylin -  Balance between space and time - Hadoop Summit 2015
Apache Kylin - Balance between space and time - Hadoop Summit 2015
Debashis Saha2.8K visualizações
Apache Kylin @ Big Data Europe 2015 por Seshu Adunuthula
Apache Kylin @ Big Data Europe 2015Apache Kylin @ Big Data Europe 2015
Apache Kylin @ Big Data Europe 2015
Seshu Adunuthula2.6K visualizações
Apache Kylin: OLAP Engine on Hadoop - Tech Deep Dive por Xu Jiang
Apache Kylin: OLAP Engine on Hadoop - Tech Deep DiveApache Kylin: OLAP Engine on Hadoop - Tech Deep Dive
Apache Kylin: OLAP Engine on Hadoop - Tech Deep Dive
Xu Jiang33.1K visualizações
Apache kylin 2.0: from classic olap to real-time data warehouse por Yang Li
Apache kylin 2.0: from classic olap to real-time data warehouseApache kylin 2.0: from classic olap to real-time data warehouse
Apache kylin 2.0: from classic olap to real-time data warehouse
Yang Li2K visualizações
Apache Kylin - OLAP Cubes for SQL on Hadoop por Ted Dunning
Apache Kylin - OLAP Cubes for SQL on HadoopApache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on Hadoop
Ted Dunning8.5K visualizações
Datacubes in Apache Hive at ApacheCon por amarsri
Datacubes in Apache Hive at ApacheConDatacubes in Apache Hive at ApacheCon
Datacubes in Apache Hive at ApacheCon
amarsri3.5K visualizações
Design cube in Apache Kylin por Yang Li
Design cube in Apache KylinDesign cube in Apache Kylin
Design cube in Apache Kylin
Yang Li13.2K visualizações
Apache Kylin - Balance Between Space and Time por DataWorks Summit
Apache Kylin - Balance Between Space and TimeApache Kylin - Balance Between Space and Time
Apache Kylin - Balance Between Space and Time
DataWorks Summit2.5K visualizações

Similar a Apache kylin - Big Data Technology Conference 2014 Beijing

HBaseCon 2015: Apache Kylin - Extreme OLAP Engine for Hadoop por
HBaseCon 2015: Apache Kylin - Extreme OLAP  Engine for HadoopHBaseCon 2015: Apache Kylin - Extreme OLAP  Engine for Hadoop
HBaseCon 2015: Apache Kylin - Extreme OLAP Engine for HadoopHBaseCon
3.3K visualizações21 slides
ApacheKylin_HBaseCon2015 por
ApacheKylin_HBaseCon2015ApacheKylin_HBaseCon2015
ApacheKylin_HBaseCon2015Luke Han
433 visualizações21 slides
Accelerating Big Data Analytics with Apache Kylin por
Accelerating Big Data Analytics with Apache KylinAccelerating Big Data Analytics with Apache Kylin
Accelerating Big Data Analytics with Apache KylinTyler Wishnoff
1.1K visualizações28 slides
Cloud-native Semantic Layer on Data Lake por
Cloud-native Semantic Layer on Data LakeCloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data LakeDatabricks
684 visualizações26 slides
Kylin and Druid Presentation por
Kylin and Druid PresentationKylin and Druid Presentation
Kylin and Druid Presentationargonauts007
6.6K visualizações38 slides
Apache kylin boost your sqls on extremely large dataset por
Apache kylin boost your sqls on extremely large datasetApache kylin boost your sqls on extremely large dataset
Apache kylin boost your sqls on extremely large datasetssuser931288
57 visualizações30 slides

Similar a Apache kylin - Big Data Technology Conference 2014 Beijing(20)

HBaseCon 2015: Apache Kylin - Extreme OLAP Engine for Hadoop por HBaseCon
HBaseCon 2015: Apache Kylin - Extreme OLAP  Engine for HadoopHBaseCon 2015: Apache Kylin - Extreme OLAP  Engine for Hadoop
HBaseCon 2015: Apache Kylin - Extreme OLAP Engine for Hadoop
HBaseCon3.3K visualizações
ApacheKylin_HBaseCon2015 por Luke Han
ApacheKylin_HBaseCon2015ApacheKylin_HBaseCon2015
ApacheKylin_HBaseCon2015
Luke Han433 visualizações
Accelerating Big Data Analytics with Apache Kylin por Tyler Wishnoff
Accelerating Big Data Analytics with Apache KylinAccelerating Big Data Analytics with Apache Kylin
Accelerating Big Data Analytics with Apache Kylin
Tyler Wishnoff1.1K visualizações
Cloud-native Semantic Layer on Data Lake por Databricks
Cloud-native Semantic Layer on Data LakeCloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data Lake
Databricks684 visualizações
Kylin and Druid Presentation por argonauts007
Kylin and Druid PresentationKylin and Druid Presentation
Kylin and Druid Presentation
argonauts0076.6K visualizações
Apache kylin boost your sqls on extremely large dataset por ssuser931288
Apache kylin boost your sqls on extremely large datasetApache kylin boost your sqls on extremely large dataset
Apache kylin boost your sqls on extremely large dataset
ssuser93128857 visualizações
Apache kylin boost your SQLs on extremely large dataset por Chun'en Ni
Apache kylin boost your SQLs on extremely large datasetApache kylin boost your SQLs on extremely large dataset
Apache kylin boost your SQLs on extremely large dataset
Chun'en Ni82 visualizações
PCM18 (Big Data Analytics) por Stratebi
PCM18 (Big Data Analytics)PCM18 (Big Data Analytics)
PCM18 (Big Data Analytics)
Stratebi1.2K visualizações
Build 2017 - P4002 - Speedup Interactive Analytics on Petabytes of Data on Azure por Windows Developer
Build 2017 - P4002 - Speedup Interactive Analytics on Petabytes of Data on AzureBuild 2017 - P4002 - Speedup Interactive Analytics on Petabytes of Data on Azure
Build 2017 - P4002 - Speedup Interactive Analytics on Petabytes of Data on Azure
Windows Developer138 visualizações
Apache Kylin: Speed Up Cubing with Apache Spark with Luke Han and Shaofeng Shi por Databricks
 Apache Kylin: Speed Up Cubing with Apache Spark with Luke Han and Shaofeng Shi Apache Kylin: Speed Up Cubing with Apache Spark with Luke Han and Shaofeng Shi
Apache Kylin: Speed Up Cubing with Apache Spark with Luke Han and Shaofeng Shi
Databricks3.7K visualizações
HBaseConAsia2018 Track3-5: HBase Practice at Lianjia por Michael Stack
HBaseConAsia2018 Track3-5: HBase Practice at LianjiaHBaseConAsia2018 Track3-5: HBase Practice at Lianjia
HBaseConAsia2018 Track3-5: HBase Practice at Lianjia
Michael Stack537 visualizações
HQL over Tiered Data Warehouse por DataWorks Summit
HQL over Tiered Data WarehouseHQL over Tiered Data Warehouse
HQL over Tiered Data Warehouse
DataWorks Summit1.4K visualizações
Apache Kylin and Use Cases - 2018 Big Data Spain por Luke Han
Apache Kylin and Use Cases - 2018 Big Data SpainApache Kylin and Use Cases - 2018 Big Data Spain
Apache Kylin and Use Cases - 2018 Big Data Spain
Luke Han1.3K visualizações
Grill at HadoopSummit por amarsri
Grill at HadoopSummitGrill at HadoopSummit
Grill at HadoopSummit
amarsri2.2K visualizações
HBaseConAsia2018 Track2-2: Apache Kylin on HBase: Extreme OLAP for big data por Michael Stack
HBaseConAsia2018  Track2-2: Apache Kylin on HBase: Extreme OLAP for big dataHBaseConAsia2018  Track2-2: Apache Kylin on HBase: Extreme OLAP for big data
HBaseConAsia2018 Track2-2: Apache Kylin on HBase: Extreme OLAP for big data
Michael Stack1.7K visualizações
Unlocking big data with Hadoop + MySQL por Ricky Setyawan
Unlocking big data with Hadoop + MySQLUnlocking big data with Hadoop + MySQL
Unlocking big data with Hadoop + MySQL
Ricky Setyawan334 visualizações
When OLAP Meets Real-Time, What Happens in eBay? por DataWorks Summit
When OLAP Meets Real-Time, What Happens in eBay?When OLAP Meets Real-Time, What Happens in eBay?
When OLAP Meets Real-Time, What Happens in eBay?
DataWorks Summit954 visualizações
Dataflow in 104corp - AWS UserGroup TW 2018 por Gavin Lin
Dataflow in 104corp - AWS UserGroup TW 2018Dataflow in 104corp - AWS UserGroup TW 2018
Dataflow in 104corp - AWS UserGroup TW 2018
Gavin Lin176 visualizações
Data Science por Ahmet Bulut
Data ScienceData Science
Data Science
Ahmet Bulut945 visualizações

Mais de Luke Han

Augmented OLAP for Big Data por
Augmented OLAP for Big DataAugmented OLAP for Big Data
Augmented OLAP for Big DataLuke Han
10.6K visualizações37 slides
Refactoring your EDW with Mobile Analytics Products por
Refactoring your EDW with Mobile Analytics ProductsRefactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics ProductsLuke Han
312 visualizações48 slides
Building Enterprise OLAP on Hadoop for FSI por
Building Enterprise OLAP on Hadoop for FSIBuilding Enterprise OLAP on Hadoop for FSI
Building Enterprise OLAP on Hadoop for FSILuke Han
981 visualizações30 slides
3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai por
3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai
3. Apache Tez Introducation - Apache Kylin Meetup @ShanghaiLuke Han
970 visualizações24 slides
5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai por
5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai
5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @ShanghaiLuke Han
3.7K visualizações16 slides
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai por
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @ShanghaiLuke Han
4.2K visualizações63 slides

Mais de Luke Han(7)

Augmented OLAP for Big Data por Luke Han
Augmented OLAP for Big DataAugmented OLAP for Big Data
Augmented OLAP for Big Data
Luke Han10.6K visualizações
Refactoring your EDW with Mobile Analytics Products por Luke Han
Refactoring your EDW with Mobile Analytics ProductsRefactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics Products
Luke Han312 visualizações
Building Enterprise OLAP on Hadoop for FSI por Luke Han
Building Enterprise OLAP on Hadoop for FSIBuilding Enterprise OLAP on Hadoop for FSI
Building Enterprise OLAP on Hadoop for FSI
Luke Han981 visualizações
3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai por Luke Han
3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai
3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai
Luke Han970 visualizações
5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai por Luke Han
5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai
5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai
Luke Han3.7K visualizações
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai por Luke Han
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai
Luke Han4.2K visualizações
Actuate presentation 2011 por Luke Han
Actuate presentation   2011Actuate presentation   2011
Actuate presentation 2011
Luke Han1.2K visualizações

Último

20231129 - Platform @ localhost 2023 - Application-driven infrastructure with... por
20231129 - Platform @ localhost 2023 - Application-driven infrastructure with...20231129 - Platform @ localhost 2023 - Application-driven infrastructure with...
20231129 - Platform @ localhost 2023 - Application-driven infrastructure with...sparkfabrik
5 visualizações46 slides
Software evolution understanding: Automatic extraction of software identifier... por
Software evolution understanding: Automatic extraction of software identifier...Software evolution understanding: Automatic extraction of software identifier...
Software evolution understanding: Automatic extraction of software identifier...Ra'Fat Al-Msie'deen
9 visualizações33 slides
Navigating container technology for enhanced security by Niklas Saari por
Navigating container technology for enhanced security by Niklas SaariNavigating container technology for enhanced security by Niklas Saari
Navigating container technology for enhanced security by Niklas SaariMetosin Oy
14 visualizações34 slides
AI and Ml presentation .pptx por
AI and Ml presentation .pptxAI and Ml presentation .pptx
AI and Ml presentation .pptxFayazAli87
11 visualizações15 slides
DSD-INT 2023 European Digital Twin Ocean and Delft3D FM - Dols por
DSD-INT 2023 European Digital Twin Ocean and Delft3D FM - DolsDSD-INT 2023 European Digital Twin Ocean and Delft3D FM - Dols
DSD-INT 2023 European Digital Twin Ocean and Delft3D FM - DolsDeltares
7 visualizações23 slides
Quality Engineer: A Day in the Life por
Quality Engineer: A Day in the LifeQuality Engineer: A Day in the Life
Quality Engineer: A Day in the LifeJohn Valentino
6 visualizações18 slides

Último(20)

20231129 - Platform @ localhost 2023 - Application-driven infrastructure with... por sparkfabrik
20231129 - Platform @ localhost 2023 - Application-driven infrastructure with...20231129 - Platform @ localhost 2023 - Application-driven infrastructure with...
20231129 - Platform @ localhost 2023 - Application-driven infrastructure with...
sparkfabrik5 visualizações
Software evolution understanding: Automatic extraction of software identifier... por Ra'Fat Al-Msie'deen
Software evolution understanding: Automatic extraction of software identifier...Software evolution understanding: Automatic extraction of software identifier...
Software evolution understanding: Automatic extraction of software identifier...
Ra'Fat Al-Msie'deen9 visualizações
Navigating container technology for enhanced security by Niklas Saari por Metosin Oy
Navigating container technology for enhanced security by Niklas SaariNavigating container technology for enhanced security by Niklas Saari
Navigating container technology for enhanced security by Niklas Saari
Metosin Oy14 visualizações
AI and Ml presentation .pptx por FayazAli87
AI and Ml presentation .pptxAI and Ml presentation .pptx
AI and Ml presentation .pptx
FayazAli8711 visualizações
DSD-INT 2023 European Digital Twin Ocean and Delft3D FM - Dols por Deltares
DSD-INT 2023 European Digital Twin Ocean and Delft3D FM - DolsDSD-INT 2023 European Digital Twin Ocean and Delft3D FM - Dols
DSD-INT 2023 European Digital Twin Ocean and Delft3D FM - Dols
Deltares7 visualizações
Quality Engineer: A Day in the Life por John Valentino
Quality Engineer: A Day in the LifeQuality Engineer: A Day in the Life
Quality Engineer: A Day in the Life
John Valentino6 visualizações
Myths and Facts About Hospice Care: Busting Common Misconceptions por Care Coordinations
Myths and Facts About Hospice Care: Busting Common MisconceptionsMyths and Facts About Hospice Care: Busting Common Misconceptions
Myths and Facts About Hospice Care: Busting Common Misconceptions
Care Coordinations5 visualizações
2023-November-Schneider Electric-Meetup-BCN Admin Group.pptx por animuscrm
2023-November-Schneider Electric-Meetup-BCN Admin Group.pptx2023-November-Schneider Electric-Meetup-BCN Admin Group.pptx
2023-November-Schneider Electric-Meetup-BCN Admin Group.pptx
animuscrm14 visualizações
nintendo_64.pptx por paiga02016
nintendo_64.pptxnintendo_64.pptx
nintendo_64.pptx
paiga020165 visualizações
tecnologia18.docx por nosi6702
tecnologia18.docxtecnologia18.docx
tecnologia18.docx
nosi67025 visualizações
SUGCON ANZ Presentation V2.1 Final.pptx por Jack Spektor
SUGCON ANZ Presentation V2.1 Final.pptxSUGCON ANZ Presentation V2.1 Final.pptx
SUGCON ANZ Presentation V2.1 Final.pptx
Jack Spektor22 visualizações
DSD-INT 2023 Exploring flash flood hazard reduction in arid regions using a h... por Deltares
DSD-INT 2023 Exploring flash flood hazard reduction in arid regions using a h...DSD-INT 2023 Exploring flash flood hazard reduction in arid regions using a h...
DSD-INT 2023 Exploring flash flood hazard reduction in arid regions using a h...
Deltares5 visualizações
Ports-and-Adapters Architecture for Embedded HMI por Burkhard Stubert
Ports-and-Adapters Architecture for Embedded HMIPorts-and-Adapters Architecture for Embedded HMI
Ports-and-Adapters Architecture for Embedded HMI
Burkhard Stubert6 visualizações
Keep por Geniusee
KeepKeep
Keep
Geniusee75 visualizações
FOSSLight Community Day 2023-11-30 por Shane Coughlan
FOSSLight Community Day 2023-11-30FOSSLight Community Day 2023-11-30
FOSSLight Community Day 2023-11-30
Shane Coughlan5 visualizações
Unlocking the Power of AI in Product Management - A Comprehensive Guide for P... por NimaTorabi2
Unlocking the Power of AI in Product Management - A Comprehensive Guide for P...Unlocking the Power of AI in Product Management - A Comprehensive Guide for P...
Unlocking the Power of AI in Product Management - A Comprehensive Guide for P...
NimaTorabi28 visualizações
The Era of Large Language Models.pptx por AbdulVahedShaik
The Era of Large Language Models.pptxThe Era of Large Language Models.pptx
The Era of Large Language Models.pptx
AbdulVahedShaik5 visualizações
Dapr Unleashed: Accelerating Microservice Development por Miroslav Janeski
Dapr Unleashed: Accelerating Microservice DevelopmentDapr Unleashed: Accelerating Microservice Development
Dapr Unleashed: Accelerating Microservice Development
Miroslav Janeski10 visualizações
Gen Apps on Google Cloud PaLM2 and Codey APIs in Action por Márton Kodok
Gen Apps on Google Cloud PaLM2 and Codey APIs in ActionGen Apps on Google Cloud PaLM2 and Codey APIs in Action
Gen Apps on Google Cloud PaLM2 and Codey APIs in Action
Márton Kodok5 visualizações

Apache kylin - Big Data Technology Conference 2014 Beijing

  • 1. http://kylin.io Apache Kylin Introduction Dec 14, 2014 韩卿|Luke Han Co-creator of Apache Kylin | lukehan@apache.org Sr. Product Manager, eBay CCOE
  • 2. http://kylin.io Agenda  What’s Apache Kylin?  Feature & Tech Highlights  Performance  Open Source & Roadmap  Q & A
  • 3. http://kylin.io Extreme OLAP Engine for Big Data Kylin is an open source Distributed Analytics Engine from eBay that provides SQL interface and multi-dimensional analysis (OLAP) on Hadoop supporting extremely large datasets What’s Kylin kylin / ˈkiːˈlɪn / 麒麟 --n. (in Chinese art) a mythical animal of composite form • Open Sourced on Oct 1st, 2014 • Be accepted as Apache Incubator Project on Nov 25th, 2014
  • 4. http://kylin.io Big Data Era  More and more data becoming available on Hadoop  Limitations in existing Business Intelligence (BI) Tools  Limited support for Hadoop  Data size growing exponentially  High latency of interactive queries  Scale-Up architecture  Challenges to adopt Hadoop as interactive analysis system  Majority of analyst groups are SQL savvy  No mature SQL interface on Hadoop  OLAP capability on Hadoop ecosystem not ready yet
  • 5. http://kylin.io Business Needs for Big Data Analysis  Sub-second query latency on billions of rows  ANSI SQL for both analysts and engineers  Full OLAP capability to offer advanced functionality  Seamless Integration with BI Tools  Support of high cardinality and high dimensions  High concurrency – thousands of end users  Distributed and scale out architecture for large data volume
  • 6. http://kylin.io6 Why not Build an engine from scratch?
  • 7. http://kylin.io Transaction Operation Strategy High Level Aggregation •Very High Level, e.g GMV by site by vertical by weeks Analysis Query •Middle level, e.g GMV by site by vertical, by category (level x) past 12 weeks Drill Down to Detail •Detail Level (Summary Table) Low Level Aggregation •First Level Aggragation Transaction Level •Transaction Data Analytics Query Taxonomy OLAP Kylin is designed to accelerate 80+% analytics queries performance on Hadoop OLTP
  • 8. http://kylin.io  Huge volume data  Table scan  Big table joins  Data shuffling  Analysis on different granularity  Runtime aggregation expensive  Map Reduce job  Batch processing Technical Challenges
  • 9. http://kylin.io OLAP Cube – Balance between Space and Time time, item time, item, location time, item, location, supplier time item location supplier time, location Time, supplier item, location item, supplier location, supplier time, item, supplier time, location, supplier item, location, supplier 0-D(apex) cuboid 1-D cuboids 2-D cuboids 3-D cuboids 4-D(base) cuboid • Base vs. aggregate cells; ancestor vs. descendant cells; parent vs. child cells 1. (9/15, milk, Urbana, Dairy_land) - <time, item, location, supplier> 2. (9/15, milk, Urbana, *) - <time, item, location> 3. (*, milk, Urbana, *) - <item, location> 4. (*, milk, Chicago, *) - <item, location> 5. (*, milk, *, *) - <item> • Cuboid = one combination of dimensions • Cube = all combination of dimensions (all cuboids)
  • 11. http://kylin.io Kylin Architecture Overview 11 Cube Build Engine (MapReduce…) SQL Low Latency - Seconds Mid Latency - Minutes Routing 3rd Party App (Web App, Mobile…) Metadata SQL-Based Tool (BI Tools: Tableau…) Query Engine Hadoop Hive REST API JDBC/ODBC  Online Analysis Data Flow  Offline Data Flow  Clients/Users interactive with Kylin via SQL  OLAP Cube is transparent to users Star Schema Data Key Value Data Data Cube OLAP Cube (HBase) SQL REST Server
  • 12. http://kylin.io  Hive  Input source  Pre-join star schema during cube building  MapReduce  Pre-aggregation metrics during cube building  HDFS  Store intermediated files during cube building.  HBase  Store data cube.  Serve query on data cube.  Coprocessor is used for query processing. How Does Kylin Utilize Hadoop Components?
  • 13. http://kylin.io Agenda  What’s Apache Kylin?  Feature & Tech Highlights  Performance  Open Source & Roadmap  Q & A
  • 14. http://kylin.io  Extremely Fast OLAP Engine at Scale Kylin is designed to reduce query latency on Hadoop for 10+ billions of rows of data  ANSI SQL Interface on Hadoop Kylin offers ANSI SQL on Hadoop and supports most ANSI SQL query functions  Seamless Integration with BI Tools Kylin currently offers integration capability with BI Tools like Tableau.  Interactive Query Capability Users can interact with Hadoop data via Kylin at sub-second latency, better than Hive queries for the same dataset  MOLAP Cube User can define a data model and pre-build in Kylin with more than 10+ billions of raw data records Features Highlights
  • 15. http://kylin.io  Compression and Encoding Support  Incremental Refresh of Cubes  Approximate Query Capability for distinct Count (HyperLogLog)  Leverage HBase Coprocessor for query latency  Job Management and Monitoring  Easy Web interface to manage, build, monitor and query cubes  Security capability to set ACL at Cube/Project Level  Support LDAP Integration Features Highlights…
  • 20. http://kylin.io Data Modeling Cube: … Fact Table: … Dimensions: … Measures: … Storage(HBase): …Fact Dim Dim Dim Source Star Schema row A row B row C Column Family Val 1 Val 2 Val 3 Row Key Column Target HBase Storage Mapping Cube Metadata End User Cube Modeler Admin
  • 22. http://kylin.io How To Store Cube? – HBase Schema
  • 23. http://kylin.io Query Engine – Calcite (Optiq)  Dynamic data management framework.  Formerly known as Optiq, Calcite is an Apache incubator project, used by Apache Drill and Apache Hive, among others.  http://optiq.incubator.apache.org
  • 24. http://kylin.io Query Engine – Kylin Explain Plan SELECT test_cal_dt.week_beg_dt, test_category.category_name, test_category.lvl2_name, test_category.lvl3_name, test_kylin_fact.lstg_format_name, test_sites.site_name, SUM(test_kylin_fact.price) AS GMV, COUNT(*) AS TRANS_CNT FROM test_kylin_fact LEFT JOIN test_cal_dt ON test_kylin_fact.cal_dt = test_cal_dt.cal_dt LEFT JOIN test_category ON test_kylin_fact.leaf_categ_id = test_category.leaf_categ_id AND test_kylin_fact.lstg_site_id = test_category.site_id LEFT JOIN test_sites ON test_kylin_fact.lstg_site_id = test_sites.site_id WHERE test_kylin_fact.seller_id = 123456OR test_kylin_fact.lstg_format_name = ’New' GROUP BY test_cal_dt.week_beg_dt, test_category.category_name, test_category.lvl2_name, test_category.lvl3_name, test_kylin_fact.lstg_format_name,test_sites.site_name OLAPToEnumerableConverter OLAPProjectRel(WEEK_BEG_DT=[$0], category_name=[$1], CATEG_LVL2_NAME=[$2], CATEG_LVL3_NAME=[$3], LSTG_FORMAT_NAME=[$4], SITE_NAME=[$5], GMV=[CASE(=($7, 0), null, $6)], TRANS_CNT=[$8]) OLAPAggregateRel(group=[{0, 1, 2, 3, 4, 5}], agg#0=[$SUM0($6)], agg#1=[COUNT($6)], TRANS_CNT=[COUNT()]) OLAPProjectRel(WEEK_BEG_DT=[$13], category_name=[$21], CATEG_LVL2_NAME=[$15], CATEG_LVL3_NAME=[$14], LSTG_FORMAT_NAME=[$5], SITE_NAME=[$23], PRICE=[$0]) OLAPFilterRel(condition=[OR(=($3, 123456), =($5, ’New'))]) OLAPJoinRel(condition=[=($2, $25)], joinType=[left]) OLAPJoinRel(condition=[AND(=($6, $22), =($2, $17))], joinType=[left]) OLAPJoinRel(condition=[=($4, $12)], joinType=[left]) OLAPTableScan(table=[[DEFAULT, TEST_KYLIN_FACT]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]]) OLAPTableScan(table=[[DEFAULT, TEST_CAL_DT]], fields=[[0, 1]]) OLAPTableScan(table=[[DEFAULT, test_category]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]]) OLAPTableScan(table=[[DEFAULT, TEST_SITES]], fields=[[0, 1, 2]])
  • 25. http://kylin.io  Full Cube  Pre-aggregate all dimension combinations  “Curse of dimensionality”: N dimension cube has 2N cuboid.  Partial Cube  To avoid dimension explosion, we divide the dimensions into different aggregation groups  2N+M+L  2N + 2M + 2L  For cube with 30 dimensions, if we divide these dimensions into 3 group, the cuboid number will reduce from 1 Billion to 3 Thousands  230  210 + 210 + 210  Tradeoff between online aggregation and offline pre-aggregation How To Optimize Cube? – Full Cube vs. Partial Cube
  • 26. http://kylin.io How To Optimize Cube? – Partial Cube
  • 27. http://kylin.io How To Optimize Cube? – Incremental Building
  • 28. http://kylin.io Inverted Index  Challenge  Has no raw data records  Slow table scan on high cardinality dimensions  Inverted Index Storage (an ongoing effort)  Persist the raw table  Bitmap inverted index  Time range partition  In-memory (block cache)  Parallel scan (endpoint coprocessor)
  • 29. http://kylin.io Agenda  What’s Apache Kylin?  Feature & Tech Highlights  Performance  Open Source & Roadmap  Q & A
  • 30. http://kylin.io Kylin vs. Hive # Query Type Return Dataset Query On Kylin (s) Query On Hive (s) Comments 1 High Level Aggregation 4 0.129 157.437 1,217 times 2 Analysis Query 22,669 1.615 109.206 68 times 3 Drill Down to Detail 325,029 12.058 113.123 9 times 4 Drill Down to Detail 524,780 22.42 6383.21 278 times 5 Data Dump 972,002 49.054 N/A 0 50 100 150 200 SQL #1 SQL #2 SQL #3 Hive Kylin High Level Aggregatio n Analysis Query Drill Down to Detail Low Level Aggregatio n Transactio n Level Based on 12+B records case
  • 32. http://kylin.io Performance - Query Latency 90%tile queries <5s Green Line: 90%tile queries Gray Line: 95%tile queries
  • 33. http://kylin.io Agenda  What’s Apache Kylin?  Feature & Tech Highlights  Performance  Open Source & Roadmap  Q & A
  • 34. http://kylin.io  Kylin Core  Fundamental framework of Kylin OLAP Engine  Extension  Plugins to support for additional functions and features  Integration  Lifecycle Management Support to integrate with other applications  Interface  Allows for third party users to build more features via user- interface atop Kylin core  Driver  ODBC and JDBC Drivers Kylin OLAP Core Extension  Security  Redis Storage  Spark Engine  Docker Interface  Web Console  Customized BI  Ambari/Hue Plugin Integration  ODBC Driver  ETL  Drill Kylin Ecosystem
  • 35. http://kylin.io Kylin Evolution Roadmap 201520142013 Initial Prototype for MOLAP • Basic end to end POC MOLAP • Incremental Refresh • ANSI SQL • ODBC Driver • Web GUI • ACL • Open Source HOLAP • InvertedIndex • JDBC Driver • Automation • Capacity Management • Support Spark • … more Next Gen • In-Memory Analysis • Real Time & Streaming (TBD) • … more TBD Future… Sep, 2013 Jan, 2014 Sep, 2014 Q1, 2015
  • 36. http://kylin.io  Kylin Site:  http://kylin.io  Twitter:  @ApacheKylin  Source Code Repo:  https://github.com/KylinOLAP  Google Group:  Kylin OLAP Open Source
  • 37. http://kylin.io  时间:  2014-12-14 6:30 PM – 9:00 PM  地点:  3W咖啡 Apache Kylin 北京线下交流会