Apache Kylin & Use Cases
Luke Han | luke.han@kyligence.io
2018 Big Data Spain
Luke Han
• Co-founder & CEO at Kyligence
• Co-creator and PMC Chair of Apache Kylin
• Apache Software Foundation Member
• Microsoft Regional Director & MVP
• Former eBay Big Data Product Manager Lead
© Kyligence Inc. 2018.
About Luke Han
Kyligence = Kylin + Intelligence
- Kyligence was formed by the team who created Apache Kylin, the leading open source OLAP for Big
Data. Kyligence provides an intelligent data warehouse built for data cognitive analytics at web
scale.
- Funding by leading VCs:
- Redpoint Ventures, Cisco,
- CBC Capital and Shunwei Capital,
- Eight Roads Ventures (Fidelity International Arm)
- CRN Top 10 Big Data Startups 2018
About Kyligence
Apache Kylin
About Apache Kylin
• Leading Open Source OLAP for Big Data
• Open sourced by eBay in 2014
• Graduated to Apache Top-Level Project in 2015
• 1000+ adoptions worldwide
• 2015 InfoWorld Bossie Awards
• 2016 InfoWorld Bossie Awards
1000+ Global Users
Apache Kylin - Leading Open Source OLAP for Big Data
Presentation
Visualization
Data Lake
Data Source
o Too many options
o Low performance
o Long learning curve
o Compatibility issue
o Technology vs Data
OLAP: The Missing Part of Big Data
Hive Impala Spark SQL Drill
MapReduce …Spark
Presentation
Visualization
Data Lake
Data Source
o SQL Acceleration for Big Data
o Semantic Layer
o Speed up Analytics
o ANSI SQL Interface
o High Performance and High Concurrency
Apache Kylin: Bring OLAP back to Big Data
OLAP
Data Mart
Hive Impala Spark SQL Drill
MapReduce …Spark
Apache Kylin
Technical Highlights
OLAP and OLAP Cube
Online analytical processing, or OLAP,
is an approach to answering multi-
dimensional analytical (MDA) queries
swiftly in computing. – Wikipedia
Basic operations
– Roll-up
– Drill-down
– Slice and dice
– Pivot
OLAP cube is a data
structure optimized for
very quick data analysis.
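The basic operations listed above can be illustrated with a minimal Python sketch. The fact rows, the dimension names (`year`, `region`, `product`) and the helper functions are all hypothetical and are not part of Apache Kylin; they only show what roll-up, drill-down, slice and pivot mean on a small in-memory data set.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales)
FACTS = [
    (2017, "EU", "phone", 10),
    (2017, "EU", "laptop", 20),
    (2017, "US", "phone", 30),
    (2018, "EU", "phone", 40),
    (2018, "US", "laptop", 50),
]
DIMS = ("year", "region", "product")


def aggregate(rows, group_by):
    """Sum sales grouped by the named dimensions."""
    idx = [DIMS.index(d) for d in group_by]
    out = defaultdict(int)
    for row in rows:
        out[tuple(row[i] for i in idx)] += row[-1]
    return dict(out)


def slice_rows(rows, dim, value):
    """Slice: keep only the rows where one dimension equals a value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]


def pivot(agg):
    """Pivot a two-dimension aggregate into {row_key: {col_key: value}}."""
    table = defaultdict(dict)
    for (r, c), v in agg.items():
        table[r][c] = v
    return dict(table)


drill_down = aggregate(FACTS, ["year", "region"])  # finer grain
roll_up = aggregate(FACTS, ["year"])               # coarser grain
eu_only = aggregate(slice_rows(FACTS, "region", "EU"), ["year"])
by_year_region = pivot(drill_down)
```

Rolling up from (year, region) to (year) only drops a dimension and re-sums, which is why these operations map well onto pre-aggregated data.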
Cube: balance between space and time
OLAP Cube
--Key-Value
Multiple Dimensional Model
--Relational
Classification,
aggregation, and
sorting
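The space/time trade-off can be made concrete with a small sketch, assuming a cube stores one pre-aggregated key-value map (a cuboid) per subset of dimensions. The rows, dimension names and `build_cube` helper are hypothetical; a real engine would prune cuboids rather than build all of them.

```python
from itertools import combinations
from collections import defaultdict

DIMS = ("flag", "status", "day")
# Hypothetical relational rows: (flag, status, day, quantity)
ROWS = [
    ("A", "F", "1998-09-01", 5),
    ("A", "F", "1998-09-02", 7),
    ("N", "O", "1998-09-01", 3),
    ("R", "F", "1998-09-02", 4),
]


def build_cube(rows):
    """Precompute SUM(quantity) for every subset of dimensions (every cuboid)."""
    cube = {}
    for k in range(len(DIMS) + 1):
        for dims in combinations(range(len(DIMS)), k):
            cuboid = defaultdict(int)
            for row in rows:
                cuboid[tuple(row[i] for i in dims)] += row[-1]
            cube[tuple(DIMS[i] for i in dims)] = dict(cuboid)
    return cube


cube = build_cube(ROWS)
num_cuboids = len(cube)                            # 2 ** len(DIMS)
stored_cells = sum(len(c) for c in cube.values())  # space paid up front
total = cube[()][()]                               # grand total in one lookup
by_flag = cube[("flag",)]                          # GROUP BY flag, no scan of ROWS
```

With N dimensions there are 2^N cuboids, so the number of stored cells grows quickly; that storage is the price paid for answering a group-by with a lookup instead of a scan.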
Apache Kylin Architecture Overview
Apache Kylin
Data Analyst, BI Tools, Web App…
SQL
Online calculation
Offline calculation
Scan & filter
Extract
Compute
Load
Optimize & Rewrite
SQL execution plan without Cube
select
l_returnflag,
o_orderstatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price
from
v_lineitem
inner join
v_orders on l_orderkey = o_orderkey
where
l_shipdate <= '1998-09-16'
group by
l_returnflag,
o_orderstatus
order by
l_returnflag,
o_orderstatus;
Sample: Check the relationship between order return flag and order status in a time range
Execution plan: Tables → Join → Filter → Aggr. → Sort, O(N)
Without a cube, everything is calculated online: CPU and IO intensive, with noticeable latency.
SQL execution plan with Cube
Cube technology speeds up query performance with pre-calculation
Original plan: Tables → Join → Filter → Aggr. → Sort, O(N)
Plan with cube: Cube (aggregated data) → Filter → Sort, O(flag x status x days) = O(1)
The table join and aggregation are completed offline.
Queries are served directly from aggregated data (cube) with index; much less CPU and IO. Latency is small.
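The split between offline and online work can be sketched in Python as follows. The sample rows stand in for `v_orders` and `v_lineitem` and are invented for illustration; `build_cube` and `query` are hypothetical helpers, not Kylin APIs. The offline step joins once and aggregates by (flag, status, day); the online step only filters and re-aggregates the much smaller set of cube cells.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for v_orders and v_lineitem.
ORDERS = {1: "F", 2: "O", 3: "F"}  # o_orderkey -> o_orderstatus
LINEITEMS = [
    # (l_orderkey, l_returnflag, l_shipdate, l_quantity, l_extendedprice)
    (1, "A", "1998-09-01", 10, 100.0),
    (1, "R", "1998-09-10", 5, 50.0),
    (2, "N", "1998-09-16", 8, 80.0),
    (3, "A", "1998-09-20", 7, 70.0),
    (3, "A", "1998-09-01", 2, 20.0),
]


def build_cube(lineitems, orders):
    """Offline step: join once and aggregate by (flag, status, day)."""
    cube = defaultdict(lambda: [0, 0.0])
    for orderkey, flag, shipdate, qty, price in lineitems:
        status = orders[orderkey]  # the join happens here, offline
        cell = cube[(flag, status, shipdate)]
        cell[0] += qty
        cell[1] += price
    return dict(cube)


def query(cube, max_shipdate):
    """Online step: filter cube cells, re-aggregate by (flag, status), sort."""
    result = defaultdict(lambda: [0, 0.0])
    for (flag, status, day), (qty, price) in cube.items():
        if day <= max_shipdate:  # ISO dates compare correctly as strings
            result[(flag, status)][0] += qty
            result[(flag, status)][1] += price
    return sorted((k[0], k[1], v[0], v[1]) for k, v in result.items())


CUBE = build_cube(LINEITEMS, ORDERS)
ROWS = query(CUBE, "1998-09-16")
```

The online cost depends on the number of distinct (flag, status, day) combinations, which stays bounded no matter how many line items are loaded.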
Joined tables: ORDERS, CUSTOMER, SUPPLIER, PART, LINEITEM, PARTSUPP, NATION, REGION
Joined tables: ORDERS, CUSTOMER, PART, LINEITEM, PARTSUPP
Multidimensional Schema
Apache Kylin supports star schema and snowflake schema
Persisting the cube in HBase
Relational to Key Value store
How to query the cube
Translate cube query into HBase table scan
– Columns, Group by -> Cuboid ID
– Filters -> Scan Range (Row Key)
– Aggregations -> Measure Columns (Row Values)
Scan HBase table and translate HBase result into cube result
– HBase Result (key + value) -> Cube Result (dimensions + measures)
No Hive access and no MapReduce job at query time
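The translation steps above can be sketched with a toy sorted key-value table in Python. This is a simplified illustration under stated assumptions, not Kylin's actual row key encoding or the HBase client API: the cuboid ID is modeled as a bitmask over a hypothetical dimension order, row keys are plain tuples, and the range scan is simulated with `bisect` over sorted keys.

```python
from bisect import bisect_left, bisect_right

DIMS = ("day", "flag", "status")  # hypothetical row key dimension order


def cuboid_id(dims):
    """Bitmask with one bit per dimension present in the cuboid."""
    mask = 0
    for d in dims:
        mask |= 1 << (len(DIMS) - 1 - DIMS.index(d))
    return mask


def row_key(cid, values):
    """Row key = cuboid id followed by the dimension values in DIMS order."""
    return (cid,) + tuple(values)


# Toy sorted table: row key -> measure columns (sum_qty,)
TABLE = sorted([
    (row_key(cuboid_id(["day", "flag"]), ("1998-09-01", "A")), (12,)),
    (row_key(cuboid_id(["day", "flag"]), ("1998-09-10", "R")), (5,)),
    (row_key(cuboid_id(["day", "flag"]), ("1998-09-20", "A")), (7,)),
    (row_key(cuboid_id(["flag"]), ("A",)), (19,)),
    (row_key(cuboid_id(["flag"]), ("R",)), (5,)),
])
KEYS = [k for k, _ in TABLE]


def scan(cid, start, stop):
    """Range scan over sorted row keys, similar in spirit to a key-value scan."""
    lo = bisect_left(KEYS, (cid,) + start)
    hi = bisect_right(KEYS, (cid,) + stop)
    return TABLE[lo:hi]


def to_cube_result(cid, hits):
    """Translate (key, value) pairs back into named dimensions and measures."""
    dims = [d for d in DIMS if cuboid_id([d]) & cid]
    return [dict(zip(dims, key[1:]), sum_qty=val[0]) for key, val in hits]


# GROUP BY day, flag WHERE day <= '1998-09-16'
CID = cuboid_id(["day", "flag"])
HITS = scan(CID, (), ("1998-09-16", "\uffff"))
RESULT = to_cube_result(CID, HITS)
```

Because the filtered dimension leads the row key in this sketch, the filter becomes a contiguous key range and the scan never touches rows from other cuboids.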
High performance & High concurrency together
Sub-second latency on PB scale dataset
Star schema benchmark:
http://www.cs.umb.edu/~poneil/StarSchemaB.PDF
SQL Latency
Lower is better
Data Volume Scale
Lower is better
Seamless integration with BI tools
From open source to commercial BI
Apache Kylin
Use Cases
Apache Kylin Use Cases
Solution
• Behavior Analytics
• Log Analysis
• Data Mart/DW
• Self-service Data Service
• Retail Analytics
• Financial Asset
• Advertising Analytics
• Real-time Analytics
• Gaming Analysis
Apache Kylin fits various scenarios
1000+ adoptions all over the world
Use Case – Insight on Trillion Data
Top 1 news feed app in China
Use Case: PB-level Analytics Platform
Cube Storage: 971TB (almost PB)
Number of cubes: 973
Data Records: 8.9 Trillion rows
90th percentile latency: <1.2s
Frequency: 3.8 million queries / day
Top O2O services provider in China
Supporting all critical business
lines including E-Takeaways,
Hotel, Movie, LBS, Tickets…
Last updated: 2018-08
Use Case: Online Shopping Reporting
https://techblog.yahoo.co.jp/oss/apache-kylin/
▪ Our reporting system previously used Impala as its backend database.
- It took a long time (about 60 sec) to show the Web UI.
▪ To lower the latency, we moved to Apache Kylin.
- Average latency < 1 sec in most cases
▪ Thanks to Kylin's low latency, we are now able
to focus on adding features for users.
▪ We provide a reporting system that shows statistics
for store owners.
- e.g. impressions, clicks and sales.
The most visited website in Japan
Yahoo! Japan
Use Case: Data Factory for Business
• Serving 18 business lines as the engine for mi’s “data factory”
• Daily increment of 17 billion
• 95% of queries < 500ms
Leading smartphone and smart device manufacturer
The data platform based on Apache Kylin solved the problem of massive user
queries excellently.
-- Chase Zhang, Data Platform Engineer of Strikingly
Performance
• Use Apache Kylin to speed up analytics
with Keen.io, and support high
concurrency
Containerizing
• Apache Kylin runs on AWS ECS
Integration
• Developed a scheduler system to
manage all kinds of jobs
Use Case – Website traffic Analytics
A company that provides convenient, one-stop website building solutions.
Apache Kylin
Roadmap
Apache Kylin Roadmap
• New storage support
–Parquet
• Real-time support
• Containerization
From the community
About Kyligence
Traditional Data Warehousing
Enormous Manual Efforts and Repeated Work
© Kyligence Inc. 2018, Confidential.
Intelligence and Automation
The future of Data Analytics
Human Intelligence VS Artificial Intelligence
Augmented Analytics Platform
SQL Query Log
Analytic Behavior
Data Schema
Data Profile
ML-based Discovery of Analytic Pattern
Proprietary Data Modeling Automation
Self-directed Storage Layer Optimization
Intelligent Query Push-down & Routing
BI
Real-time Analysis
Data-as-a-Service
Local Deployment
Cloud Platform
Container
Data Services
Machine Learning
Augmented Analytics
Available from Kyligence 3.x
http://kyligence.io
Kyligence Cloud
Transforming Big Data Analytics to Cloud
Kyligence Cloud
ANSI SQL
Dashboard OLAP
Hadoop
Customer Cloud Account
client
cloud
Kyligence Enterprise Platform
streaming
Cluster Deploy
Account Management
Diagnosis &
Optimization
Queries & Reporting
cloud
storage
tables, logs, files
RDBMS
(metadata)
ANSI SQL
Cloud Data
Warehouse
Cluster Management
Kyligence Cloud
Available: AWS, Azure, Google Cloud, Alibaba Cloud, Huawei Cloud
• One-click provisioning: Deploy globally in 30 minutes
• Auto Scaling: Scale cluster automatically for different workloads
• High Performance: Powered by Kyligence Analytics Platform
• Seamless Integration: Connect to cloud data sources; Enterprise ODBC driver for BI
• Intelligent Ops: Online diagnosis and continuous optimization
Speed up mission-critical analytics in the cloud
Use Case: Replaced IBM Cognos
1 Kyligence cube replaced 800+ IBM Cognos cubes
• Self-Service Big Data Warehouse: PB-level (300B records) big data warehouse serving both self-service aggregation queries and raw data queries by business analysts
• Efficient IT Operation: Significantly increases IT operation efficiency, as 1 Kyligence cube replaces 800 Cognos cubes with unified data access management
• Better Flexibility of Architecture: Kyligence's scale-out architecture provides the best flexibility for IT infrastructure when faced with increasing analytics and concurrency demands
• Merchant or Card Multi-dimensional Analytics: Supports analysis on high-granularity dimensions such as Merchant (10M cardinality) and Card (10B cardinality)
Use Case: Customer 360 for FMCG
Azure + Kyligence
➢ 360 degree view of user profile.
➢ Powering analysts' insight into
data without IT
➢ HDInsight + Kyligence + Power BI
Global Partners
Kyligence Open Ecosystem
Microsoft Azure Partner
AWS Technology Partner
Tableau Technology Partner
Cloudera Silver Partner
MapR Converge Partner
Hortonworks Community Partner
Huawei Solution Partner
Q & A
luke.han@kyligence.io

A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfStefano Stabellini
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....kzayra69
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentationvaddepallysandeep122
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
Best Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfBest Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfIdiosysTechnologies1
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 

Recently uploaded (20)

Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdf
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentation
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
Best Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfBest Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdf
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 

Apache Kylin and Use Cases - 2018 Big Data Spain

  • 1. Apache Kylin & Use Cases. Luke Han | luke.han@kyligence.io. 2018 Big Data Spain
  • 2. About Luke Han: Co-founder & CEO at Kyligence; Co-creator and PMC Chair of Apache Kylin; Apache Software Foundation Member; Microsoft Regional Director & MVP; Former eBay Big Data Product Manager Lead. © Kyligence Inc. 2018.
  • 3. About Kyligence: Kyligence = Kylin + Intelligence. Kyligence was formed by the team that created Apache Kylin, the leading open source OLAP for Big Data. Kyligence provides an intelligent data warehouse built for data cognitive analytics at web scale. Funded by leading VCs: Redpoint Ventures, Cisco, CBC Capital, Shunwei Capital, and Eight Roads Ventures (Fidelity International arm). CRN Top 10 Big Data Startups 2018.
  • 5. About Apache Kylin: Leading open source OLAP for Big Data; open sourced by eBay in 2014; graduated to Apache Top-Level Project in 2015; 1000+ adoptions worldwide; 2015 InfoWorld Bossie Award; 2016 InfoWorld Bossie Award.
  • 6. 1000+ Global Users. Apache Kylin - Leading Open Source OLAP for Big Data
  • 7. OLAP: The Missing Part of Big Data. Diagram labels: Presentation; Visualization; Data Lake; Data Source. Engines: Hive, Impala, Spark SQL, Drill, MapReduce, …, Spark. Problems: too many options; low performance; long learning curve; compatibility issues; technology vs. data.
  • 8. Apache Kylin: Bring OLAP back to Big Data. Diagram labels: Presentation; Visualization; OLAP; Data Mart; Data Lake; Data Source. Engines: Hive, Impala, Spark SQL, Drill, MapReduce, …, Spark. Benefits: SQL acceleration for Big Data; semantic layer; faster analytics; ANSI SQL interface; high performance and high concurrency.
  • 10. OLAP and OLAP Cube. "Online analytical processing, or OLAP, is an approach to answering multi-dimensional analytical (MDA) queries swiftly in computing." (Wikipedia) Basic operations: roll-up; drill-down; slice and dice; pivot. An OLAP cube is a data structure optimized for very quick data analysis.
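
The basic operations named on slide 10 can be illustrated with a small in-memory example. This is only a sketch: the fact rows, the dimension names (`year`, `region`, `product`) and the helper functions are hypothetical and are not taken from the talk or from Apache Kylin.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales)
FACTS = [
    (2017, "EU", "phone", 10),
    (2017, "EU", "tablet", 5),
    (2017, "US", "phone", 7),
    (2018, "EU", "phone", 12),
    (2018, "US", "phone", 9),
    (2018, "US", "tablet", 4),
]
DIMS = ("year", "region", "product")


def aggregate(rows, group_by):
    """Sum sales grouped by the named dimensions."""
    idx = [DIMS.index(d) for d in group_by]
    out = defaultdict(int)
    for row in rows:
        out[tuple(row[i] for i in idx)] += row[3]
    return dict(out)


def slice_(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]


def dice(rows, **allowed):
    """Dice: keep rows whose dimension values fall in the allowed sets."""
    return [
        r for r in rows
        if all(r[DIMS.index(d)] in vals for d, vals in allowed.items())
    ]


def pivot(agg):
    """Pivot a two-dimension aggregate into {row_key: {col_key: value}}."""
    table = defaultdict(dict)
    for (row_key, col_key), value in agg.items():
        table[row_key][col_key] = value
    return dict(table)


by_year_region = aggregate(FACTS, ["year", "region"])  # drill-down: finer grain
by_year = aggregate(FACTS, ["year"])                   # roll-up: coarser grain
eu_by_year = aggregate(slice_(FACTS, "region", "EU"), ["year"])
phones_2018 = aggregate(dice(FACTS, year={2018}, product={"phone"}), ["region"])
year_by_region = pivot(by_year_region)
```

Roll-up and drill-down differ only in how many dimensions appear in `group_by`, which is why a cube that stores aggregates per dimension combination can answer both without rescanning the raw rows.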
  • 11. Cube: balance between space and time. OLAP Cube -- Key-Value; Multiple Dimensional Model -- Relational; classification, aggregation, and sorting.
  • 12. Apache Kylin Architecture Overview. Diagram labels: Data Analyst, BI Tools, Web App, …; SQL; Apache Kylin; Online calculation; Offline calculation; Optimize & Rewrite; Scan & filter; Extract; Compute; Load.
  • 13. SQL execution plan without Cube. Sample: check the relationship between order return and order status in a time range. Query: select l_returnflag, o_orderstatus, sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price from v_lineitem inner join v_orders on l_orderkey = o_orderkey where l_shipdate <= '1998-09-16' group by l_returnflag, o_orderstatus order by l_returnflag, o_orderstatus; Plan operators: Sort, Aggr., Filter, Join, Tables; O(N). Without a cube, everything is calculated online; the query is CPU and IO intensive and latency is significant.
  • 14. SQL execution plan with Cube. Cube technology speeds up queries through pre-calculation. Without cube: Sort, Aggr., Filter, Join, Tables; O(N). With cube: Sort, Filter, Cube (aggregated data); O(flag x status x days) = O(1). The table join and aggregation are completed offline. Queries read directly from the indexed aggregated data (the cube), using much less CPU and IO, so latency is low.
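
The contrast between slides 13 and 14 can be reproduced on a toy dataset. The sketch below uses Python's built-in `sqlite3` module as a stand-in engine; the sample rows and the table name `cube_flag_status_day` are assumptions made for illustration, and an ordinary pre-aggregated table is only an approximation of how Kylin builds and stores a cube.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE v_orders (o_orderkey INTEGER PRIMARY KEY, o_orderstatus TEXT);
CREATE TABLE v_lineitem (l_orderkey INTEGER, l_returnflag TEXT, l_shipdate TEXT,
                         l_quantity REAL, l_extendedprice REAL);
INSERT INTO v_orders VALUES (1, 'F'), (2, 'O'), (3, 'F');
INSERT INTO v_lineitem VALUES
  (1, 'R', '1998-09-01', 2, 20.0),
  (1, 'N', '1998-09-10', 1, 15.0),
  (2, 'N', '1998-09-16', 4, 40.0),
  (3, 'R', '1998-09-01', 3, 30.0),
  (3, 'R', '1998-10-01', 5, 50.0);
""")

# Online plan (slide 13): join, filter, aggregate and sort at query time.
ONLINE = """
SELECT l_returnflag, o_orderstatus,
       SUM(l_quantity) AS sum_qty, SUM(l_extendedprice) AS sum_base_price
FROM v_lineitem INNER JOIN v_orders ON l_orderkey = o_orderkey
WHERE l_shipdate <= '1998-09-16'
GROUP BY l_returnflag, o_orderstatus
ORDER BY l_returnflag, o_orderstatus
"""

# Offline step (slide 14): join and aggregate once, keyed by flag x status x day.
conn.execute("""
CREATE TABLE cube_flag_status_day AS
SELECT l_returnflag, o_orderstatus, l_shipdate,
       SUM(l_quantity) AS sum_qty, SUM(l_extendedprice) AS sum_base_price
FROM v_lineitem INNER JOIN v_orders ON l_orderkey = o_orderkey
GROUP BY l_returnflag, o_orderstatus, l_shipdate
""")

# Query time with the cube: no join, only a filter over the aggregated rows.
FROM_CUBE = """
SELECT l_returnflag, o_orderstatus, SUM(sum_qty), SUM(sum_base_price)
FROM cube_flag_status_day
WHERE l_shipdate <= '1998-09-16'
GROUP BY l_returnflag, o_orderstatus
ORDER BY l_returnflag, o_orderstatus
"""

online_rows = conn.execute(ONLINE).fetchall()
cube_rows = conn.execute(FROM_CUBE).fetchall()
```

Both queries return the same rows, but the second one touches at most one row per (flag, status, day) combination regardless of how many line items exist, which is the sense in which slide 14 treats the cost as O(1).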
  • 15. Multidimensional Schema. Apache Kylin supports star schema and snowflake schema. Diagram 1 tables (joined): ORDERS, CUSTOMER, SUPPLIER, PART, LINEITEM, PARTSUPP, NATION, REGION. Diagram 2 tables (joined): ORDERS, CUSTOMER, PART, LINEITEM, PARTSUPP.
  • 16. Persist the cube in HBase. Relational to key-value store.
  • 17. How to query the cube. Translate the cube query into an HBase table scan: columns and group by -> cuboid ID; filters -> scan range (row key); aggregations -> measure columns (row values). Scan the HBase table and translate the HBase result into the cube result: HBase result (key + value) -> cube result (dimensions + measures). No Hive access and no MapReduce job at query time.
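
The mapping on slide 17 can be sketched with a sorted Python list standing in for an HBase table. Everything here is a simplified model under stated assumptions: the dimension order, the bitmask used as cuboid ID, the plain-string row keys and the helper names are hypothetical and do not reproduce Kylin's actual row key encoding.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from itertools import combinations

DIMS = ("day", "flag", "status")  # hypothetical dimension order in the row key
MAX_STR = chr(0x10FFFF)           # sorts after any ordinary dimension value

# Hypothetical fact rows: (day, flag, status, quantity)
FACTS = [
    ("1998-09-01", "R", "F", 2),
    ("1998-09-01", "R", "F", 3),
    ("1998-09-10", "N", "F", 1),
    ("1998-09-16", "N", "O", 4),
    ("1998-10-01", "R", "F", 5),
]


def cuboid_id(dims):
    """Bitmask with one bit per dimension present in the cuboid."""
    return sum(1 << (len(DIMS) - 1 - DIMS.index(d)) for d in dims)


def build_store(facts):
    """Offline step: aggregate every cuboid into a sorted key-value list."""
    cells = defaultdict(int)
    for n in range(len(DIMS) + 1):
        for dims in combinations(DIMS, n):
            idx = [DIMS.index(d) for d in dims]
            for row in facts:
                key = (cuboid_id(dims),) + tuple(row[i] for i in idx)
                cells[key] += row[3]
    return sorted(cells.items())


def scan(store, dims, first_dim_max):
    """Columns -> cuboid ID; filter on the leading dimension -> key range."""
    cid = cuboid_id(dims)
    keys = [k for k, _ in store]
    lo = bisect_left(keys, (cid,))
    hi = bisect_right(keys, (cid, first_dim_max, MAX_STR))
    return store[lo:hi]


def to_result(hits, dims, group_by):
    """Translate (key, value) pairs back into dimensions + measures."""
    idx = [dims.index(d) + 1 for d in group_by]  # +1 skips the cuboid ID
    out = defaultdict(int)
    for key, value in hits:
        out[tuple(key[i] for i in idx)] += value
    return dict(out)


store = build_store(FACTS)
hits = scan(store, DIMS, "1998-09-16")
result = to_result(hits, DIMS, ("flag", "status"))
```

Because the store is sorted by key, the filter becomes a contiguous range found by binary search rather than a full scan, which mirrors the "filters -> scan range (row key)" step on the slide.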
  • 18. High performance & high concurrency together. Sub-second latency on PB-scale datasets. Star schema benchmark: http://www.cs.umb.edu/~poneil/StarSchemaB.PDF [Charts: SQL Latency (lower is better); Data Volume Scale (lower is better)]
  • 19. Seamless integration with BI tools. From open source to commercial BI
  • 21. Apache Kylin Use Cases. Apache Kylin fits various scenarios; 1000+ adoptions all over the world. Solutions: Behavior Analytics; Log Analysis; Data Mart/DW; Self-service Data Service; Retail Analytics; Financial Asset; Advertising Analytics; Real-time Analytics; Gaming Analysis.
  • 22. Use Case: Insight on Trillion Data. Top 1 news feed app in China
  • 23. Use Case: PB-level Analytics Platform. Top O2O services provider in China, supporting all critical business lines including E-Takeaways, Hotel, Movie, LBS, Tickets… Cube storage: 971 TB (almost 1 PB). Number of cubes: 973. Cube data records: 8.9 trillion rows. 90th percentile latency: < 1.2 s. Frequency: 3.8 million queries / day. Last updated: 2018-08.
  • 24. Use Case: Online Shopping Reporting. Yahoo! Japan, the most visited website in Japan. Source: https://techblog.yahoo.co.jp/oss/apache-kylin/ ▪ We provide a reporting system that shows statistics for store owners, e.g. impressions, clicks and sales. ▪ Our reporting system previously used Impala as the backend database; it took a long time (about 60 sec) to show the Web UI. ▪ To lower the latency, we moved to Apache Kylin; average latency is < 1 sec in most cases. ▪ Thanks to Kylin's low latency, we can now focus on adding functions for users.
  • 25. Use Case: Data Factory for Business. Leading smart phone and smart device manufacturer. Serving 18 business lines as the engine for Mi's "data factory"; daily increment of 17 billion; 95% of queries < 500 ms.
  • 26. Use Case: Website Traffic Analytics. A company providing convenient, one-stop website building solutions. "The data platform based on Apache Kylin solved the problem of massive user queries excellently." (Chase Zhang, Data Platform Engineer at Strikingly) Performance: uses Apache Kylin to speed up analytics with Keen.io and to support high concurrency. Containerizing: Apache Kylin runs on AWS ECS. Integration: developed a scheduler system to manage all kinds of jobs.
  • 28. Apache Kylin Roadmap (from the community): new storage support (Parquet); real-time support; containerization.
  • 30. Traditional Data Warehousing: Enormous Manual Efforts and Repeated Work. © Kyligence Inc. 2018, Confidential.
  • 31. Intelligence and Automation: The Future of Data Analytics. Human Intelligence vs. Artificial Intelligence
  • 32. Augmented Analytics Platform. Diagram labels: SQL Query Log; Analytic Behavior; Data Schema; Data Profile; ML-based Discovery of Analytic Pattern; Proprietary Data Modeling Automation; Self-directed Storage Layer Optimization; Intelligent Query Push-down & Routing; BI; Real-time Analysis; Data-as-a-Service; Local Deployment; Cloud Platform; Container; Data Services.
  • 33. Machine Learning Augmented Analytics. Available from Kyligence 3.x. http://kyligence.io
  • 34. Kyligence Cloud: Transforming Big Data Analytics to Cloud. Diagram labels: Kyligence Cloud; ANSI SQL; Dashboard; OLAP; Hadoop; Customer Cloud Account; client; cloud; Kyligence Enterprise Platform; streaming; Cluster Deploy; Account Management; Diagnosis & Optimization; Queries & Reporting; cloud storage; tables, logs, files; RDBMS (metadata); Cloud Data Warehouse; Cluster Management.
  • 35. Kyligence Cloud: Speed up mission-critical analytics in the cloud. Available on AWS, Azure, Google Cloud, Alibaba Cloud, Huawei Cloud. One-click provisioning: deploy globally in 30 minutes. Auto scaling: scale the cluster automatically for different workloads. High performance: powered by Kyligence Analytics Platform. Seamless integration: connect to cloud data sources; enterprise ODBC driver for BI. Intelligent ops: online diagnosis and continuous optimization.
  • 36. Use Case: Replaced IBM Cognos. 1 Kyligence cube replaced 800+ IBM Cognos cubes. Self-Service Big Data Warehouse: PB-level (300B records) big data warehouse serving both self-service aggregation queries and raw data queries by business analysts. Efficient IT Operation: significantly increased IT operation efficiency, as 1 Kyligence cube replaced 800 Cognos cubes with unified data access management. Better Flexibility of Architecture: the Kyligence scale-out architecture provides the best flexibility for IT infrastructure when faced with increasing analytics and concurrency demands. Merchant or Card Multi-dimensional Analytics: supports analysis on high-granularity dimensions such as Merchant (10M cardinality) and Card (10B cardinality).
  • 37. Use Case: Customer 360 for FMCG. Azure + Kyligence ➢ 360-degree view of user profile. ➢ Giving analysts insight into data without IT. ➢ HDInsight + Kyligence + Power BI
  • 38. Global Partners: Kyligence Open Ecosystem. Microsoft Azure Partner; AWS Technology Partner; Tableau Technology Partner; Cloudera Silver Partner; MapR Converge Partner; Hortonworks Community Partner; Huawei Solution Partner.