Apache Kylin and Use Cases - 2018 Big Data Spain

Luke Han
Apache Kylin & Use Cases
Luke Han | luke.han@kyligence.io
2018 Big Data Spain
About Luke Han
• Co-founder & CEO at Kyligence
• Co-creator and PMC Chair of Apache Kylin
• Apache Software Foundation Member
• Microsoft Regional Director & MVP
• Former eBay Big Data Product Manager Lead
© Kyligence Inc. 2018.
About Kyligence
Kyligence = Kylin + Intelligence
- Kyligence was formed by the team who created Apache Kylin, the leading open source OLAP for Big Data. Kyligence provides an intelligent data warehouse built for data cognitive analytics at web scale.
- Funded by leading VCs:
- Redpoint Ventures, Cisco,
- CBC Capital and Shunwei Capital,
- Eight Roads Ventures (Fidelity International Arm)
- CRN Top 10 Big Data Startups 2018
Apache Kylin
About Apache Kylin
• Leading Open Source OLAP for Big Data
• Open sourced by eBay in 2014
• Graduated to Apache Top-Level Project in 2015
• 1000+ adoptions worldwide
• 2015 InfoWorld Bossie Awards
• 2016 InfoWorld Bossie Awards
1000+ Global Users
Apache Kylin - Leading Open Source OLAP for Big Data
OLAP: The Missing Part of Big Data
[Diagram layers: Presentation / Visualization; SQL engines (Hive, Impala, Spark SQL, Drill); compute (MapReduce … Spark); Data Lake; Data Source]
o Too many options
o Low performance
o Long learning curve
o Compatibility issue
o Technology vs Data
Apache Kylin: Bring OLAP back to Big Data
[Diagram layers: Presentation / Visualization; OLAP / Data Mart; SQL engines (Hive, Impala, Spark SQL, Drill); compute (MapReduce … Spark); Data Lake; Data Source]
o SQL Acceleration for Big Data
o Semantic Layer
o Speed up Analytics
o ANSI SQL Interface
o High Performance and High Concurrency
Apache Kylin
Technical Highlights
OLAP and OLAP Cube
Online analytical processing, or OLAP, is an approach to answering multi-dimensional analytical (MDA) queries swiftly in computing. – Wikipedia
Basic operations
– Roll-up
– Drill-down
– Slice and dice
– Pivot
An OLAP cube is a data structure optimized for very quick data analysis.
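The basic operations listed above can be illustrated with a minimal Python sketch. The fact rows, the dimension names (year, region, product) and the helper functions are hypothetical and chosen only for illustration; this is not how Kylin implements these operations.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales)
FACTS = [
    (2017, "EU", "phone", 10),
    (2017, "EU", "tablet", 5),
    (2017, "US", "phone", 7),
    (2018, "EU", "phone", 12),
    (2018, "US", "tablet", 3),
]
DIMS = ("year", "region", "product")


def aggregate(rows, group_by):
    """Sum sales grouped by the named dimensions."""
    idx = [DIMS.index(d) for d in group_by]
    out = defaultdict(int)
    for row in rows:
        out[tuple(row[i] for i in idx)] += row[-1]
    return dict(out)


def slice_rows(rows, dim, value):
    """Slice: keep only rows where one dimension equals a fixed value.
    A dice would apply the same kind of filter on several dimensions."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]


def pivot(agg):
    """Pivot a two-dimension aggregate into {row_key: {col_key: value}}."""
    table = defaultdict(dict)
    for (r, c), v in agg.items():
        table[r][c] = v
    return dict(table)


drill_down = aggregate(FACTS, ("year", "region"))  # finer grain
roll_up = aggregate(FACTS, ("year",))              # coarser grain
eu_only = aggregate(slice_rows(FACTS, "region", "EU"), ("product",))
by_year_region = pivot(drill_down)
```

Roll-up and drill-down are the same aggregation at a coarser or finer set of dimensions, which is why a cube that stores several such aggregates can serve both.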
Cube: balance between space and time
[Diagram labels: OLAP Cube (Key-Value); Multiple Dimensional Model (Relational); classification, aggregation, and sorting]
Apache Kylin Architecture Overview
[Diagram: Data Analyst, BI Tools, Web App… query Apache Kylin through SQL. Labels: Online calculation; Offline calculation; Optimize & Rewrite; Scan & filter; Extract; Compute; Load]
SQL execution plan without Cube
Sample: check the relationship between order return flag and order status in a time range
select
    l_returnflag,
    o_orderstatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price
from
    v_lineitem
    inner join v_orders on l_orderkey = o_orderkey
where
    l_shipdate <= '1998-09-16'
group by
    l_returnflag,
    o_orderstatus
order by
    l_returnflag,
    o_orderstatus;
[Plan diagram: Tables → Join → Filter → Aggr. → Sort; cost O(N)]
Without a cube, every step is calculated online. The query is CPU and IO intensive, and latency is significant.
SQL execution plan with Cube
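Cube technology speeds up query performance with pre-calculation.
[Plan diagram, offline: Tables → Join → Filter → Aggr. → Sort; cost O(N); output: aggregated data]
[Plan diagram, at query time: Cube → Filter → Sort; cost O(flag x status x days) = O(1)]
The table join and aggregation are completed offline.
The cube size depends on the number of distinct (flag, status, day) combinations, not on the number of source rows N.
Queries read directly from the aggregated data (cube) with an index: much less CPU and IO, and low latency.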
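The contrast between the two plans can be sketched in plain Python. The tiny ORDERS and LINEITEM data and all function names below are hypothetical; the sketch only illustrates pre-aggregation by (flag, status, day) and is not Kylin's cube build or query engine.

```python
from collections import defaultdict

# Hypothetical miniature versions of the two tables in the sample query.
ORDERS = {1: "F", 2: "O", 3: "F"}  # o_orderkey -> o_orderstatus
LINEITEM = [
    # (l_orderkey, l_returnflag, l_shipdate, l_quantity, l_extendedprice)
    (1, "R", "1998-09-01", 10, 100.0),
    (1, "N", "1998-09-20", 5, 50.0),
    (2, "N", "1998-08-15", 3, 30.0),
    (3, "R", "1998-09-16", 2, 20.0),
    (3, "R", "1998-09-16", 4, 40.0),
]


def query_online(cutoff):
    """Join, filter, aggregate and sort at query time: touches every row, O(N)."""
    acc = defaultdict(lambda: [0, 0.0])
    for okey, flag, day, qty, price in LINEITEM:
        if day <= cutoff:  # ISO dates compare correctly as strings
            cell = acc[(flag, ORDERS[okey])]
            cell[0] += qty
            cell[1] += price
    return sorted((k[0], k[1], v[0], v[1]) for k, v in acc.items())


def build_cube():
    """Offline step: join and pre-aggregate by (flag, status, day)."""
    cube = defaultdict(lambda: [0, 0.0])
    for okey, flag, day, qty, price in LINEITEM:
        cell = cube[(flag, ORDERS[okey], day)]
        cell[0] += qty
        cell[1] += price
    return dict(cube)


def query_cube(cube, cutoff):
    """Query step: filter and re-aggregate only the pre-aggregated cells."""
    acc = defaultdict(lambda: [0, 0.0])
    for (flag, status, day), (qty, price) in cube.items():
        if day <= cutoff:
            cell = acc[(flag, status)]
            cell[0] += qty
            cell[1] += price
    return sorted((k[0], k[1], v[0], v[1]) for k, v in acc.items())


CUBE = build_cube()
RESULT = query_cube(CUBE, "1998-09-16")
```

Both paths return the same rows; the difference is that `query_cube` iterates over cube cells, whose count is bounded by the distinct dimension combinations rather than by the size of the source tables.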
Multidimensional Schema
Apache Kylin supports star schema and snowflake schema.
[Diagram 1: ORDERS, CUSTOMER, SUPPLIER, PART, LINEITEM, PARTSUPP, NATION, REGION connected by joins]
[Diagram 2: ORDERS, CUSTOMER, PART, LINEITEM, PARTSUPP connected by joins]
Persisting the cube in HBase
Relational to key-value store
How to query the cube
Translate cube query into HBase table scan
– Columns, Group by -> Cuboid ID
– Filters -> Scan Range (Row Key)
– Aggregations -> Measure Columns (Row Values)
Scan HBase table and translate HBase result into cube result
– HBase Result (key + value) -> Cube Result (dimensions + measures)
No Hive access and no MapReduce job at query time
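The translation steps above can be sketched with a sorted in-memory list standing in for an HBase table. Everything here is an assumption made for illustration: the dimension order, the bitmask form of the cuboid ID, the tuple row key and the function names. Kylin's real row-key encoding is not reproduced.

```python
import bisect

DIMS = ("day", "flag", "status")  # hypothetical row-key dimension order


def cuboid_id(columns):
    """Bitmask with one bit per dimension referenced by the query."""
    return sum(1 << (len(DIMS) - 1 - DIMS.index(c)) for c in set(columns))


# Hypothetical sorted key-value table standing in for an HBase table.
# key = (cuboid_id, day, flag, status); value = sum(l_quantity)
TABLE = sorted([
    ((cuboid_id(DIMS), "1998-08-15", "N", "O"), 3),
    ((cuboid_id(DIMS), "1998-09-01", "R", "F"), 10),
    ((cuboid_id(DIMS), "1998-09-16", "R", "F"), 6),
    ((cuboid_id(DIMS), "1998-09-20", "N", "F"), 5),
])
KEYS = [k for k, _ in TABLE]


def scan(columns, day_max):
    """Filters -> scan range: turn 'day <= day_max' into a row-key range."""
    cid = cuboid_id(columns)
    start = bisect.bisect_left(KEYS, (cid,))
    # chr(0x10FFFF) sorts after any flag value, closing the range at day_max.
    stop = bisect.bisect_right(KEYS, (cid, day_max, chr(0x10FFFF)))
    return TABLE[start:stop]


def to_cube_result(rows):
    """Key-value rows -> cube result: (flag, status) -> sum of the measure."""
    out = {}
    for (_cid, _day, flag, status), qty in rows:
        out[(flag, status)] = out.get((flag, status), 0) + qty
    return out


ROWS = scan(("flag", "status", "day"), "1998-09-16")
CUBE_RESULT = to_cube_result(ROWS)
```

Because the keys are sorted, the filter becomes two binary searches and a contiguous read, which is the property the slide relies on when it maps filters to a row-key scan range.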
High performance & High concurrency together
Sub-second latency on PB scale dataset
Star schema benchmark: http://www.cs.umb.edu/~poneil/StarSchemaB.PDF
[Charts: SQL Latency (lower is better); Data Volume Scale (lower is better)]
Seamless integration with BI tools
From open source to commercial BI
Apache Kylin
Use Cases
Apache Kylin Use Cases
Solution
• Behavior Analytics
• Log Analysis
• Data Mart/DW
• Self-service Data Service
• Retail Analytics
• Financial Asset
• Advertising Analytics
• Real-time Analytics
• Gaming Analysis
Apache Kylin fits various scenarios
1000+ adoptions all over the world
Use Case – Insight on Trillion Data
Top 1 news feed app in China
Use Case: PB-level Analytics Platform
Cube Storage: 971 TB (almost 1 PB)
Number of cubes: 973
Data Records: 8.9 trillion rows
90th percentile latency: < 1.2 s
Frequency: 3.8 million queries / day
Top O2O services provider in China
Supporting all critical business lines including E-Takeaways, Hotel, Movie, LBS, Tickets…
Last updated: 2018-08
Use Case: Online Shopping Reporting
Yahoo! Japan: the most visited website in Japan
https://techblog.yahoo.co.jp/oss/apache-kylin/
▪ We provide a reporting system that shows statistics for store owners.
- e.g. impressions, clicks and sales.
▪ Our reporting system previously used Impala as a backend database.
- It took a long time (about 60 sec) to show the Web UI.
▪ In order to lower the latency, we moved to Apache Kylin.
- Average latency < 1 sec for most cases
▪ Thanks to Kylin's low latency, we can now focus on adding features for users.
Use Case: Data Factory for Business
• Serving 18 business lines as the engine for Mi's "data factory"
• Daily increment: 17 billion
• 95% of queries < 500 ms
Leading smartphone and smart device manufacturer
Use Case – Website Traffic Analytics
A company that provides convenient, one-stop website building solutions.
"The data platform based on Apache Kylin solved the problem of massive user queries excellently."
-- Chase Zhang, Data Platform Engineer of Strikingly
Performance
• Use Apache Kylin to speed up analytics with Keen.io, and support high concurrency
Containerizing
• Apache Kylin runs on AWS ECS
Integration
• Developed a scheduler system to manage all kinds of jobs
Apache Kylin
Roadmap
Apache Kylin Roadmap
• New storage support
– Parquet
• Real-time support
• Containerization
From the community
About Kyligence
Traditional Data Warehousing
Enormous Manual Effort and Repeated Work
© Kyligence Inc. 2018, Confidential.
Intelligence and Automation
The future of Data Analytics
Human Intelligence vs. Artificial Intelligence
Augmented Analytics Platform
[Diagram labels:
SQL Query Log; Analytic Behavior; Data Schema; Data Profile
ML-based Discovery of Analytic Pattern; Proprietary Data Modeling Automation; Self-directed Storage Layer Optimization; Intelligent Query Push-down & Routing
BI; Real-time Analysis; Data-as-a-Service
Local Deployment; Cloud Platform; Container
Data Services]
Machine Learning
Augmented Analytics
Available from Kyligence 3.x
http://kyligence.io
Kyligence Cloud
Transforming Big Data Analytics to Cloud
[Architecture diagram labels: client; cloud; ANSI SQL; Dashboard; OLAP; Hadoop; streaming; cloud storage; tables, logs, files; RDBMS (metadata); Cloud Data Warehouse; Customer Cloud Account; Kyligence Enterprise Platform; Cluster Deploy; Cluster Management; Account Management; Diagnosis & Optimization; Queries & Reporting]
Kyligence Cloud
Speed up mission-critical analytics in the cloud
Available: AWS, Azure, Google Cloud, Alibaba Cloud, Huawei Cloud
• One-click provisioning: Deploy globally in 30 minutes
• Auto Scaling: Scale cluster automatically for different workloads
• High Performance: Powered by Kyligence Analytics Platform
• Seamless Integration: Connect to cloud data sources; Enterprise ODBC driver for BI
• Intelligent Ops: Online diagnosis and continuous optimization
Use Case : Replaced IBM Cognos
1 Kyligence cube replaced 800+ IBM Cognos cubes
• Self-Service Big Data Warehouse: PB-level (300B records) big data warehouse serving both self-service aggregation queries and raw data queries by business analysts
• Efficient IT Operation: Significantly increased IT operation efficiency, with 1 Kyligence cube replacing 800 Cognos cubes under unified data access management
• Better Flexibility of Architecture: Kyligence's scale-out architecture provides the best flexibility for IT infrastructure when faced with increasing analytics and concurrency demands
• Merchant or Card Multi-dimensional Analytics: Supports analysis on high-granularity dimensions such as Merchant (10M cardinality) and Card (10B cardinality)
Use Case: Customer 360 for FMCG
Azure + Kyligence
➢ 360-degree view of user profile
➢ Empowering analysts to gain insight into data without IT
➢ HDInsight + Kyligence + Power BI
Global Partners
Kyligence Open Ecosystem
Microsoft Azure Partner
AWS Technology Partner
Tableau Technology Partner
Cloudera Silver Partner
MapR Converge Partner
Hortonworks Community Partner
Huawei Solution Partner
Q & A
luke.han@kyligence.io
Dev-Cloud Conference 2023 - Continuous Deployment Showdown: Traditionelles CI...Dev-Cloud Conference 2023 - Continuous Deployment Showdown: Traditionelles CI...
Dev-Cloud Conference 2023 - Continuous Deployment Showdown: Traditionelles CI...
Marc Müller37 visualizações
2023-November-Schneider Electric-Meetup-BCN Admin Group.pptx por animuscrm
2023-November-Schneider Electric-Meetup-BCN Admin Group.pptx2023-November-Schneider Electric-Meetup-BCN Admin Group.pptx
2023-November-Schneider Electric-Meetup-BCN Admin Group.pptx
animuscrm14 visualizações
DSD-INT 2023 Exploring flash flood hazard reduction in arid regions using a h... por Deltares
DSD-INT 2023 Exploring flash flood hazard reduction in arid regions using a h...DSD-INT 2023 Exploring flash flood hazard reduction in arid regions using a h...
DSD-INT 2023 Exploring flash flood hazard reduction in arid regions using a h...
Deltares5 visualizações
DSD-INT 2023 Delft3D FM Suite 2024.01 2D3D - New features + Improvements - Ge... por Deltares
DSD-INT 2023 Delft3D FM Suite 2024.01 2D3D - New features + Improvements - Ge...DSD-INT 2023 Delft3D FM Suite 2024.01 2D3D - New features + Improvements - Ge...
DSD-INT 2023 Delft3D FM Suite 2024.01 2D3D - New features + Improvements - Ge...
Deltares17 visualizações
DSD-INT 2023 Leveraging the results of a 3D hydrodynamic model to improve the... por Deltares
DSD-INT 2023 Leveraging the results of a 3D hydrodynamic model to improve the...DSD-INT 2023 Leveraging the results of a 3D hydrodynamic model to improve the...
DSD-INT 2023 Leveraging the results of a 3D hydrodynamic model to improve the...
Deltares6 visualizações
Airline Booking Software por SharmiMehta
Airline Booking SoftwareAirline Booking Software
Airline Booking Software
SharmiMehta5 visualizações
DSD-INT 2023 Salt intrusion Modelling of the Lauwersmeer, towards a measureme... por Deltares
DSD-INT 2023 Salt intrusion Modelling of the Lauwersmeer, towards a measureme...DSD-INT 2023 Salt intrusion Modelling of the Lauwersmeer, towards a measureme...
DSD-INT 2023 Salt intrusion Modelling of the Lauwersmeer, towards a measureme...
Deltares5 visualizações
Copilot Prompting Toolkit_All Resources.pdf por Riccardo Zamana
Copilot Prompting Toolkit_All Resources.pdfCopilot Prompting Toolkit_All Resources.pdf
Copilot Prompting Toolkit_All Resources.pdf
Riccardo Zamana8 visualizações
Keep por Geniusee
KeepKeep
Keep
Geniusee75 visualizações
Sprint 226 por ManageIQ
Sprint 226Sprint 226
Sprint 226
ManageIQ5 visualizações
DSD-INT 2023 Machine learning in hydraulic engineering - Exploring unseen fut... por Deltares
DSD-INT 2023 Machine learning in hydraulic engineering - Exploring unseen fut...DSD-INT 2023 Machine learning in hydraulic engineering - Exploring unseen fut...
DSD-INT 2023 Machine learning in hydraulic engineering - Exploring unseen fut...
Deltares7 visualizações
SUGCON ANZ Presentation V2.1 Final.pptx por Jack Spektor
SUGCON ANZ Presentation V2.1 Final.pptxSUGCON ANZ Presentation V2.1 Final.pptx
SUGCON ANZ Presentation V2.1 Final.pptx
Jack Spektor22 visualizações

Apache Kylin and Use Cases - 2018 Big Data Spain

  • 1. Apache Kylin & Use Cases Luke Han | luke.han@kyligence.io 2018 Big Data Spain
  • 2. About Luke Han • Co-founder & CEO at Kyligence • Co-creator and PMC Chair of Apache Kylin • Apache Software Foundation Member • Microsoft Regional Director & MVP • Former eBay Big Data Product Manager Lead
  • 3. About Kyligence Kyligence = Kylin + Intelligence - Kyligence was formed by the team who created Apache Kylin, the leading open source OLAP for Big Data. Kyligence provides an intelligent data warehouse built for data cognitive analytics at web scale. - Funded by leading VCs: - Redpoint Ventures, Cisco, - CBC Capital and Shunwei Capital, - Eight Roads Ventures (Fidelity International Arm) - CRN Top 10 Big Data Startups 2018
  • 5. About Apache Kylin • Leading Open Source OLAP for Big Data • Open sourced by eBay in 2014 • Graduated to Apache Top-Level Project in 2015 • 1000+ adoptions worldwide • 2015 InfoWorld Bossie Awards • 2016 InfoWorld Bossie Awards
  • 6. 1000+ Global Users Apache Kylin - Leading Open Source OLAP for Big Data
  • 7. OLAP: The Missing Part of Big Data o Too many options o Low performance o Long learning curve o Compatibility issue o Technology vs Data [Diagram layers: Presentation, Visualization, Data Lake, Data Source; engines: Hive, Impala, Spark SQL, Drill, MapReduce, Spark, …]
  • 8. Apache Kylin: Bring OLAP back to Big Data o SQL Acceleration for Big Data o Semantic Layer o Speed up Analytics o ANSI SQL Interface o High Performance and High Concurrency [Diagram layers: Presentation, Visualization, OLAP, Data Mart, Data Lake, Data Source; engines: Hive, Impala, Spark SQL, Drill, MapReduce, Spark, …]
  • 10. OLAP and OLAP Cube “Online analytical processing, or OLAP, is an approach to answering multi-dimensional analytical (MDA) queries swiftly in computing.” – Wikipedia Basic operations – Roll-up – Drill-down – Slice and dice – Pivot An OLAP cube is a data structure optimized for very quick data analysis.
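
The basic operations on slide 10 can be illustrated with a minimal Python sketch. The fact rows, dimension names, and helper functions below are hypothetical and are not taken from the talk or from Kylin itself; the sketch only shows how roll-up, drill-down, and slice relate to grouping and filtering.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales)
FACTS = [
    (2017, "EU", "phone", 10),
    (2017, "EU", "tablet", 5),
    (2017, "US", "phone", 7),
    (2018, "EU", "phone", 12),
    (2018, "US", "tablet", 3),
]
DIMS = ("year", "region", "product")


def aggregate(rows, group_by):
    """Sum sales grouped by the named dimensions."""
    idx = [DIMS.index(d) for d in group_by]
    out = defaultdict(int)
    for row in rows:
        out[tuple(row[i] for i in idx)] += row[3]
    return dict(out)


def slice_(rows, dim, value):
    """Slice: keep only the rows where one dimension has a fixed value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]


# Drill-down: a finer grouping (year and region).
drill_down = aggregate(FACTS, ("year", "region"))
# Roll-up: a coarser grouping (year only).
roll_up = aggregate(FACTS, ("year",))
# Slice on region = "EU", then roll up to year.
eu_only = aggregate(slice_(FACTS, "region", "EU"), ("year",))
```

Here `roll_up` holds one total per year, while `drill_down` breaks the same totals out by region.
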
  • 11. Cube: balance between space and time OLAP Cube --Key-Value Multiple Dimensional Model --Relational Classification, aggregation, and sorting
  • 12. Apache Kylin Architecture Overview [Diagram labels: Data Analyst, BI Tools, Web App…; SQL; Apache Kylin; Online calculation; Offline calculation; Scan & filter; Extract; Compute; Load; Optimize & Rewrite]
  • 13. SQL execution plan without Cube Sample: check the relationship between order return flag and order status in a time range. select l_returnflag, o_orderstatus, sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price from v_lineitem inner join v_orders on l_orderkey = o_orderkey where l_shipdate <= '1998-09-16' group by l_returnflag, o_orderstatus order by l_returnflag, o_orderstatus; Plan nodes: Sort, Aggr., Filter, Join, Tables O(N). Without a cube, everything is calculated online, which is CPU and IO intensive, and the latency is noticeable.
  • 14. SQL execution plan with Cube Cube technology speeds up query performance with pre-calculation. Plan nodes without cube: Sort, Aggr., Filter, Join, Tables O(N). Plan nodes with cube: Sort, Filter, Cube (aggregated data), O(flag x status x days) = O(1). The table join and aggregation are completed offline. Queries read directly from the indexed aggregated data (cube), with much less CPU and IO. Latency is small.
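
The pre-calculation idea on slides 13 and 14 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, the `build_cube` and `query` helpers, and the in-memory dictionaries are hypothetical and do not reflect Kylin's actual build engine or storage. The offline step aggregates every combination of dimensions once; the online step then filters and sums a small aggregated table instead of joining and scanning raw rows.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical, already-joined rows: (returnflag, orderstatus, shipdate, quantity)
ROWS = [
    ("A", "F", "1998-09-01", 10),
    ("A", "F", "1998-09-02", 4),
    ("N", "O", "1998-09-01", 6),
    ("R", "F", "1998-09-20", 8),
]
DIMS = ("returnflag", "orderstatus", "shipdate")


def build_cube(rows):
    """Offline step: sum(quantity) for every subset of the dimensions."""
    cube = {}
    for n in range(len(DIMS) + 1):
        for dims in combinations(range(len(DIMS)), n):
            cuboid = defaultdict(int)
            for row in rows:
                cuboid[tuple(row[i] for i in dims)] += row[3]
            cube[tuple(DIMS[i] for i in dims)] = dict(cuboid)
    return cube


CUBE = build_cube(ROWS)


def query(max_shipdate):
    """Online step: filter and sum the aggregated cuboid, not the raw rows."""
    cuboid = CUBE[("returnflag", "orderstatus", "shipdate")]
    result = defaultdict(int)
    for (flag, status, day), qty in cuboid.items():
        if day <= max_shipdate:  # ISO dates compare correctly as strings
            result[(flag, status)] += qty
    return dict(sorted(result.items()))


answer = query("1998-09-16")
```

With three dimensions the sketch materializes 2^3 = 8 aggregated tables, which is the space side of the space/time balance: more storage offline in exchange for less work per query.
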
  • 15. Multidimensional Schema Apache Kylin supports Star-Schema, Snowflake-Schema [Diagrams: ORDERS, CUSTOMER, SUPPLIER, PART, LINEITEM, PARTSUPP, NATION, REGION connected by joins; ORDERS, CUSTOMER, PART, LINEITEM, PARTSUPP connected by joins]
  • 16. Persist the cube in HBase Relational to Key Value store
  • 17. How to query the cube Translate cube query into HBase table scan – Columns, Group by -> Cuboid ID – Filters -> Scan Range (Row Key) – Aggregations -> Measure Columns (Row Values) Scan HBase table and translate HBase result into cube result – HBase Result (key + value) -> Cube Result (dimensions + measures) No Hive touch, no MapReduce job at query time
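
The translation on slide 17 can be sketched as follows. The layout is an assumption made for illustration: a sorted Python list stands in for an HBase table, the cuboid ID is a simple bitmask over the dimensions, and the row key is a tuple of cuboid ID and dimension value. Kylin's real row key encoding is more involved; the sketch only shows how group-by columns select a cuboid, how a filter becomes a key range, and how measures are read from the row values.

```python
from bisect import bisect_left, bisect_right

DIMS = ("returnflag", "orderstatus", "shipdate")


def cuboid_id(dims):
    """Bitmask with one bit per dimension in the cuboid (illustrative only)."""
    mask = 0
    for d in dims:
        mask |= 1 << (len(DIMS) - 1 - DIMS.index(d))
    return mask


# Sorted (row_key, measure) pairs stand in for an HBase table.
# Row key = (cuboid id, dimension value); value = sum(quantity).
TABLE = sorted([
    ((cuboid_id(("shipdate",)), "1998-09-01"), 16),
    ((cuboid_id(("shipdate",)), "1998-09-02"), 4),
    ((cuboid_id(("shipdate",)), "1998-09-20"), 8),
    ((cuboid_id(("returnflag",)), "A"), 14),
    ((cuboid_id(("returnflag",)), "N"), 6),
    ((cuboid_id(("returnflag",)), "R"), 8),
])
KEYS = [key for key, _ in TABLE]


def scan(dims, start, stop):
    """Range scan: rows of one cuboid whose key value lies in [start, stop]."""
    cid = cuboid_id(dims)
    lo = bisect_left(KEYS, (cid, start))
    hi = bisect_right(KEYS, (cid, stop))
    return TABLE[lo:hi]


def total_quantity_shipped_until(day):
    """Filter on shipdate -> scan range; aggregation -> sum of measure values."""
    return sum(value for _, value in scan(("shipdate",), "", day))


total = total_quantity_shipped_until("1998-09-16")
```

Because the keys are sorted, the filter touches only the contiguous range of rows that belong to the chosen cuboid and fall inside the date bound.
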
  • 18. High performance & High concurrency together Sub-second latency on PB scale dataset Star schema benchmark: http://www.cs.umb.edu/~poneil/StarSchemaB.PDF [Charts: SQL Latency (lower is better); Data Volume Scale (lower is better)]
  • 19. Seamless integration with BI tools From open source to commercial BI
  • 21. Apache Kylin Use Cases Solution • Behavior Analytics • Log Analysis • Data Mart/DW • Self-service Data Service • Retail Analytics • Financial Asset • Advertising Analytics • Real-time Analytics • Gaming Analysis Apache Kylin fits various scenarios 1000+ adoptions all over the world
  • 22. Use Case – Insight on Trillion Data Top 1 news feed app in China
  • 23. Use Case: PB-level Analytics Platform Top O2O services provider in China Supporting all critical business lines including E-Takeaways, Hotel, Movie, LBS, Tickets… Cube storage: 971 TB (almost 1 PB) Number of cubes: 973 Cube data records: 8.9 trillion rows 90th percentile latency: < 1.2 s Frequency: 3.8 million queries / day Last updated: 2018-08
  • 24. Use Case: Online Shopping Reporting Yahoo! Japan, the most visited website in Japan https://techblog.yahoo.co.jp/oss/apache-kylin/ ▪ Our reporting system previously used Impala as its backend database. - It took a long time (about 60 sec) to show the Web UI. ▪ To lower the latency, we moved to Apache Kylin. - Average latency < 1 sec for most cases ▪ Thanks to the low latency with Kylin, we are now able to focus on adding functions for users. ▪ We provide a reporting system that shows statistics for store owners, e.g. impressions, clicks and sales.
  • 25. Use Case: Data Factory for Business Leading smart phone and smart device manufacturer • Serving 18 business lines as the engine for mi’s “data factory” • Daily incremental 17 billion • 95% of queries < 500 ms
  • 26. Use Case – Website traffic Analytics A company providing convenient, one-stop website building solutions. “The data platform based on Apache Kylin solved the problem of massive user queries excellently.” -- Chase Zhang, Data Platform Engineer of Strikingly Performance • Use Apache Kylin to speed up analytics with Keen.io, and support high concurrency Containerizing • Apache Kylin runs on AWS ECS Integration • Developed a scheduler system to manage all kinds of jobs
  • 28. Apache Kylin Roadmap • New storage support – Parquet • Real-time support • Containerization From the community
  • 30. Traditional Data Warehousing Enormous Manual Efforts and Repeated Work © Kyligence Inc. 2018, Confidential.
  • 31. Intelligence and Automation The future of Data Analytics Human Intelligence VS Artificial Intelligence
  • 32. Augmented Analytics Platform [Diagram labels: SQL Query Log, Analytic Behavior, Data Schema, Data Profile; ML-based Discovery of Analytic Pattern; Proprietary Data Modeling Automation; Self-directed Storage Layer Optimization; Intelligent Query Push-down & Routing; BI, Real-time Analysis, Data-as-a-Service; Local Deployment, Cloud Platform, Container; Data Services]
  • 33. Machine Learning Augmented Analytics Available from Kyligence 3.x http://kyligence.io
  • 34. Kyligence Cloud Transforming Big Data Analytics to Cloud [Diagram labels: Kyligence Cloud; ANSI SQL; Dashboard; OLAP; Hadoop; Customer Cloud Account; client; cloud; Kyligence Enterprise Platform; streaming; Cluster Deploy; Account Management; Diagnosis & Optimization; Queries & Reporting; cloud storage (tables, logs, files); RDBMS (metadata); ANSI SQL; Cloud Data Warehouse; Cluster Management]
  • 35. Kyligence Cloud Speed up mission-critical analytics in the cloud Available: AWS, Azure, Google Cloud, Alibaba Cloud, Huawei Cloud • One-click provisioning: Deploy globally in 30 minutes • Auto Scaling: Scale cluster automatically for different workloads • High Performance: Powered by Kyligence Analytics Platform • Seamless Integration: Connect to cloud data sources; Enterprise ODBC driver for BI • Intelligent Ops: Online diagnosis and continuous optimization
  • 36. Use Case: Replaced IBM Cognos 1 Kyligence cube replaced 800+ IBM Cognos cubes • Self-Service Big Data Warehouse: PB-level (300B records) big data warehouse serving both self-service aggregation queries and raw data queries by business analysts • Efficient IT Operation: Significantly increased IT operation efficiency, as 1 Kyligence cube replaces 800 Cognos cubes with unified data access management • Better flexibility of Architecture: Kyligence scale-out architecture provides the best flexibility for IT infrastructure when faced with increasing analytics and concurrency demands • Merchant or Card Multi-dimensional Analytics: Supports analysis on high granularity dimensions such as Merchant (10M cardinality) and Card (10B cardinality)
  • 37. Use Case: Customer 360 for FMCG Azure + Kyligence ➢ 360 degree view of user profile. ➢ Powering analysts' insight into data without IT ➢ HDInsight + Kyligence + Power BI
  • 38. Global Partners Kyligence Open Ecosystem • Microsoft Azure Partner • AWS Technology Partner • Tableau Technology Partner • Cloudera Silver Partner • MapR Converge Partner • Hortonworks Community Partner • Huawei Solution Partner