Snowflake: The Good, the Bad, and the Ugly
Glenn Platkin & Kaige Liu
July 9, 2020
3. © Kyligence Inc. 2020, Confidential.
Who Am I?
30 years in software
• Computer Associates
• Business Objects
• Actuate
• Greenplum
• Kognitio
• 1010data
• Kyligence
What I've Seen
Is it the 1980s or Today?
Common Themes
• A LOT of data
• Slow hardware
• Difficult/limited tools
• Unhappy users
And Yet
• Users want even more data
• Hardware has its limits
• Too many tools
• Upgrades/conversions/legacy
• Cost/time/training
• DBAs render the data useless
• True analytics = detail + history
• Users are still unhappy
Reaction
• Newest shiny BI/reporting tool
• More/bigger/faster hardware
• Team of DBAs indexing/aggregating/limiting history
Summary
We've Come a Long Way . . . or Have We?
Timeline (~1980s to 2020):
• ~1980s: Mainframe; Tape/Files/Databases; COBOL/RPG/4GLs
• 1990s: Minicomputer; Relational DBMS; Reporting Tools
• 2000s: Client-Server; Data Marts/Warehouse; BI/Multi-Dimensional Tools
• 2005: Appliances; DB Machines; BI Tools; Analytics Tools
• 2010: Hadoop/Spark Clusters; Columnar DBMS; In-Memory
• 2020: Cloud; Next-Generation Analytics; Machine Learning
Snowflake – The Good
• Easy to Get Started
• Low Cost of Entry
• Secure
• Intuitive
• Autonomy/Control
• Consolidated Data
• Flexibility
• Less Management
• Reduced DBA
• Transparent
Snowflake – The Bad
The Promise
• Lots of detail, no more aggregations
• Lots of history, no more limitations
• Use your tool of choice
• Finally – True Analytics
• No Impact on Production
• Happy Users
The Reality
• The more data you have,
• The greater its complexity,
• The higher your concurrency:
• The greater the impact on performance
• Requiring additional compute
• DBAs to aggregate
• Higher monthly cost
Especially for Enterprise Accounts
At best, poor/mediocre results
Snowflake – The Ugly
Your Snowflake Monthly Statement
• Pay-per-Use Model
• GREAT – Just pay for what I use!!
• You pay for:
• Storage, Compute, Users
• B.Y.O.C. – Not an option, because there is too much profit
• Up-Lift: Standard, Premier, Enterprise, Premium, Business Critical
Do they care if your queries take longer?
Do they have a financial incentive to speed up your queries?
Especially for Enterprise Accounts
Top-3 Most Common Issues with Snowflake
Have we gone full circle, AGAIN?
#3. NOT Purpose-Built for Big Data / Performance
• Require Millions/Billions/Trillions of Rows of Detail
• High-Concurrency
• Complex Requests
• Unhappy Users
#2. Limited Remedies
• More Compute
• Limit Data
• DBAs Indexing/Aggregating
#1. Your Snowflake Monthly Statement
• Out-of-Control Costs
• Mediocre Results
• Unhappy CFO
Dilemma . . .
Requirements & Challenges
• Teradata
• Many Petabytes of Detail Data
• Complex Queries
• Many Users/High Concurrency
• Using a Variety of BI/Reporting/Analysis Tools
• High Performance at Any Scale
Solution . . .
Plan A:
• What was commercially available?
• Netezza, Greenplum, Vertica, Postgres, others
• None could scale
• None could perform
• All were too complex to implement and maintain
Plan B:
• Build
Solution
• OLAP / Pre-Compute
• Ultra-High Performance
• High Concurrency
• No Performance Degradation
• Reduced Complexity
• Easily Handles Complex Queries
• Slice/Dice/Pivot
• Supports BI Tools - SQL
• Unlimited Scale
• Data/Storage
• Compute
• Users/Concurrency
• Low Management
• Easy Implementation
• Cost-Effective
Apache Kylin
• 2013 - Created at eBay
• 2014 - Apache Software Foundation (Incubator)
• 2015 - Graduated Top-Level Project
• 2016 - Kylin ---> Kyligence
• Kyligence Enterprise
• Kyligence Cloud
• Kyligence
• Original Kylin members
• Still maintains/manages the Kylin community
• Most active contributors to Kylin
Kyligence
• Founded in 2016 by the creators of Apache Kylin
• Built around Kylin with augmented AI, enhanced to deliver unprecedented enterprise analytic performance
• CRN Top-10 big data startups in 2018
• Global Presence: San Jose, Seattle, New York, Shanghai, Beijing
• VCs: Fidelity International, Shunwei Capital, Broadband Capital, Redpoint, Cisco, Coatue
Funding Timeline
• 2016: Founded; Pre-A (Redpoint, Cisco)
• 2017: Series A (CBC, Shunwei)
• 2018: Series B (8Roads)
• 2019: Series C (Coatue)
Joint Solution: Kyligence + Snowflake
Benefits
• Ultra-low latency for aggregated queries
• High concurrency
• Unified Semantic Layer
• Support Excel Pivot table and other MDX scenarios
• Cost effective
[Architecture diagram. Layers: Applications; Semantic and Augmented Models; Data Warehouse; Data Source (Database, Events, Files, IoT). Components: Unified Semantic Layer, BI Integration, Access Control, Enterprise Security, AI-Based Indexes, Scalable Query Engine, Augmented Models.]
Kyligence Cloud - Accelerate Mission-Critical Analytics Intelligently
• Unified Query Entrance
• Business Semantic Layer
• Query Pattern for All Data
• High-Performance Engine
[Architecture diagram. Labels: ODBC/JDBC, API/SDK; SQL/MDX Semantic Services; Distributed Query Engine; AI-Augmented Engine; Smart Pushdown; Metadata Management; Enterprise Security; Aggregate Index; Data Lake; subject areas Finance, Marketing, Sales, Customer, Checkout; percentages 10%, 4%, 80%, 6% (not labeled in the extracted text).]
AI-Augmented Engine — Learn From Your Analytics
History
Under the Hood: Smart Cuboids
• Each model consists of N-dimensional cuboids; each cuboid covers one combination of the model's dimensions.
• Apache Spark is used to build the cuboids ahead of time, which is what makes query responses extremely fast.
• When a user sends a query, the model looks up the matching cuboid/segment and quickly returns the results.
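The cuboid idea above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Kyligence's or Kylin's implementation: the fact rows, the `DIMENSIONS` tuple, and the helper names `build_cuboids` and `query` are all hypothetical, and the build runs in-process here rather than on Spark.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, year, sales).
ROWS = [
    ("east", "widget", 2019, 10),
    ("east", "gadget", 2019, 5),
    ("west", "widget", 2020, 7),
    ("east", "widget", 2020, 3),
]
DIMENSIONS = ("region", "product", "year")


def build_cuboids(rows, dimensions):
    """Pre-aggregate SUM(sales) for every subset of the dimensions."""
    cuboids = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(range(len(dimensions)), size):
            totals = defaultdict(int)
            for row in rows:
                # Group key holds the chosen dimension values, in DIMENSIONS order.
                key = tuple(row[i] for i in dims)
                totals[key] += row[-1]
            cuboids[frozenset(dimensions[i] for i in dims)] = dict(totals)
    return cuboids


def query(cuboids, group_by):
    """Answer a GROUP BY by looking up the matching cuboid; no row scan."""
    return cuboids[frozenset(group_by)]


CUBOIDS = build_cuboids(ROWS, DIMENSIONS)
print(len(CUBOIDS))                      # 2**3 subsets of three dimensions
print(query(CUBOIDS, ["region"]))
print(query(CUBOIDS, ["region", "year"]))
```

With three dimensions the sketch builds every one of the 2^3 cuboids. A real system would prune that set, since the number of combinations doubles with each added dimension.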
Kyligence - A Paradigm Shift in Cloud Analytics
[Chart: Response time vs. Data Volume. Online Calculation grows as O(N); Pre-Computation stays flat at O(1).]
Consistent Ultra-low Latency by Kyligence
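A toy way to see the O(N) versus O(1) contrast is to count how many values each approach touches at query time. The functions `online_sum`, `precomputed_sum`, and `compare` below are hypothetical names for this sketch; it counts work rather than measuring real latency, and it ignores the one-time build cost.

```python
def online_sum(rows):
    """Online calculation: touch every row at query time."""
    total, touched = 0, 0
    for value in rows:
        total += value
        touched += 1
    return total, touched


def precomputed_sum(cache):
    """Pre-computation: read one stored aggregate at query time."""
    return cache["sum"], 1


def compare(n):
    """Return (values touched online, values touched with pre-computation)."""
    rows = list(range(n))
    cache = {"sum": sum(rows)}  # build-time work, paid once before any query
    return online_sum(rows)[1], precomputed_sum(cache)[1]


for n in (1_000, 100_000):
    print(n, compare(n))
```

The first number grows with the data volume while the second stays at 1, which is the shape the chart describes.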
TPC-H Benchmark
• SF=50: Query Response Time | 0.5 Billion
• SF=500: Query Response Time | 5 Billion
For Each Dataset:
• No warmup
• Run each query 3 times
• Record the average time
• Lower is better
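The measurement procedure on this slide (no warmup, three runs per query, record the average) could be scripted roughly as follows. This is a sketch of the procedure only, not the harness used for the published numbers; `time_query`, `benchmark`, and the two stand-in workloads are assumptions made for illustration.

```python
import time
from statistics import mean


def time_query(run_query, repeats=3):
    """Run a query `repeats` times with no warmup; return the mean seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return mean(durations)


def benchmark(queries, repeats=3):
    """Map each query name to its average response time (lower is better)."""
    return {name: time_query(fn, repeats) for name, fn in queries.items()}


# Stand-in workloads; a real run would submit SQL to the engine under test.
results = benchmark({
    "q1": lambda: sum(range(10_000)),
    "q2": lambda: sorted(range(10_000, 0, -1)),
})
for name, seconds in results.items():
    print(f"{name}: {seconds:.6f} s")
```

Because there is no warmup pass, the first of the three runs includes any cold-cache cost, and the average reflects that.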
Unified Semantic Layer
• Hierarchies/KPIs/Calculated Measures/Alias
• Excel, Tableau, Power BI
Elastic Scaling — Handle Peak Time Automatically
• Fewer compute and storage resources utilized
• Dynamic on-demand cluster resizing
• Uses spot instances
• Efficient planning for data growth
TCO Benchmark
• Build once, query anytime
• The more queries, the lower the cost per query
• Cost does not increase sharply under high concurrency
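The "build once, query anytime" claim is an amortization argument, and the arithmetic can be sketched as below. Every number here (`BUILD_COST`, `LOOKUP_COST`, `SCAN_COST`) is a made-up placeholder in arbitrary cost units, not a figure from the TCO benchmark, and the function names are hypothetical.

```python
def precompute_cost_per_query(build_cost, lookup_cost, n_queries):
    """Average cost per query when a one-time build is amortized over n queries."""
    return (build_cost + lookup_cost * n_queries) / n_queries


def break_even_queries(build_cost, lookup_cost, scan_cost):
    """Query count at which pre-computation matches scanning on every query."""
    return build_cost / (scan_cost - lookup_cost)


# Hypothetical values, in arbitrary cost units.
BUILD_COST = 100.0   # one-time pre-computation
LOOKUP_COST = 0.01   # per query against pre-computed results
SCAN_COST = 0.5      # per query when computing online

for n in (100, 1_000, 10_000):
    per_query = precompute_cost_per_query(BUILD_COST, LOOKUP_COST, n)
    print(n, round(per_query, 4), SCAN_COST)

print(round(break_even_queries(BUILD_COST, LOOKUP_COST, SCAN_COST), 2))
```

Under these assumed values the amortized cost per query keeps falling as the query count rises, while the online cost per query stays constant. Below the break-even count, the build cost dominates and pre-computation is the more expensive option.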
Top-3 Most Common Issues with Snowflake
#3. NOT Purpose-Built for Big Data / Performance ---> Purpose-Built for Big Data & Performance
• BIG Data (billions/trillions of rows of detail)
• High Concurrency
• Ultra-High Performance (sub-second response)
• Complex Requests (Hierarchies, High Cardinality, COUNT_DISTINCT)
• Flat/Consistent Performance
Top-3 Most Common Issues with Snowflake
#2. Limited Remedies ---> UNLIMITED Scale
• Unlimited Data/Users/Compute
• Little/No Maintenance
• Secure
• Unified Semantic Layer
• Any SQL/MDX-Based Tools & Excel
Top-3 Most Common Issues with Snowflake
#1. Your Snowflake Monthly Statement
#2. Limited Remedies
#3. NOT Purpose-Built for Big Data / Performance
Cost-Effective/Fixed-Cost
• B.Y.O.C.
• Sensible Subscription-Based Licensing
• Unlimited Users
• Unlimited DW/DL Size
• Fee Based on "Raw/Consumed Data"
• Not all data is consumed
• Detail data is not consumed; it is indexed
• Consumed data is used for pre-compute
• Small Required Storage
Kyligence + Snowflake = Perfect Couple
Right Solution for the Correct Challenge
• B.Y.O.C
• Any SQL/MDX-Based Tool
• Snowflake/Any Data Source
• Limitless Scale
• Ultra-High Performance
• True Analytics
• Unified Semantic Layer
• Simple, Easy, Predictable Licensing and Cost-Effective
• Perfect Complement to Snowflake!
Test Drive Kyligence for Free Today: https://kyligence.io/free-trial/
Follow us on Twitter and LinkedIn @Kyligence
Contact Us: info@kyligence.io
Editor's Notes
Glenn - short overview of Kyligence + customers
Kai - talk about Kyligence: short intro; joint solution Kyligence + Snowflake = overall benefit
Kai deep dive - the how: technical details of Kyligence + Snowflake
How did Kyligence achieve what Snowflake can't? Everything should be tied to Snowflake. Comparison: Snowflake does X, we do X. Always bring in Snowflake somehow. Stay relevant to the topic at hand. How Kyligence fixes Snowflake problems.
- Address the technical guys: how we fit into Snowflake
- Fit into your architecture: the "how" of Snowflake integrating with Kyligence. Can pick and choose which queries to accelerate from Snowflake. Technical details on why we are faster than Snowflake
Simple graph, closing: we are fast
Show lower TCO
Shirley Q: doesn't snowflake have elast