With cloud object storage, you'd expect business intelligence (BI) applications to benefit from the scale of data and real-time analytics. However, traditional BI in the cloud surfaces non-obvious challenges. Priyank Patel reviews service-oriented cloud design (storage, compute, catalog, security, SQL) and shows how native cloud BI provides analytic depth, low cost, and high performance.
1. Arcadia Data. Proprietary and Confidential
Design Patterns for BI in the Cloud
Priyank Patel, Co-founder and CPO
March 28th, 2019
2. Arcadia Data. Proprietary and Confidential
Canonical Architecture for BI – Circa 2000s
[Diagram: Data Sources (Customer Records, Customer Information, Product Data, Industry-Wide Data, Unstructured Documents), Data warehouse, Data mart, In-memory Cubing, Visualization]
3. Arcadia Data. Proprietary and Confidential
Key Principles of the Legacy Design…
[Diagram: Data Sources (Customer Records, Customer Information, Product Data, Industry-Wide Data, Unstructured Documents), MPP, Data warehouse, Data mart, In-memory Cubing, Visualization]
Co-located compute and storage: they scale together
Data duplicated in multiple systems: governance and security overhead
Data summarized: a choice between high performance and high granularity
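The trade-off in the last principle can be illustrated with a minimal sketch. The rows, column names, and the `summarize` helper are hypothetical and not taken from the slides. The sketch assumes a data mart that keeps only a per-region summary, and shows that a question about products can then only be answered from the detail rows.

```python
# Hypothetical detail rows; in the legacy design these stay in the warehouse.
detail = [
    {"region": "east", "product": "a", "amount": 10.0},
    {"region": "east", "product": "b", "amount": 20.0},
    {"region": "west", "product": "a", "amount": 5.0},
]

def summarize(rows, key):
    """Sum the 'amount' column grouped by a single key column."""
    out = {}
    for row in rows:
        out[row[key]] = out.get(row[key], 0.0) + row["amount"]
    return out

# What a summarized data mart might keep: small and fast, but coarse.
mart = summarize(detail, "region")

# A per-product question cannot be derived from `mart`;
# it needs the full-granularity detail rows.
by_product = summarize(detail, "product")
```

Once only `mart` is copied downstream, the product dimension is gone, which is the performance-versus-granularity choice the slide describes.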
4. Arcadia Data. Proprietary and Confidential
Cloud is Built on Decoupled Compute and Storage
Cloud Compute: instances, containers, serverless (Compute, Compute, Compute, …)
Cloud Storage: S3, ADLS, GCS
5. Arcadia Data. Proprietary and Confidential
Lift-and-Shift Existing Architecture to the Cloud
[Diagram: Data Sources (Customer Records, Customer Information, Product Data, Industry-Wide Data, Unstructured Documents), Data warehouse, Data mart, In-memory Cubing, Visualization]
6. Arcadia Data. Proprietary and Confidential
Lift-and-Shift Existing Architecture to the Cloud
[Diagram: Data Sources (Customer Records, Customer Information, Product Data, Industry-Wide Data, Unstructured Documents), Data Warehouse, Data Mart, In-memory Cubing, Visualization, Cloud Storage, Cloud Lake]
7. Arcadia Data. Proprietary and Confidential
Lift-and-Shift Architecture
[Diagram: Data Sources (Customer Records, Customer Information, Product Data, Industry-Wide Data, Unstructured Documents), Data Warehouse, Data Mart, In-memory Cubing, Visualization, Cloud Storage, Cloud Data Lake]
Well suited when:
Analytics can be done with small or medium-scale data
Misses the mark by:
Increasing the governance and data movement burden
Enlarging the security vulnerability footprint
Increasing latency to analytics
10. Arcadia Data. Proprietary and Confidential
[Diagram: Data Sources (Customer Records, Customer Information, Product Data, Industry-Wide Data, Unstructured Documents), Data warehouse, Data mart, In-memory Cubing, Visualization, Cloud storage, Cloud lake]
Instead of moving data to BI … move BI to data
11. Arcadia Data. Proprietary and Confidential
Cloud Native BI Architecture
[Diagram: Data Sources (Customer Records, Customer Information, Product Data, Industry-Wide Data, Unstructured Documents), Data warehouse, Data mart, In-memory Cubing, Visualization, Cloud storage, Cloud lake; Distributed BI/OLAP (Compute, Compute, Compute)]
Move BI to data
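One way to picture "move BI to data" is distributed aggregation: each compute node reduces the partition it sits next to, and only the small partial results are merged. The sketch below is an illustration under stated assumptions, not Arcadia's implementation. The in-memory `PARTITIONS` list stands in for objects in cloud storage, and the function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions standing in for objects in cloud storage.
PARTITIONS = [
    [("east", 10.0), ("west", 5.0)],
    [("east", 2.5), ("north", 7.0)],
    [("west", 1.0), ("west", 4.0)],
]

def partial_sum(rows):
    """Runs next to the data: reduce one partition to per-region subtotals."""
    totals = Counter()
    for region, amount in rows:
        totals[region] += amount
    return totals

def distributed_sum(partitions):
    """Fan out one task per partition, then merge the small partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_sum, partitions))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds values key by key
    return dict(merged)

result = distributed_sum(PARTITIONS)
```

Only the per-region subtotals cross the boundary between workers and the coordinator, so the volume moved depends on the number of groups rather than the number of rows. Threads are used here only to keep the sketch self-contained; a real engine would run the partial step on separate compute nodes.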
12. Arcadia Data. Proprietary and Confidential
Modernize BI Architecture: “In-Data Lake BI”
“ 3. Bring BI to data — physically:
… now you can "bring BI to data": run BI/analytics on exactly the same platform where the
DBMS is located. Such in-data-lake BI architecture reduces LAN/WAN data movement;
eliminates "choke points" like JDBC connectors; and enables BI applications, not just
DBMSes, to be fully distributed.
In addition to these technical merits, there's also a real business benefit: Curating and
modeling data before it is analyzed limits BI use cases. Analyzing data "as is" broadens the
number of questions it can answer.”
13. Arcadia Data. Proprietary and Confidential
Leading Companies are Now Choosing Two Enterprise BI Standards
14. Arcadia Data. Proprietary and Confidential
Requirements for Modern BI Native to Cloud and Data Lakes
Self-Service: Business users can easily analyze data.
In-Data-Lake BI (Distributed BI): Brings BI to data in existing data platforms.
Shared-service: Integrated with shared metadata and security.
15. Arcadia Data. Proprietary and Confidential
Arcadia Enterprise
ArcEngine. Distributed BI engine runs directly on cloud storage. Elastic and containerized.
Analytical Views. Pre-computed partial aggregates to accelerate reports and analytics. Automatically recommended. Incrementally maintained.
ArcViz. Native Visualizations.
3rd Party BI. Connect to any BI tool of choice.
[Diagram: Cloud Instances or Containers; Distributed BI and OLAP; Smart Acceleration; Cloud Object Storage]
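The general idea behind pre-computed partial aggregates that are incrementally maintained can be sketched as follows. This is a generic illustration, not the ArcEngine implementation. The `AnalyticalView` class, its methods, and the sample rows are all hypothetical. It stores a (sum, count) pair per group so that new batches can be folded in without rescanning old data, and so that coarser queries can be answered by rolling up the stored partials.

```python
from collections import defaultdict

class AnalyticalView:
    """Hypothetical view holding partial aggregates (sum, count) per group."""

    def __init__(self, group_cols, measure):
        self.group_cols = group_cols
        self.measure = measure
        self.cells = defaultdict(lambda: [0.0, 0])  # key -> [sum, count]

    def apply(self, new_rows):
        """Incremental maintenance: fold only the new rows into the view."""
        for row in new_rows:
            key = tuple(row[c] for c in self.group_cols)
            cell = self.cells[key]
            cell[0] += row[self.measure]
            cell[1] += 1

    def average_by(self, col):
        """Answer a coarser query by rolling up the stored partials."""
        idx = self.group_cols.index(col)
        rolled = defaultdict(lambda: [0.0, 0])
        for key, (total, count) in self.cells.items():
            rolled[key[idx]][0] += total
            rolled[key[idx]][1] += count
        return {k: total / count for k, (total, count) in rolled.items()}

view = AnalyticalView(["region", "product"], "amount")
view.apply([
    {"region": "east", "product": "a", "amount": 10.0},
    {"region": "east", "product": "b", "amount": 20.0},
])
view.apply([{"region": "east", "product": "a", "amount": 30.0}])  # later batch
avg = view.average_by("region")
```

Storing sum and count rather than the average itself is the design choice that makes the view mergeable: averages cannot be combined directly, but their components can.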
16. Arcadia Data. Proprietary and Confidential
See Arcadia Smart Acceleration in Action
6-minute video:
“Learn Arcadia Data - Smart Acceleration and Analytical Views”
https://www.youtube.com/watch?v=3VOYot7QfWM
17. Arcadia Data. Proprietary and Confidential
Data Drives Market Disruption
In Summary
Sometimes a lift-and-shift architecture works
Data is small or medium sized
But decoupling compute and storage is truly powerful only if the architecture of the BI engine leverages it
No duplication of data
Distributed BI and OLAP capability, elastic scale up and down
Automatic optimization
18. Social media: @arcadiadata, arcadiadata.com
Data Lake Analytics Research Benchmark: arcadiadata.com/data-lake-bi
See Search-Based BI in Action: arcadiadata.com/product/search-based-bi/
Download Arcadia Instant: arcadiadata.com/instant