You’re using Apache Hadoop and cloud-based data platforms, but can your BI and analytics tools keep up? Can you provide fast, secure, self-service access to all the data business users want?
Analyzing big data poses multiple challenges. Highly parallel distributed data architecture is one solution, but until recently it has been mostly limited to databases, not business intelligence (BI) application servers.
Join this informative webinar with guest speaker Boris Evelson, VP and principal analyst at Forrester Research, and Priyank Patel, co-founder and chief product officer at Arcadia Data. Enterprise architects, data scientists, and application development and delivery (AD&D) pros will learn:
• What is a distributed BI platform? How is it different from existing BI tools?
• How to scale BI and visual analytics for users without moving data
• What features matter most for distributed BI platforms for Hadoop
• How to unify security natively in Hadoop without more administration
3. Arcadia Data. Proprietary and Confidential
Presented by:
Boris Evelson, Vice President and Principal Analyst, Forrester Research
Priyank Patel, Co-Founder and CPO, Arcadia Data
4. Agenda:
1. Systems of Insight (SOI), Next Gen BI: How to scale SOI with native Hadoop BI platforms (Boris Evelson, Forrester)
2. Scale BI & Visual Analytics with Big Data (Priyank Patel, Arcadia Data)
3. Q&A
5. Poll: Where are you with your big data deployment?
a) Gathering knowledge - thinking about Hadoop or other scale-out data platforms
b) Developing strategy - defining architecture, selecting tools
c) Piloting - have big data analytics platform in place and beginning to experiment
d) Deployed - have defined use case and end-users are accessing and analyzing data
Submit your answers in the “vote” tab of BrightTALK!
33. Poll: How do you plan to give users access to analyze their data?
a) Development tools (e.g. Spark, MapReduce)
b) SQL engines (e.g. Hive, Impala, SparkSQL, Drill)
c) Traditional BI tools (e.g. Tableau, Qlik, MicroStrategy)
d) Hadoop-native, distributed BI platforms
e) Other (please specify in the comments section)
Submit your answers in the “vote” tab of BrightTALK!
34. Scale BI & Visual Analytics with Big Data
Priyank Patel
Co-founder and VP Products
35. Challenges with BI tools on Big Data
Data summarization
Big data fidelity loss
No access to real-time data
Higher security risk
Management and operational complexity
High TCO with multiple systems
[Diagram components: data users/analysts; BI/viz tools; BI server (cubes); data mart (extracts); data warehouse; operational data sources (order book, market data, electronic communications, trader data, OATS)]
36. Challenges with BI tools on Big Data
[Repeats the challenge list from slide 35; the diagram adds the label "100s of silos" to the operational data sources (order book, market data, electronic communications, trader data, OATS)]
37. Start with a Data Lake strategy …
[Diagram: ALL DATA (sensor data, clickstreams, security logs, CRM data, transactions) and the Data Lake]
38. But a Data Lake alone is not enough…
Data Lake becomes a data dump
Data consumption problems remain
[Diagram components: data users/analysts; BI/viz tools; BI server (cubes); data mart (extracts); data warehouse (extracts); ALL DATA (extracts); Data Lake; operational data sources (sensor data, clickstreams, security logs, CRM data, transactions)]
39. Arcadia Makes it Simple
[Diagram components: Data Lake; operational data sources (sensor data, clickstreams, security logs, CRM data, transactions)]
40. Data-native BI & Visual Analytics
Arcadia Data 2016. Proprietary and Confidential
Arcadia Data is a Hadoop-native platform that connects business users to big data
Distributed BI & Analytics Engine runs on each Hadoop node
Users connect via a web browser
BROWSER BASED
DATA-DRIVEN APPS
BIG DATA OS
Distributed execution, data storage (HDFS, S3, object stores), metadata, security
DATA-NATIVE COMPUTE ENGINE
On-Premise: Scales inside Hadoop clusters
In-Cloud: Elastically scales with compute resources
WEB BASED INTERFACE
Drag & drop interface focused on BI and exploratory analytics, edit and publish from the same place
41. Why does a data-native architecture matter for scaling BI?
AGILITY
Explore quickly & directly - don’t start with data marts, cubes, or extracts
Simple visual interface to exploration and semantic modeling on ALL of your data
Active data store continuously models data based on usage for fast concurrent access
APPLICATIONS
Actionable applications with embeddability
Production-quality dashboards and customer applications.
Support for real time as well as free text based analysis.
Point-and-click micro-segmentation and time-series event analytics
42. Runs directly on your Hadoop or cloud cluster. No cubes. No extracts.
Agility in a big data environment
Hundreds of concurrent business users
Sub-second performance for production reports
Thousands of shared data driven applications
100s of billions of rows
43. Eliminate dependence on cubes
Smart Acceleration™
1. Start with exploration of raw data, no need to determine design of acceleration structures such as cubes ahead of time
2. Recommendation engine generates AVs (Analytical Views, derived forms of raw data) based on dynamic data usage within Hadoop cluster
3. Re-routes data queries to AVs transparently, providing automated acceleration when needed for production/high concurrency uses
Automatically modeled and maintained within Hadoop cluster
Keep logical data models simple without needing to target specific data cube structures
[Diagram labels: Hadoop Cluster; Consumption Layer; Processing Layer; Queries; Queries automatically redirected; Results (100x Faster); In-memory; Analytical Views (stores derived forms of raw data in Hadoop); Recommendation Engine; Raw Data in Hadoop]
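The three Smart Acceleration steps on this slide (explore raw data, let usage drive the creation of analytical views, then re-route queries to them) can be illustrated with a small Python sketch. This is not Arcadia's implementation: the `Query` and `Accelerator` names, the in-memory rows, and the fixed usage threshold are hypothetical stand-ins for a recommendation engine running inside a Hadoop cluster, and view maintenance when raw data changes is not modeled.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Query:
    """A simple aggregate: SUM(measure) grouped by some dimensions."""
    table: str
    dimensions: tuple  # column names to group by
    measure: str       # numeric column to sum


@dataclass
class Accelerator:
    """Hypothetical usage-driven accelerator (illustration only)."""
    raw: dict                      # table name -> list of row dicts
    threshold: int = 2             # assumed: materialize after this many runs
    usage: Counter = field(default_factory=Counter)
    views: dict = field(default_factory=dict)  # Query -> precomputed result

    def _scan_raw(self, q):
        # Aggregate directly over the raw rows.
        totals = {}
        for row in self.raw[q.table]:
            key = tuple(row[d] for d in q.dimensions)
            totals[key] = totals.get(key, 0) + row[q.measure]
        return totals

    def run(self, q):
        # Step 3: transparently answer from an analytical view if one exists.
        if q in self.views:
            return "analytical_view", self.views[q]
        # Step 1: otherwise explore the raw data directly.
        result = self._scan_raw(q)
        # Step 2: record usage and materialize a view for frequently run queries.
        self.usage[q] += 1
        if self.usage[q] >= self.threshold:
            self.views[q] = result
        return "raw", result


trades = [
    {"exchange": "NYSE", "volume": 100},
    {"exchange": "NYSE", "volume": 50},
    {"exchange": "LSE", "volume": 70},
]
engine = Accelerator(raw={"trades": trades})
q = Query("trades", ("exchange",), "volume")
sources = [engine.run(q)[0] for _ in range(3)]
print(sources)
```

In this sketch the caller never names a view; the third run is served from the precomputed result simply because the same query was issued often enough, which is the "transparent" part of step 3.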
44. Access all data: Relational, Real Time, NoSQL and Search
45. Why does a data-native architecture matter for scaling BI?
AGILITY
Explore quickly & directly - don’t start with data marts, cubes, or extracts
APPLICATIONS
Actionable applications with embeddability
Production-quality dashboards and customer applications.
Support for real time as well as free text based analysis.
Point-and-click micro-segmentation, event analytics, dimension/measure correlations
46. Juxtaposing Real-time and Historical in One View
Visuals are coherent and permit interaction across data sources
Real-time feed from Apache Solr or Spark Streaming
Drill to detail in Kudu or HDFS
47. Cross-connection Data Blending
Visuals are coherent and permit interaction across data sources
Visual from Oracle
Visual from Teradata
Visuals from Apache Hadoop
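One way to picture visuals that "permit interaction across data sources" is a single dashboard-level selection applied to every connection's result set. The following is a minimal sketch under stated assumptions: in-memory lists stand in for the Oracle, Teradata, and Hadoop connections, and the `Visual` class, the `apply_filter` helper, and the column names are hypothetical, not part of any Arcadia API.

```python
from dataclasses import dataclass


@dataclass
class Visual:
    """A chart bound to rows from one connection (hypothetical)."""
    source: str
    rows: list

    def filtered(self, column, value):
        # Rows that lack the column are kept, so unrelated visuals stay intact.
        return [r for r in self.rows if r.get(column, value) == value]


def apply_filter(visuals, column, value):
    """Apply one user selection to every visual on the dashboard."""
    return {v.source: v.filtered(column, value) for v in visuals}


dashboard = [
    Visual("oracle", [{"region": "EMEA", "orders": 12},
                      {"region": "APAC", "orders": 7}]),
    Visual("teradata", [{"region": "EMEA", "revenue": 340.0}]),
    Visual("hadoop", [{"region": "APAC", "clicks": 9100},
                      {"region": "EMEA", "clicks": 4800}]),
]
selection = apply_filter(dashboard, "region", "EMEA")
print(selection)
```

The design choice here is that the shared dimension (`region`) is the only thing the sources need to agree on; each visual keeps its own measures.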
48. Build Modern Web Applications Driven by Your Big Data
49. [Customer examples; category labels on the slide: Ad tech, Government, INNOVATION, REDUCE RISK]
Trade surveillance for high velocity trade volume across exchanges to identify and prevent abusive trade behavior
Cybersecurity app to capture investigative workflows, real-time incident response, and guided data exploration
Developed a new SaaS self-service analytics platform to give their customers better marketing attribution
Gives global brand managers digital campaign intelligence across 100+ brands
Improve patient outcomes on 10+ million members by predicting and controlling re-admission risk.
Turn IoT data from enterprise data servers into meaningful lifecycle analytics data service
50. “For three years, we've been evaluating the market for a BI product... Arcadia Enterprise is the first product we found that provides truly on-cluster Hadoop BI …Its execution model and user self-service approach deliver performance at Hadoop scale, and lets us develop our analytics quickly.”
Terry McFadden, Associate Director, Global Business Services, Procter & Gamble
51. Scaling BI and Visual Analytics enables value from Big Data
AGILITY
Explore quickly & directly - don’t start with data marts, cubes, or extracts
APPLICATIONS
Actionable applications, no coding required
ARCHITECTURE
A powerful, simplified architecture
53. More Resources:
• Forrester Research: How to Scale Business Intelligence with Hadoop-Based Platforms
• https://www.arcadiadata.com/lp/forrester-research-scale-hadoop-BI
• Forrester Wave: Native Hadoop BI, Q3 2016
• https://www.arcadiadata.com/lp/forrester-wave-hadoop-bi-research-report/
Arcadia Data will send out links to these reports after the webinar
55. Four Approaches for Big Data Analytics
[Chart axes: Scale, Agility]
[Diagram labels: Data-Native Visual Analytics; Data-Native Application; Fast SQL + BI Tools (ODBC/JDBC, Hive, Spark, Impala, Drill …); Cubes; Edge Node; BI Server; Separate BI Server; Move Data to BI Server]
Callouts:
Static cubes only. No granular data access.
Won’t scale. Summaries only.
Simple SQL. 1-5 users.
Real-time & dynamic. 100s to 1000s of users.
No access to real-time, streaming, unstructured data
56. Big Data Analytics: Alternatives
Capability | Separate BI Server | Hadoop SQL Engines + BI Tool | Big Data “Cubes” | Data-Native Visual Analytics
Dashboards and reporting | ✓ | ✓ | ✓ | ✓
Real-time visualizations | ✘ | ✘ | ✘ | ✓
Data Applications | ✘ | ✘ | ✘ | ✓
High user concurrency | ✓ | ✘ | ✓ | ✓
Ad-hoc drill to detail | ✘ | -- | ✘ | ✓
In-Hadoop advanced analytics (e.g., customer engagement flows, micro-segmentation) | ✘ | ✘ | ✘ | ✓
Multi-structured data access (e.g. NoSQL, S3, files, search) | -- | ✓ | ✘ | ✓
Unified Security | ✘ | ✘ | ✘ | ✓
Unified Administration | ✘ | ✘ | ✘ | ✓
Lower TCO | ✘ | ✘ | ✘ | ✓
57. Data-Native Visual Analytics Architecture
NO DATA MOVEMENT
No data extracts
Analytics at the highest granularity
SELF-SERVICE BI ON BIG DATA
Web-based UI
Collaboration on “live” data apps
NATIVE MANAGEMENT & SECURITY
Single system to manage and secure
Integrated with Apache Sentry, Apache Ranger
[Diagram labels: Real-Time Streams and Processing; Batch and Interactive; Cloud; On-Prem]