Learn how HPE uses visual analytics within a data lake to create an “Industrial Internet of Things” model that solves their data analytics problem at scale.
How Hewlett Packard Enterprise Gets Real with IoT Analytics
1. Arcadia Data. Proprietary and Confidential
How Hewlett Packard Enterprise Gets Real with IoT Analytics
June 26, 2018
2. Featured Speakers
Dale Kim
Sr. Director, Products/Solutions
Arcadia Data
Siamak Nazari
Chief Software Architect, HP Fellow
Hewlett Packard Enterprise
3. Before We Begin Our Presentation
– If you have any questions along the way, please type them into the chat window.
– If you have audio problems, please chat with us for help.
– A recording of this presentation will be sent to you in a few days.
– Please live tweet! @arcadiadata #bigdata #analytics
5. Introduction to HPE Storage
As a start…
– What my business unit does
– HPE Storage is a business unit of Hewlett Packard Enterprise
– Produces efficient application-integrated data storage solutions ("Tier 1 storage arrays")
– Lets customers start small and scale without limits
– What I do
– Chief software architect for HPE 3PAR
– Set technical direction for the software
– Current focus on solid state storage systems and software systems for a new class of storage
6. What We Were Trying to Do
– Storage arrays create a massive amount of metadata
– Software, hardware, device, and sensor data
– Ongoing status/health, performance, configuration changes, diagnostic events
– Data analysis could benefit multiple functional teams
– Product management – improve the product
– Sales and marketing – identify new opportunities in the market
– Customer satisfaction – troubleshooting
– Needed to analyze data in a consumable way, with granular drill-down
– Data was already being collected as millions of text files
– Important to compare real-time data against historical trends
– Needed quick access across different teams to an accurate and complete picture
7. The Primary Challenges
We needed to address:
– Scale
– Tens of thousands of storage arrays at customer data centers
– Hundreds of millions of data points every 24 hours
– Volumes continued to grow
– Data format
– Text files needed to be converted into an analyzable format
– Speed and functional requirements for analytics
– Previous pilots on RDBMSs fell short
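The deck doesn't show how the text-to-analyzable conversion was done, but the "data format" challenge above can be sketched roughly as follows. The line format, field names, and regex here are hypothetical illustrations, not HPE's actual diagnostic format:

```python
import re
from datetime import datetime

# Hypothetical diagnostic line format, for illustration only:
#   2018-06-26T10:15:02 array=A1234 component=drive-17 event=MEDIA_ERROR sev=3
LINE_RE = re.compile(
    r"(?P<ts>\S+)\s+array=(?P<array>\S+)\s+component=(?P<component>\S+)"
    r"\s+event=(?P<event>\S+)\s+sev=(?P<sev>\d+)"
)

def parse_line(line):
    """Convert one raw text line into a structured record (dict), or None."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None  # unparseable lines could be routed to a quarantine area
    rec = m.groupdict()
    rec["ts"] = datetime.fromisoformat(rec["ts"])  # typed timestamp
    rec["sev"] = int(rec["sev"])                   # typed severity
    return rec

record = parse_line(
    "2018-06-26T10:15:02 array=A1234 component=drive-17 event=MEDIA_ERROR sev=3"
)
```

At scale, a parser like this would run as a distributed job over the millions of collected files, emitting typed records into the lake rather than being run line by line on a single machine.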
8. Our Big Data Solution
– Adopt a true big data platform
– We turned to Apache Hadoop for the data storage
– Hadoop was not previously used
– Became the platform of record for big data
– Added a "BI on Hadoop" analytics platform
– Arcadia Data was architected for big data
– Runs directly on the Hadoop cluster
– Provided the speed, scale, global view, and access to granular data
– Use of Arcadia Data simplified many aspects of getting started with Hadoop
9. Our Results
We achieved:
– Successful deployment despite inexperience with Hadoop
– Moved to production within 6 months of starting from scratch
– Loading 60 GB/hour
– End-user experience requirements were addressed
– Single global view of utilization, duty cycles, upgrades, feature uptake, utilization trends, failure notifications, failure trends, etc.
– Arcadia Data query acceleration provided fast query results
– Data security was enabled to limit access to authorized users
– Current and historical data analysis
– Unstructured data analytics possible for the first time
10. Our Results (continued)
We achieved:
– Business opportunities are being realized
– Identify potential sales opportunities
– Examine and fix underutilized and poorly provisioned systems
– Study equipment and component reliability
– Offer suggestions to customers on product usage and future plans
– Processing time was drastically reduced
– Event analysis scripts formerly took hours or days
– Previously had to be customized for every analytical problem
– With the new solution, complicated queries return in seconds
11. Install Base Overview
Overall snapshot of the entire install base, as well as drill-down into the smallest component details
12. Trends and Usage
– Feature licenses over time brought insight to the product team about uptake, and to marketing for pricing
– Installations and software versions trend analysis
13. Failure Analysis and Anomaly Detection
Drive failures by type, model, and firmware, analyzed over time, helped quality control and customer service
14. Sales Opportunities and Threats Based on Free Space
– Systems with low free capacity reflect an opportunity to sell more capacity
– Systems with high free capacity reflect disuse and potential competitor threats
– Drill-down of various component capacities over time enables sales reps to have a more detailed conversation
15. Unstructured Data Analysis
– Filters for date-hour and system ID
– Filters to include and exclude event string patterns
– Subsequent event aggregation by system
– Exploratory query: 1m 0s
– Full data query: 8m 38s
– All events for 1 wk: 31 B
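The include/exclude pattern filters and per-system aggregation shown on this slide can be sketched in a few lines. The record fields and sample messages below are invented for illustration and don't reflect the actual product's data model:

```python
import re
from collections import Counter

# Hypothetical event records; real data would be queried from the data lake.
events = [
    {"system_id": "sys-01", "hour": "2018-06-25-14", "msg": "cache flush timeout"},
    {"system_id": "sys-01", "hour": "2018-06-25-15", "msg": "drive media error"},
    {"system_id": "sys-02", "hour": "2018-06-25-14", "msg": "drive media error"},
    {"system_id": "sys-02", "hour": "2018-06-25-14", "msg": "routine health check"},
]

def filter_events(events, include=None, exclude=None, hour=None, system_id=None):
    """Apply date-hour/system filters plus include/exclude string patterns."""
    out = []
    for e in events:
        if hour and e["hour"] != hour:
            continue
        if system_id and e["system_id"] != system_id:
            continue
        if include and not re.search(include, e["msg"]):
            continue
        if exclude and re.search(exclude, e["msg"]):
            continue
        out.append(e)
    return out

# Aggregate the matching events by system, as the dashboard does.
matches = filter_events(events, include=r"media error", exclude=r"health")
by_system = Counter(e["system_id"] for e in matches)
```

In the real deployment this filtering runs as a distributed query over billions of events, which is why the exploratory and full-data query times above matter.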
16. Combined Structured and Unstructured Analysis
1. Search for an event pattern and find the release versions that most hit that pattern
2. Clicking on the release version reveals the systems that most hit the pattern
3. Clicking on a system retrieves its most recent raw event log
4. The raw event log shows the surrounding context around the events
17. Recommendations
– If you have IoT data to analyze, think in terms of big data
– IoT analytics is a big data problem
– Start with big data technologies, especially if scale is an issue
– You might be successful with standard technologies on big data, but more than likely you'll spend more time on them than necessary
– Give multiple teams access to the data lake
– You can start small, but think long term as well
20. "Data" and "Platforms" Have Changed – Why Haven't BI Tools?
– Data – from: rows and columns; batch; smaller data volumes; limited # of sources; mainly internal
– Data – to: rows and columns and multi-structured; batch and interactive and real-time; small and large volumes; many sources; internal and external
– Platforms – from: tables; schema on write; proprietary hardware; ETL; data warehouses
– Platforms – to: tables and docs, search indexes, events; schema on write and schema on read; commodity hardware; ETL and ELT and ELDT; data warehouses and data lakes
– BI tools – from: SQL queries; extracts; cubes; BI servers; small/medium scale
– BI tools – to: why haven't BI tools evolved?
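One of the shifts listed above, schema on write versus schema on read, can be illustrated with a minimal sketch: raw records land untyped in the lake, and each query projects the fields it needs at read time, tolerating records written before a field existed. The field names here are hypothetical:

```python
import json

# Schema-on-write (RDBMS style) enforces the shape when data lands.
# Schema-on-read lets raw records land as-is; the schema is applied at
# query time. Field names below are invented for illustration.
raw_lines = [
    '{"array": "A1", "free_gb": 120, "fw": "3.3.1"}',
    '{"array": "A2", "free_gb": 950}',               # older record, no "fw"
]

def project(lines, fields, defaults=None):
    """Apply a schema at read time, tolerating missing fields."""
    defaults = defaults or {}
    for line in lines:
        rec = json.loads(line)
        yield {f: rec.get(f, defaults.get(f)) for f in fields}

rows = list(project(raw_lines, ["array", "free_gb", "fw"], {"fw": "unknown"}))
```

The design trade-off: schema on read defers modeling work and accepts evolving formats, at the cost of doing interpretation on every query.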
21. BI Built for Data Warehouses Fails Us in Data Lakes, Because…
– Agile only in name: the pathway to production is slow, requiring multiple steps, data duplication, and pre-summarization; time-to-insight is delayed. (Extract to an EDW? Summarize on a BI server? Replicate security? Acquire new hardware?)
– Inefficient scale: scaling to large data comes at the cost of reduced concurrent access for users (good at small data volumes and user counts, bad at large ones).
– Cannot handle data variety: big data is structured + real-time and streaming + complex + unstructured, spanning small and big, batch and streaming, internal and external, structured and multi-structured.
22. BI for Data Lakes Must Be Architected for Scale and Performance
– Data warehouse BI architecture (BI front-end → BI server → data warehouse):
– BI server can't scale out
– Significant data movement, modeling, and security management
– "Big data" BI architecture (BI front-end → edge node BI server → JDBC → DataNodes in the data lake cluster):
– Edge node BI server only scales via long planning
– Performance optimizations require heavy IT intervention
– Only passes SQL, with no semantic information (e.g., filters)
– Native BI within the data lake architecture (BI front-end → DataNodes + Arcadia in the data lake cluster):
– Scales linearly with DataNodes while retaining agility
– The semantic model is "pushed down" and distributed
– Highly optimized physical model "based on usage"
– No data movement; single security model
Native BI = "lossless," high-definition analytics
23. Data Warehouse BI Architecture
– Analytic process on a BI server over a data warehouse (RDBMS): load data → secure data → semantic layer → optimize physical
– Big data requirements: native connection, semi-structured, parallel, real-time
24. Data Lake BI Architecture
– Analytic process on a BI server, now over a data lake (HDFS, cloud object storage) as well as a data warehouse (RDBMS): load data → secure data → semantic layer → optimize physical
– Big data requirements: native connection, semi-structured, parallel, real-time
– You need a BI platform that runs natively within data lakes
25. Smart Acceleration Leverages What Is Learned during Data Discovery
– Query acceleration for scale, performance, and concurrency
– Ad hoc queries run against all granular data on the Hadoop cluster
– Arcadia Enterprise makes recommendations – build analytical views with a click
– Analytical views serve accelerated application queries
– Fast query responses
– Minimal modeling
– Live acceleration (no downtime)
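An analytical view is essentially a precomputed aggregate that answers repeated dashboard queries without rescanning the granular data. This is a minimal sketch of the idea, not Arcadia's actual implementation; the fact rows and column names are made up:

```python
from collections import defaultdict

# Hypothetical granular fact data: one row per array per day.
facts = [
    {"array": "A1", "region": "EMEA", "day": "2018-06-24", "io_ops": 10_000},
    {"array": "A1", "region": "EMEA", "day": "2018-06-25", "io_ops": 12_000},
    {"array": "A2", "region": "AMER", "day": "2018-06-25", "io_ops": 7_000},
]

def build_analytical_view(facts, group_by, measure):
    """Precompute an aggregate keyed by the grouping columns."""
    view = defaultdict(int)
    for row in facts:
        key = tuple(row[c] for c in group_by)
        view[key] += row[measure]
    return dict(view)

# Built once (e.g., from a usage-based recommendation), reused by many queries.
view = build_analytical_view(facts, group_by=["region"], measure="io_ops")

def query_region_total(region):
    # Answered from the small precomputed view instead of scanning all facts.
    return view.get((region,), 0)
```

The concurrency benefit follows from the size difference: many users hitting a small precomputed view is far cheaper than many users each scanning the full fact data.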
26. Sample Query Acceleration Comparisons
– Accelerated query times shown next to unaccelerated
– No numbers for unaccelerated queries – they did not return in a reasonable time frame
27. Visual Analytics and BI Native to Data Lakes
BI Native to Data Lakes Simplifies the Analytic Process
– Traditional BI server (time to insight/value in weeks or months; BI deployment delayed by weeks): land/secure data → transform to 3NF or star schema → build semantic layer → performance modeling (cubes/aggregates, in both places) → extract and secure → data movement → analytical discovery → production
– BI native to the data warehouse or data lake (time to insight/value in days): land/secure data → self-service discovery → AI-driven performance modeling → production ready; one security model, no movement of data
28. Native BI Unleashes the Power and Flexibility of Your Data Lake
– Scale without compromise
– Enable real-time, streaming analytics
– Unlock complex data not easily reachable before
– Act directly from your data discovery
– Optimize and productionize based on usage and need
29. Arcadia Is Native BI Built from the Ground Up for Data Lakes
– Advanced visualizations and semantic layer
– In-cluster for high performance and high concurrency
– Distributed BI on every data node of the data lake (Hadoop cluster)
– No data movement
– Unified security
– Single semantic layer
– Blend with additional data sources, including S3, Azure Data Lake Store, streaming (via KSQL), and other external data sources
30. Thank You
– Social media: @arcadiadata | arcadiadata.com
– Find more IoT information in our Resource Center: arcadiadata.com/resources
– Try Arcadia Instant – free download: arcadiadata.com/Instant
– Read our blog for more about big data: arcadiadata.com/blog
– Read more about how we help with the Internet of Things: https://www.arcadiadata.com/solutions/iot
– Gartner named Arcadia Data a 2017 Cool Vendor for IoT Analytics, April 2017
Editor's Notes
Before blockchain, there was the Internet of Things (IoT). But beyond the hype, how does IoT apply to real-world use cases?
Hewlett Packard Enterprise (HPE) is a Fortune 500 enterprise information technology company that makes tier 1 storage arrays used by data centers worldwide. To ensure quality service, HPE needed a data analytics platform that could:
(a) Monitor millions of incoming diagnostic data points each day
(b) Visualize this data at scale for internal business users and offices
Join us Jun 26th at 11 am PT to learn how HPE uses visual analytics within a data lake to create an “Industrial Internet of Things” model that solves their data analytics problem at scale. We will discuss:
How to scale IoT Analytics for 24/7 operations
The metrics and insights that benefit customers, support, and product line managers
Before and after: key metrics and results of scaling analytics securely across diverse users and teams to achieve business goals
How visualizing data at scale across diverse internal teams can help achieve business goals
How data lakes can be used to manage data at scale
Siamak Nazari is the chief software architect for HP 3PAR. In this role he is responsible for setting technical direction for HP 3PAR and its portfolio of software enhancements. His current area of focus is solid state storage systems and the software systems for a new class of storage systems. Nazari has over 25 years of experience working on distributed and highly available systems. He has been working on HP 3PAR technology since 2000, responsible for designing and implementing distributed memory management and the high availability features of the system.
Dale Kim is the senior director of products/solutions at Arcadia Data. His background includes a variety of technical and management roles at information technology companies. While Dale’s experience includes work with relational databases, much of his career pertains to nonrelational data in the areas of search, content management, NoSQL, and Hadoop/Spark, and includes senior roles in technical marketing, sales engineering, and support engineering. Dale holds an MBA from Santa Clara University, and a BA in computer science from Berkeley.
The range of data has expanded over the years to include complex structure, real-time, greater volumes, and many more sources.
As a result, platforms have evolved to handle the new types and sources of data. Platforms like Apache Hadoop have emerged as primary technologies for data lakes.
However, BI tools have not evolved to address the changing landscape. Most businesses still look to shoehorn BI tools into a modern data platform.
Context of discussion with Boris at Forrester:
Lossless (both ways) – metadata (we push down all the way to Hadoop; Tableau only passes through SQL), multidimensional, and security from Sentry up (it's not visible outside the cluster, and otherwise requires replication and online sync) – plus cost.
PP – What's the point of a cubing engine sitting behind a BI server? It performs calculations for the BI tool by leveraging information, fields, aggregates, common hierarchies, and filters – collectively called the semantic model. A cube merely takes those definitions and builds an optimized structure within that view/model of the world.
If the tool is not native, the optimization happens outside the cluster – unlike us, where we take this semantic model (either manually or learned on the fly with Smart Acceleration) and process it on the data nodes of the cluster.
The reason we can do this is that those engines have been built to be NON-PARALLEL … you cannot just run MSTR on 20 nodes and make it work.
We are built from the ground up as an MPP system … this is why we can do this and scale infinitely better.
This is why we are lossless. They have to resort to the lowest common denominator of SQL … they only pass SQL, which does not convey semantic information on filters, aggregations, etc. Those optimization opportunities are LOST.
How does this compare to BO issuing a query through a universe? PP – we short-circuit the 50 queries with AVs; SQL tools like Impala do not have aggregate join indexes and other optimizations.
The final point I want to call out is how we enable production deployments of analytical applications across thousands of concurrent users.
User concurrency is a big problem when using traditional tools on a data lake, which is why our analytical views provide a significant speed boost.
Our software makes recommendations on what to build, and with the click of a button, you can speed up your users’ queries with our patented data structures.
Arcadia Data lets you start with semantic modeling, perform discovery, and then quickly adjust your models in a tight feedback loop.
This eliminates a lot of time-consuming work that’s inserted between semantic modeling and discovery.
The optimize step requires no modeling by humans, as the AI-driven Smart Acceleration quickly identifies query optimizations.