HP Vertica and MapR Webinar: Building a Business Case for SQL-on-Hadoop
1. © Copyright 2014 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Derive fast value from all your Big Data
HP Vertica and MapR Solution for Optimized SQL-on-Hadoop
Chris Selland, Steve Wooledge, Walt Maguire / June 11, 2014
@cselland, @swooledge, @waltermaguire
#HPDiscover
2. The Time is Now for Big Data
[Figure: Accuracy and Insight plotted against Data Volumes, spanning Traditional Enterprise Data and Big Data. Labels: CRM, ERP, Data Warehouse, Web, Social, Log Files, Machine Data, Semi-structured, Unstructured, Dark Data]
3. Manage the data explosion
New Style: Affordable
Old Style: Unaffordable
[Chart: Cost ($, $$, $$$) against Scale (TB, PB, EB), no limits]
• Capture any form of data
• At any scale from TB to EB
• Maintaining high performance
• At affordable cost
4. HP Vertica: No limits, no compromises.
Purpose built for Big Data from the first line of code
• Real-time analytics: Gain insight into your data 50x-1,000x faster than legacy products
• Massive scalability: Infinitely scale your solution by adding an unlimited number of low-cost nodes
• Open architecture: Built-in support for Hadoop, R, and a range of ETL and BI tools
• Optimized data storage: Store 10x-30x more data per server than row databases with patented columnar compression
Private Cloud | Public Cloud | Appliance | Software Only
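The slide does not say how the columnar compression works, and the patented encoding is not described in this deck. As a rough illustration only of why storing values column by column tends to compress well, here is a minimal sketch using plain run-length encoding, a generic technique chosen for the example. The table contents and function names are hypothetical.

```python
from itertools import groupby

def rle_encode(column):
    """Run-length encode a list of values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out

# Hypothetical table stored row by row: (country, status)
rows = [("US", "ok")] * 4 + [("US", "error")] * 2 + [("DE", "ok")] * 3

# Column-wise layout: each column is stored on its own, so repeated
# values sit next to each other and collapse into short runs.
country_col = [r[0] for r in rows]
status_col = [r[1] for r in rows]

encoded_country = rle_encode(country_col)
encoded_status = rle_encode(status_col)

print(encoded_country)
print(encoded_status)
```

Nine stored values per column shrink to two and three runs respectively in this toy case. Real ratios depend entirely on the data and on the encoding a product actually uses.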
5. The Richest, Most Open SQL on Hadoop
Challenge: Extracting data from Hadoop requires complex and brittle ETL processes
Solution: Hadoop Navigation and Analytics
Benefits:
• Navigate Hadoop data using its native catalog
• Quickly and easily load native data types from Hadoop to Vertica
• Avoid recreating schemas to explore external tables
• Use the full power of Vertica SQL and Analytics
• Choose your own Hadoop distribution
6. HP HAVEn: no limits to future success
End-to-end Big Data platform that powers data-driven decision making in modern enterprises
[Diagram inputs: Images, Search engine, IT/OT, Documents, Transactional data, Mobile, Texts, Email, Audio, Social media, Video]
HAVEn
• Hadoop: Catalog massive volumes of distributed data
• Autonomy IDOL: Process and index all information
• Vertica: Analyze at extreme scale in real-time
• Enterprise Security: Collect & unify machine data
• nApps: Powering HP Software + your apps
7. Ecosystem
8. MapR: WORLDWIDE HADOOP TECHNOLOGY LEADER
UNIQUELY ADDRESSES BOTH ANALYTIC AND OPERATIONAL USE CASES
500+ PAYING CUSTOMERS
9. MapR: Best Product, Best Business & Best Customers
Top Ranked
Exponential Growth
• 3X bookings Q1 '13 – Q1 '14
• 80% of accounts expand 3X
• 90% software licenses
• <1% lifetime churn
500+ Customers, Cloud Leaders
• >$1B in incremental revenue generated by 1 customer
10. MapR Distribution for Hadoop
[Diagram: Management; Apache Hadoop and OSS ecosystem (execution engines; data governance and operations); MapR Data Platform; HP Vertica]
Categories: Batch; Streaming; NoSQL & Search; ML, Graph; SQL; Data Integration & Access; Security; Workflow & Data Governance; Provisioning & Coordination
Components: YARN, Pig, Cascading, Spark, Spark Streaming, Storm*, HBase, Solr, Juju, Savannah*, Mahout, MLLib, GraphX, MapReduce v1 & v2, Tez*, Accumulo*, Hive, Impala, Shark, Drill*, Sentry*, Oozie, ZooKeeper, Sqoop, Knox*, Whirr, Falcon*, Flume, HttpFS, Hue
* Certification/support planned for 2014
Enterprise-grade
• High availability
• Data protection
• Disaster recovery
Interoperability
• Standard file access
• Standard database access
• Pluggable services
• Broad developer support
Security
• Enterprise security authorization
• Wire-level authentication
• Data governance
Operational
• Support for predictive analytics, real-time database operations, and high-arrival-rate data
Multi-tenancy
• Ability to logically divide a cluster to support different use cases, job types, user groups, and administrators
Performance
• 2X to 7X higher performance
• Consistent, low latency
11. Benefits of HP Vertica on MapR
[Diagram: three Vertica nodes, each using NFS to reach its Vertica files on the MapR Data Platform]
• Disaster recovery
• Improved disk usage
• Snapshots/Backup
• Reduced Complexity
• Lower operational cost
• Faster local file access
• Easy capacity expansion
• Dynamic storage utilization
Moving data costs money...
HP Vertica on MapR moves processing to data and utilizes the same hardware for both.
12. HP Vertica + MapR: Best-of-Breed SQL on Hadoop
• 100% ANSI SQL compliance, fast performance, and advanced analytics
• Fastest, Most Open SQL-on-Hadoop
• Most Complete Analytics
• Lowest Total Cost of Ownership
• Enterprise-Grade Reliability
13. Demo
14. Common Use Cases: Taking Advantage of Hadoop
ENTERPRISE DATA HUB
• Multi-structured data staging & archive
• ETL / DW optimization
• Mainframe offload
• Data exploration
MARKETING OPTIMIZATION
• Recommendation engines & targeting
• Customer 360
• Click stream analysis
• Social media analysis
• Ad optimization
RISK & SECURITY OPTIMIZATION
• Network security monitoring
• Security information & event management
• Fraudulent behavioral analysis
OPERATIONS INTELLIGENCE
• Supply chain & logistics
• System log analysis
• Manufacturing quality assurance
• Preventative maintenance
• Sensor analysis
15. 20M SONGS
16. Largest Biometric Database in the World
1.2B PEOPLE
17. © 2014 MapR Technologies
HP: Clickstream Analysis
HP optimizes customer experience on corporate website
OBJECTIVES
• Increase conversion on website through real-time, relevant responses
• Improve customer retention through interactive, personalized experiences
CHALLENGES
• Needed to store and analyze 5 years of clickstream generated on hp.com
• Required faster response times; queries took days with Oracle
• Complex analytics were impossible because of diverse data formats
SOLUTION
• MapR manages 5 PB of data on dual 46-node clusters with 20 TB/node
• Clickstream data collected in Hadoop, analyzed in HP Vertica, direct query for business metrics
• HP chose MapR for performance, high availability, disaster recovery, manageability, knowledge base and future road map
Business Impact
• 10% increase in conversion of shoppers to buyers
• 40% increase in efficiency for analysts
• Analyst queries that used to take 24 hours to process now take 15 seconds
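To put the query-time figure above in proportion, the short calculation below converts "24 hours" and "15 seconds" to a speedup factor. It uses only the two numbers stated on the slide; the variable names are just for illustration.

```python
# Figures quoted on the slide: queries went from 24 hours to 15 seconds.
before_seconds = 24 * 60 * 60   # 24 hours expressed in seconds
after_seconds = 15

speedup = before_seconds / after_seconds
print(f"{before_seconds} s -> {after_seconds} s is a {speedup:.0f}x speedup")
```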
19. How Does It Work?
Hybrid MapR + HP Vertica Solution for Clickstream Analytics
[Diagram zones: USER ENGAGEMENT; SOURCE DATA; DATA SERVICES ZONE; ANALYTICS ZONE; CLIENT UI]
[Diagram components: WEB PAGE TAG; COLLECTOR; OTHER DATA; HADOOP DROPZONE; MAPREDUCE INGESTION; Data Staging/Unstructured Data Lake; HIVE QL; HBASE; STARGATE REST; DATA ACCESS LAYER; PAGE METRICS; HP VERTICA; SQL QUERY; ODBC / JDBC; SECONDSIGHT (JAVASCRIPT, PHP); DASHBOARDS; QLIKVIEW]
5 YEARS OF RAW DETAIL CLICKSTREAM DATA
• 72 BILLION ROWS @30M ROWS/DAY
• ~145 TERABYTES (UNCOMPRESSED)
4 MONTHS SECONDSIGHT AGGREGATE DATA
• PAGE STATISTICS TABLE: 200 MILLION ROWS
• PAGE PATHING TABLES: 520 MILLION ROWS
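The slide names a page statistics table and page pathing tables but does not define their schemas or how they are built. The sketch below is one plausible reading, assuming page statistics means view counts per page and page pathing means page-to-page transitions within a session. The event layout, field names, and function names are all hypothetical, and the in-memory Python stands in for whatever MapReduce or SQL jobs the real pipeline uses.

```python
from collections import Counter, defaultdict

# Hypothetical raw click events: (session_id, timestamp_seconds, page)
clicks = [
    ("s1", 10, "/home"),
    ("s1", 25, "/products"),
    ("s1", 40, "/cart"),
    ("s2", 5, "/home"),
    ("s2", 30, "/products"),
    ("s3", 12, "/home"),
]

def page_statistics(events):
    """Count views per page across all sessions."""
    return Counter(page for _, _, page in events)

def page_pathing(events):
    """Count page-to-page transitions within each session, in time order."""
    by_session = defaultdict(list)
    for session_id, ts, page in events:
        by_session[session_id].append((ts, page))

    transitions = Counter()
    for visits in by_session.values():
        visits.sort()  # order each session's clicks by timestamp
        pages = [page for _, page in visits]
        transitions.update(zip(pages, pages[1:]))  # consecutive page pairs
    return transitions

stats = page_statistics(clicks)
paths = page_pathing(clicks)

print(stats.most_common())
print(paths.most_common())
```

Aggregates like these are far smaller than the raw events, which fits the slide's contrast between billions of raw rows and hundreds of millions of aggregate rows, though the actual aggregation logic is not given in the deck.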
23. Getting Started
Mapr.com/appgallery – HP Vertica on MapR Sandbox for Hadoop
Mapr.com/sandbox – Hadoop sandbox with tutorials
24. HP Vertica Market Place
25. Thank you
HP Vertica and MapR Solution for Optimized SQL-on-Hadoop
Chris Selland, Steve Wooledge, Walt Maguire / June 11, 2014
@cselland, @swooledge, @waltermaguire
#HPDiscover