How Experian increased insights with Hadoop

© 2015 MapR Technologies
Today’s Presenters
Bill Peterson
Director - Product Marketing
@thebillp
Jorge A. Lopez
Director - Product Marketing
@zanilli
Tom Thomas
Sr. Director – IT, Consumer Information Services

Trend 1: Industry Leaders Compete and Win with Data
More Data Beats Better Algorithms
Collecting interaction data from ecommerce, social media, offline, and call centers enables a “customer 360 view” and consumer intimacy
Competitive Advantage is Decided by 0.5%
Consumer financial services: 1% improvement in fraud detection means hundreds of millions of dollars
Advertising and retail: 0.5% improvement in lift means millions of dollars increase in profitability

Trend 2: Big Data is Overwhelming Traditional Systems
[Diagram: Enterprise Data Architecture. Labels: outside sources, enterprise users, operational systems, analytical systems.]
Operational systems, production requirements:
• Mission-critical reliability
• Transaction guarantees
• Deep security
• Real-time performance
• Backup and recovery
Analytical systems, production requirements:
• Interactive SQL
• Rich analytics
• Workload management
• Data governance
• Backup and recovery

Reality 1: Hadoop Relieves the Pressure from Enterprise Systems
[Diagram labels: operational systems, analytical systems, enterprise users.]
• Data staging
• Archive
• Data transformation
• Data exploration
• Streaming, interactions
Keys for Production Success
1. Reliability and DR
2. Interoperability
3. High performance
4. Supports operations and analytics

Reality 2: Architecture Matters for Success
Foundation:
• Data protection & security
• High performance
• Multi-tenancy
• Real-time operational & analytical apps
• Open standards for integration
Outcomes: new applications, SLAs, trusted information, lower TCO

The Power of the Open Source Community
APACHE HADOOP AND OSS ECOSYSTEM
Data Integration & Access: Sqoop, Flume, HttpFS, Hue
Execution engines:
• Batch: Spark, Cascading, Pig, MapReduce v1 & v2, Tez
• Streaming: Spark Streaming, Storm
• NoSQL & Search: HBase, Solr
• ML, Graph: Mahout, MLlib, GraphX
• SQL: Hive, Impala, Spark SQL, Drill
• YARN
Data governance and operations:
• Workflow & Data Governance: Oozie
• Provisioning & Coordination: Juju, Sahara, ZooKeeper
• Security: Sentry
Data Platform: MapR-FS, MapR-DB
Management

The MapR Distribution including Apache Hadoop
[Diagram: the Apache Hadoop and OSS ecosystem stack from the previous slide (data integration & access, execution engines, data governance and operations, YARN), running on the MapR Data Platform (MapR-FS, MapR-DB) with Management.]
Data Hub, Enterprise Grade, Operational

MapR: Best Solution for Customer Success
Premier Investors
High Growth
• 2X Growth in Direct Customers
• 2X Growth in Annual Subscriptions (ACV)
• 140% Dollar-based Net Expansion
• 90% Subscription Licenses Software Margins
• 700+ Customers
Best Product
Apache Open Source

MapR and Syncsort Reference Architecture
Sources:
• Relational, SaaS, mainframe
• Documents, emails
• Log files, clickstreams
• Blogs, tweets, link data
MapR Distribution for Hadoop:
• Batch (MR, Spark, Hive, Pig, …)
• Interactive (Impala, Drill, …)
• Streaming (Spark Streaming, Storm, …)
• MapR Data Platform: MapR-DB, MapR-FS
Also shown: data marts, data warehouse, business intelligence / visualization

Achieving Operational Efficiencies with Hadoop
61% of practitioners have shifted one or more workloads from legacy data warehouses or mainframes to Hadoop!
The most popular workloads being shifted are large-scale data transformations

The Hadoop Adoption Challenge
> hadoop fs -put

A Complete Solution to Harness the Power of Hadoop

Break Free from Hadoop Complexity
Design Once, Deploy Anywhere!
• Visually design data transformations once, and run anywhere
• No changes or tuning required
• Combine new and legacy sources for bigger insights
• Intelligent Execution Layer dynamically optimizes the job for each platform: Hadoop, Windows, Unix, Linux or Cloud
• Future-proof your applications!

One-step Access to All Your Data
Build Your Enterprise Data Hub
Hadoop + DMX-h, connected to: Avro, Parquet, Cassandra, MongoDB, Mainframe, Vertica, Oracle, Teradata, Netezza, JSON, HBase, Files, Cloud
• Collect virtually any data from mainframe to Big Data and NoSQL sources
• Load data directly into Avro & Parquet. No staging required
• Access & translate mainframe data using Sqoop and Spark
• Let DMX-h dynamically split the data and load it to HDFS in parallel
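
The last bullet describes a split-then-load-in-parallel pattern. The Python below is a minimal sketch of that general idea only, not of how DMX-h implements it. The names `split_records`, `parallel_load`, and the `load_chunk` callback are hypothetical and exist only for this illustration. In a real deployment the callback would write each chunk to its own file on the cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def split_records(records: Sequence[str], parts: int) -> List[List[str]]:
    """Split records into `parts` contiguous chunks of near-equal size."""
    if parts < 1:
        raise ValueError("parts must be >= 1")
    base, extra = divmod(len(records), parts)
    chunks: List[List[str]] = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks take one additional record each.
        size = base + (1 if i < extra else 0)
        chunks.append(list(records[start:start + size]))
        start += size
    return chunks


def parallel_load(
    records: Sequence[str],
    parts: int,
    load_chunk: Callable[[int, List[str]], T],
) -> List[T]:
    """Load every chunk concurrently; results come back in chunk order."""
    chunks = split_records(records, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # Executor.map preserves the order of its inputs in its results.
        return list(pool.map(load_chunk, range(parts), chunks))


if __name__ == "__main__":
    # Hypothetical loader: report which part received how many records.
    rows = [f"row-{n}" for n in range(10)]
    print(parallel_load(rows, 3, lambda i, chunk: (i, len(chunk))))
```

Threads fit this sketch because loading is I/O-bound, and keeping the chunks contiguous preserves record order within each part.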

Make Data Available to Business Analysts
Achieve the Fastest Path from Raw Data to Insight
• Create Tableau & QlikView files with one click
• Achieve the fastest data loads without tuning hassles:
  • Fastest parallel loads to Greenplum, Netezza, Teradata & Vertica
  • High-performance connectivity to Big Data & NoSQL databases such as Cassandra, HBase & MongoDB

Accelerate EDW Offload Initiatives with SILQ
Up to 20x shorter development time!
• Web-based utility
• Takes SQL as an input
• Provides visual analysis of SQL ELT jobs
• Generates metadata and data migration with DMX jobs
• Supports ANSI-SQL 2011, BTEQ, Netezza, Oracle PL/SQL
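
To make "takes SQL as an input" and "analysis of SQL ELT jobs" concrete, here is a toy Python sketch that pulls the target table and the source tables out of a single `INSERT ... SELECT` statement. It is not SILQ and not a SQL parser. The function name `elt_lineage` and the sample table names are hypothetical. The regular expressions handle only simple statements: subqueries are ignored, and constructs such as `EXTRACT(... FROM ...)` would be misread.

```python
import re
from typing import Dict, List, Optional, Union


def elt_lineage(sql: str) -> Dict[str, Union[Optional[str], List[str]]]:
    """Return the target table and the distinct source tables of one statement."""
    target = re.search(r"\bINSERT\s+INTO\s+([\w.]+)", sql, re.IGNORECASE)
    found = re.findall(r"\b(?:FROM|JOIN)\s+([\w.]+)", sql, re.IGNORECASE)
    sources: List[str] = []
    for name in found:
        # Keep first-seen order and drop repeated table names.
        if name not in sources:
            sources.append(name)
    return {"target": target.group(1) if target else None, "sources": sources}


if __name__ == "__main__":
    statement = (
        "INSERT INTO dw.daily_sales "
        "SELECT s.day, SUM(s.amt) FROM stg.sales s "
        "JOIN stg.stores st ON s.store_id = st.id GROUP BY s.day"
    )
    print(elt_lineage(statement))
```

A production tool would rely on a full SQL grammar for each supported dialect. The sketch only shows the kind of source-to-target information such an analysis extracts.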

MapR + Syncsort Solutions
• Data Warehouse Optimization: Shift ELT Workloads to Hadoop
• Mainframe Offload: Access, Translate & Analyze Mainframe Data with Hadoop
• Click-stream Analysis: Collect, Process & Analyze More Data from Your Website

Experience More!
1. Listen to this webcast on demand: http://bit.ly/1y1z0Ex
2. Download the MapR Sandbox for Hadoop: www.mapr.com/sandbox
3. Sign up for a free DMX-h test drive: www.syncsort.com/mapr
