Big Value with Big Customer Data
1. 1© Cloudera, Inc. All rights reserved.
Amy O’Connor
Big Data Evangelist / Business Value Enablement
Big Value with Big Customer Data
2. 2© Cloudera, Inc. All rights reserved.
Real World Impact
The great value of data
Top Cancer Research
Institutions
Working to Cure
Cancer
Rocket Science
Thorn
Destroying Human
Trafficking Networks
Manning the Orion spacecraft
as it orbits the earth
3. 3© Cloudera, Inc. All rights reserved.
Cloudera Fast Facts: An Innovative Technology Company
2008
Founded by former employees of
2009
First commercial Hadoop product
1200+ Employees
2300+ Partners
~$1B Investment
5. 5© Cloudera, Inc. All rights reserved.
The Original Inspirations for Hadoop
2003 2004
6. 6© Cloudera, Inc. All rights reserved.
2006
Core Hadoop
(HDFS,
MapReduce)
The Beginning: Building Hadoop
7. 7© Cloudera, Inc. All rights reserved.
2006: Core Hadoop (HDFS, MapReduce)
2007: adds Solr, Pig
2008: adds HBase, ZooKeeper
2009: adds Hive, Mahout
2010: adds Sqoop, Avro
2011: adds Flume, Bigtop, Oozie, HCatalog, Hue, YARN
2012: adds Spark, Tez, Impala, Kafka, Drill
2013: adds Parquet, Sentry
2014: adds Knox, Flink
2015: adds Kudu, RecordService, Ibis, Falcon
A Decade of Hadoop
8. 8© Cloudera, Inc. All rights reserved.
Our relationship
with data
is changing.
Hadoop Technology enables
new ways of working.
9. 9© Cloudera, Inc. All rights reserved.
Requirements necessary to drive value from data
1. Economically feasible to store more data
2. Powered to predictably process large data sets
3. Ability to build your data asset at linear scale
4. Collect data in native format – enables agility
5. Build history of activity by collecting data prior to its use
6. You can have near real-time access to data, plus a view of history
7. Security at the data layer increases flexibility and ability to protect privacy
8. Create community data and use machine learning to drive innovation
Extreme performance
and efficiency
Analytic agility
10. 10© Cloudera, Inc. All rights reserved.
The ground has shifted, from “Storage + Compute” to:
Multi-Workload
• Batch Computation
• Interactive SQL
• Machine Learning
• Stream Processing
• Search
• In-memory
Advanced Analytics
• Merging real-time & archived data
• Structured with unstructured
• External and internal sources
Hybrid Cloud
• Data stays where it’s born
• Not all can be in the cloud
• Partnerships with Amazon, Microsoft & Google
Data Security
• Native Encryption
• Access Control
• Data Governance
• Regulatory Compliance
11. 11© Cloudera, Inc. All rights reserved.
Our relationship
with data
is changing.
From balance to blend:
personal & professional lives.
Understanding the customer journey
12. 12© Cloudera, Inc. All rights reserved.
From balance to blend: Personal & Professional Lives
How many here sleep with your smartphone?
• One in three Australians sleeps with their smartphone
• More than half the population checks their smartphone within 15 minutes of waking
• More than 88% of Australians use their smartphones when talking to friends, and 92% use them at work
Source: Deloitte, Mobile Consumer Survey 2015 – The Australian Cut
13. 13© Cloudera, Inc. All rights reserved.
From balance to blend: Personal & Professional Lives
Source: Tasha Keeney, ARK Analyst, September 10, 2014 (Devices / Gateways)
Tablet
Smartphone
Internet
TV
70+ Telco / Internet Providers
286PB Data
14. 14© Cloudera, Inc. All rights reserved.
• New insights into customer
behavior, including abandoned
online shopping carts, from
unified customer data
• Marketing spend
optimization through
channel attribution analysis
• Improve supply chain and
reduce inventory costs
• Improved ability to predict
returns
DRIVE CUSTOMER
INSIGHTS
The Customer
Journey
15. 15© Cloudera, Inc. All rights reserved.
• Leveraging EDH and predictive
modeling, Digital Alchemy helps
clients optimize market, channel
and offer.
• With their solution, customers
can automatically serve offers
tailored by consumer behavior,
preference and transaction
history.
• Digital Alchemy customers
include Virgin Mobile, Spark,
RACQ, ASB and Rabbit
Rewards.
DRIVE CUSTOMER
INSIGHTS
Right Time, Right
Offer, Right Channel
16. 16© Cloudera, Inc. All rights reserved.
Data and the
Sharing Economy
“Everything that we do in
engineering is about creating
great matches between people.”
Machine learning drove up
booking rates by 4% with the
first experiment.
17. 17© Cloudera, Inc. All rights reserved.
Our relationship
with data
is changing.
From separate to converged
digital and physical worlds.
Building better products & services
• Internet of Things (IoT)
• Smart Cities
• Augmented Reality
• Virtual Reality
• Precision Medicine
• Precision Energy
• Automated Logistics
• 3D printing
18. 18© Cloudera, Inc. All rights reserved.
Source- Ovum: Understanding the IoT
Opportunity: An Industry Perspective -
2015
Building better products with data from IoT
19. 19© Cloudera, Inc. All rights reserved.
• Deeper analytics from
customer profile data
• 50% increase in customer
retention and a 2x
increase in policies issued
• Ability to analyze 10s of
millions of quotes in
under a minute
• $5 million claim cost
reduction
through fraud prevention
Connected cars
20. 20© Cloudera, Inc. All rights reserved.
Moving people
Discount people’s
previous experience but
put a heavy premium on
their ability to solve the
problems that your
business has.
Heavy emphasis on the
combination of creative
and analytical skill sets.
21. 21© Cloudera, Inc. All rights reserved.
• Virtuous cycle: Identify
features that facilitate sharing
of content that drive new
customers
• Real-time streaming and batch
data from product logs, web
analytics, channel data and
ERP
• Impala connects to third-party
data wrangling and BI tools for
fast reporting
Sharing entertainment
experiences
22. 22© Cloudera, Inc. All rights reserved.
• Monitor the health of
180,000+ trucks in real time
• OnCommand Connection
collects telematics and
geolocation data across
thousands of trucks
• Identify and correct engine
problems early, and increase
fleet uptime
• Reduced maintenance costs
to $0.03 per mile from
$0.12-$0.15 per mile
Connected logistics
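As a rough check of the maintenance figures on this slide, the drop from $0.12-$0.15 to $0.03 per mile can be expressed as a percentage. The sketch below uses only the per-mile costs quoted above; the helper name `percent_reduction` is hypothetical and introduced here for illustration.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Per-mile maintenance costs in dollars, as quoted on the slide.
new_cost = 0.03
old_cost_low, old_cost_high = 0.12, 0.15

low = percent_reduction(old_cost_low, new_cost)    # against the $0.12 baseline
high = percent_reduction(old_cost_high, new_cost)  # against the $0.15 baseline

print(f"Reduction: {low:.0f}% to {high:.0f}%")
```

On these figures the saving works out to roughly three quarters to four fifths of the earlier per-mile cost.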
23. 23© Cloudera, Inc. All rights reserved.
The IT world
is changing.
Disaggregated, distributed
On-prem, hybrid, cloud
Fast, easy, secure
24. 24© Cloudera, Inc. All rights reserved.
Hadoop deployments in cloud are accelerating:
● Executive mandate: minimize on-prem datacenter
footprint
● Data provenance and data gravity
● Increased agility: end-user self-service
● Elasticity: optimize infrastructure usage
● Perceived lower overall TCO
What’s driving data to Cloud and Hybrid Cloud?
Enterprise customers using cloud for big data analytics
25. 25© Cloudera, Inc. All rights reserved.
Hadoop Expertise
◆ Most committers
◆ World-class innovation
◆ Enterprise-class stack
◆ Granular data security +
governance
◆ Best support, services, training
Flexible Deployments
◆ No vendor lock-in
◆ Multi-cloud and on-prem
◆ Transient and long-lived
clusters
Superior security
Security separation from
infrastructure leads to greater
choice
Flexible Pricing
◆ Pay-as-you-go cloud usage
◆ Traditional node-based
licensing
Why Cloudera in the Cloud: fast, easy, secure
CDH is the most deployed distro in the cloud
26. 26© Cloudera, Inc. All rights reserved.
• Redshift “General Purpose Schema” - less modeled schema for general-purpose usage
• Redshift “Fixed Reporting” – fixed-purpose schema tuned for this specific test workload
Exploratory BI can be
slow on Redshift
Impala 4-10x faster than Redshift General Purpose
Impala 42-90% faster than Redshift Fixed Reporting
More Performant: Impala SQL on both EBS & S3
Multi-user queries
27. 27© Cloudera, Inc. All rights reserved.
• Redshift “General Purpose Schema” - schema for general-purpose usage
• Redshift “Fixed Reporting” – fixed-purpose schema tuned for this specific test workload
Impala >200% cheaper than Redshift General Purpose
Impala 8-28% cheaper than Redshift Fixed Reporting
Exploratory BI can be
expensive on Redshift
More cost effective: Impala SQL on both EBS & S3
ETL + Multi-user queries
28. 28© Cloudera, Inc. All rights reserved.
During the Rio Summer Olympics,
delivered 2.7 million emails across
108 campaigns that were triggered by
audience behavior, with a 29% increase
in average minutes streamed.
• Enables varied & complex data to be
stored for highly variable events
• Provides extreme flexibility:
provisioned extra nodes a week before
the games
• Data was used in 7 Olympic apps,
and to track content on 200
variables to run email campaigns
CUSTOMER 360
Audience Engagement in
the Cloud
29. 29© Cloudera, Inc. All rights reserved.
Crunching 1,000+ Business Metrics
per Customer with Sub-Second
Responses
• Enables granular targeting of
customers
• 50% reduction in marketing cost
execution at one
• Stores & processes 1000s of
critical events at scale & low cost
• Provides flexibility, agility to
support customer needs with
Cloudera on Amazon Web
Services and on premises
CUSTOMER 360
Customer 360° in the
Cloud
30. 30© Cloudera, Inc. All rights reserved.
Preventative Maintenance
• To improve traveler
satisfaction and safety, a
European needed to reduce
downtime for critical
operational machines
• Cloudera Enterprise on Azure
captures and correlates
sensor data with
transactional data to
proactively assess the health
of its machines and deliver
necessary fixes to prevent
failure
Flying Safer
31. 31© Cloudera, Inc. All rights reserved.
• Collect and analyze data from
thousands of diverse
manufacturing systems in
real time
• iTrak application using
Cloudera on Azure to monitor
the performance of
individual manufacturing
systems in real-time
• Predictive Maintenance -
Proactively identifying &
fixing issues before they
break
Industrial IoT
32. 32© Cloudera, Inc. All rights reserved.
• 30 Billion events/data in
Market graph database
built on Cloudera & AWS
• Rapid, interactive access
to 2+ years’ data
• Operational efficiencies
resulting from the
platform’s scalability result
in net annual savings of
$20 million
Financial
Compliance
33. 33© Cloudera, Inc. All rights reserved.
Best Practices for Successful
Hadoop Adoption
By 2017, 60% of big data projects will fail
to go beyond the pilot phase.
Source: Gartner, “Predicts 2015: Big Data Challenges Move From Technology to the
Organization”, November 2014
34. 34© Cloudera, Inc. All rights reserved.
Our Most Successful Customers do these Five Things
1. Build a Big Data Culture
Led by an enabled executive sponsor(s). Communication methodologies. Advocating change.
2. Assemble the right team
Tightly aligned team. Mix of seasoned experts and innovators
3. Become lean and iterative for data engineering, data science, analysis
Successful projects start small, fail often and iterate to success. Roadmaps:
Document the expected direction, yet expect insights to create change
4. Efficiently operationalize insights
Analytics -> Reports, Big Data -> Actions. Create a bridge between Dev and Ops
5. Govern the Data
Right-size and iteratively build toward maturity.
35. 35© Cloudera, Inc. All rights reserved.
Get Data
Explore
and Analyze
Deploy
1. Get data you already have, or
create new data.
2. Explore and analyze, quickly.
3. Deploy your application.
…and repeat. Add:
More data, more users, more use cases,
more complex analytics; go real-time!
Think Big.
Start Small.
Iterate to Success
36. 36© Cloudera, Inc. All rights reserved.
Product
Innovation
Open Source,
Open Standards
Training
Services
Customer
Success
Proactive,
Predictive
Support
Partner
Ecosystem
Cloudera is your global Big Data Partner
37. 37© Cloudera, Inc. All rights reserved.
Getting started is easy. Then iterate to success.
① Download or Deploy in the Cloud
② Sign up for Training
③ Contact us or a Partner to Start a Pilot Project