Page 1 Hortonworks © 2014
Distilling Hadoop Patterns of Use
Shaun Connolly, Hortonworks
@shaunconnolly
March 25, 2014
Page 2 Hortonworks © 2014
Our Mission:
Enable your Modern Data Architecture by
Delivering Enterprise Apache Hadoop
Our Commitment
Open Leadership
Drive innovation in the open exclusively via the
Apache community-driven open source process
Enterprise Rigor
Engineer, test and certify Apache Hadoop with
the enterprise in mind
Ecosystem Endorsement
Focus on deep integration with existing data
center technologies and skills
Headquarters: Palo Alto, CA
Employees: 300+ and growing
Reseller Partners
Page 3 Hortonworks © 2014
Data Continues to Grow Sharply
2020: Digital universe = 40 Zettabytes
2012: Digital universe = 20 Zettabytes
1 Zettabyte (ZB) = 1 billion Terabytes (TB)
2014: 31% of enterprises managing more than 1 Petabyte
Social Networks
Machine Generated
Documents, Emails
OLTP, ERP, CRM Systems
Geolocation Data
Sensor Data
Web Logs, Click Streams
85% of growth from new types of data, with machine-generated data increasing 15x
Sources: IDC and IDG Enterprise
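The unit relationship on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, which the slide does not state explicitly; the constant and function names are illustrative only.

```python
# Decimal (SI) byte units; an assumption, since the slide does not say SI vs. binary.
TB = 10**12   # terabyte
PB = 10**15   # petabyte
ZB = 10**21   # zettabyte

def zb_to_tb(zettabytes):
    """Convert zettabytes to terabytes using integer arithmetic."""
    return zettabytes * ZB // TB

one_zb_in_tb = zb_to_tb(1)         # 1 ZB expressed in TB (one billion)
universe_2020_tb = zb_to_tb(40)    # the slide's 2020 figure expressed in TB
universe_2020_pb = 40 * ZB // PB   # the same figure expressed in PB
```

Under these units, the 1 Petabyte threshold quoted for 2014 is one millionth of a single Zettabyte.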
  
Page 4 Hortonworks © 2014
Cameras and
microphones widely
deployed
New routes to market via
intelligent objects
Content and services
via connected
products
Everything
has a URL
Remote sensing of
objects and environment
Augmented
reality
Situational
decision support
Building and
infrastructure management
Over 50% of Internet connections are things:
2011: 15+ billion permanent, 50+ billion intermittent
2020: 30+ billion permanent, >200 billion intermittent
Source: Gartner Keynote at Hadoop Summit 2013
Page 5 Hortonworks © 2014
Harnessing Big Data is
transformational to business models
Enables the move from post-transaction,
reactive analysis of subsets of data stored in
silos to a world of pre-transaction, interactive
insights across all data that impacts both the top
and bottom lines
Page 6 Hortonworks © 2014
Modern Data Architecture with Hadoop
[Diagram]
APPLICATIONS: Statistical Analysis; BI / Reporting, Ad Hoc Analysis; Interactive Web & Mobile Applications; Enterprise Applications
DATA SYSTEMS: Repositories (RDBMS, EDW, MPP); ENTERPRISE HADOOP (Governance & Integration, Security, Operations, Data Access, Data Management)
SOURCES: OLTP, ERP, CRM Systems; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor Data; Geolocation Data
OPERATIONS TOOLS: Provision, Manage & Monitor
DEV & DATA TOOLS: Build & Test
Page 7 Hortonworks © 2014
MDA Unlocks New Approach to Insight
Current Approach
SQL
Single Query Engine
Repeatable Linear Process: Determine list of questions, Design solutions, Collect structured data, Ask questions from list, Detect additional questions
Apply schema on write
Dependent on IT
Augment with Hadoop
Enterprise Hadoop
Multiple Query Engines: Batch, Interactive, Real-time, Streaming
Iterative Process: Explore, Transform, Analyze
Apply schema on read
Support range of access patterns to data stored in HDFS
Page 8 Hortonworks © 2014
Schema-on-Write vs. Schema-on-Read
Standard Digital Camera
§ Zoom & focus first
§ Capture limited set of pixels
§ Crop around the focused area
Lytro Lightfield Camera
§ Capture entire lightfield
§ Infinite zoom & focus
§ Crop any captured areas
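The camera analogy can be made concrete with a small sketch. The Python below is illustrative only: the sample events, field names and function names are hypothetical, and it uses plain in-memory lists rather than any Hadoop API. It shows that fixing the columns at load time loses fields nobody anticipated, while keeping raw records and projecting at query time does not.

```python
import json

# Raw events are stored exactly as they arrive (hypothetical sample data).
RAW_EVENTS = [
    '{"user": "u1", "page": "/home", "ms": 120, "referrer": "ad"}',
    '{"user": "u2", "page": "/cart", "ms": 340}',
]

def schema_on_write(raw_lines, columns):
    """Fix the columns at load time; fields outside the schema are discarded."""
    table = []
    for line in raw_lines:
        record = json.loads(line)
        table.append({col: record.get(col) for col in columns})
    return table

def schema_on_read(raw_lines, columns):
    """Keep the raw lines untouched; project columns only when a query runs."""
    for line in raw_lines:
        record = json.loads(line)
        yield {col: record.get(col) for col in columns}

# Schema-on-write: "referrer" was not anticipated, so it is lost after loading.
warehouse = schema_on_write(RAW_EVENTS, ["user", "page"])

# Schema-on-read: a later question can still reach any field in the raw data.
referrers = list(schema_on_read(RAW_EVENTS, ["user", "referrer"]))
```

The trade-off in this sketch is that every read pays the parsing cost again, which is the price of being able to "crop any captured area" later.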
Page 9 Hortonworks © 2014
MDA Uses Commodity Compute + Storage
Hadoop Enables Scalable
Compute & Storage at a
Compelling Cost Structure
[Chart: Fully Loaded Cost per Raw TB of Data (min – max cost); axis from $0 to $180,000; categories shown: Cloud Storage, HADOOP, NAS, Engineered System, EDW/MPP, SAN]
Page 10 Hortonworks © 2014
MDA Optimizes Data Warehouse
Current Reality
Analytics 20%, ETL Process 30%, Operations 50%
§  EDW at capacity; some usage from low value workloads
§  Older transformed data archived, unavailable for ongoing exploration
§  Source data often discarded
Augment with Hadoop
Operations 50%, Analytics 50%
HADOOP: Parse, cleanse, apply structure, transform
§  Free up EDW resources from low value tasks
§  Keep 100% of source data and historical data for ongoing exploration
§  Mine data for value after loading it because of schema-on-read
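The four steps the slide assigns to Hadoop (parse, cleanse, apply structure, transform) can be sketched as a small pipeline. This is a minimal illustration in plain Python, not a Hadoop job; the log format, field names and function names are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical raw log lines, retained in full rather than discarded.
RAW_LOG = [
    "2014-03-25 10:00:01 GET /home 200",
    "2014-03-25 10:00:02 GET /cart 500",
    "corrupted line",
    "2014-03-25 10:00:05 POST /cart 200",
]

LINE = re.compile(r"^(\S+) (\S+) (GET|POST) (\S+) (\d{3})$")

def parse(line):
    """Parse: split a raw line into fields, or return None if it does not match."""
    match = LINE.match(line)
    return match.groups() if match else None

def cleanse(parsed_rows):
    """Cleanse: drop lines that could not be parsed."""
    return [row for row in parsed_rows if row is not None]

def apply_structure(rows):
    """Apply structure: name the fields and convert types."""
    return [
        {"date": d, "time": t, "method": m, "path": p, "status": int(s)}
        for d, t, m, p, s in rows
    ]

def transform(records):
    """Transform: aggregate to the kind of summary a warehouse table would hold."""
    return Counter((r["path"], r["status"]) for r in records)

structured = apply_structure(cleanse([parse(line) for line in RAW_LOG]))
summary = transform(structured)
```

Only `summary` would need to be loaded into the warehouse in this sketch, while `RAW_LOG` stays available for questions that come up later.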
Page 11 Hortonworks © 2014
Integrating with Existing Investments
[Diagram]
APPLICATIONS
DATA SYSTEM: RDBMS, EDW, MPP
SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
OPERATIONAL TOOLS
DEV & DATA TOOLS
INFRASTRUCTURE
Products shown: HANA, BusinessObjects BI
  
Page 12 Hortonworks © 2014
Powering the Modern Data Architecture
Multi-Use Data Platform
Store all data in one place, process in many ways
Enables deep insight across a large, broad, diverse set of data at efficient scale
Batch, Interactive, Real-time, Streaming
YARN: Data Operating System
[Diagram: grid of cluster nodes numbered 1 to n]
Data Lake that contains ALL data;
raw sources and any processed data
over extended periods of time.
Page 13 Hortonworks © 2014
How Hadoop?
“Hadoop can be used to create a ‘data lake’ – an integrated repository of data from internal and external data sources... Data combined from multiple silos can help your organization find answers to complex questions that no one has previously dared ask or known how to ask.”
-- Forrester
Page 14 Hortonworks © 2014
The Common Journey with Hadoop
SCALE
SCOPE
More data and
analytic apps
New Analytic Apps
New types of data
LOB-driven
A Modern Data Architecture
	
   RDBMS
MPP
EDW
Governance & Integration
Security
Operations
Data Access
Data Management
Page 15 Hortonworks © 2014
Unlock Value in New Types of Data
1.  Social
Understand how people are feeling and interacting –
right now
2.  Clickstream
Capture and analyze website visitors’ data trails and
optimize your website
3.  Sensor/Machine
Discover patterns in data streaming from remote
sensors and machines
4.  Geographic
Analyze location-based data to manage operations
where they occur
5.  Server Logs
Diagnose process failures and prevent security
breaches
6.  Unstructured (text, video, pictures, etc.)
Understand patterns in files across millions of web
pages, emails, and documents
Value
+ Online archive
Data that was once purged or moved
to tape can be stored in Hadoop to
discover long term trends and
previously hidden value
Page 16 Hortonworks © 2014
20 Business Applications of Hadoop
Industry | Use Case | Type of Data
Financial Services | New Account Risk Screens | Text, Server Logs
Financial Services | Trading Risk | Server Logs
Financial Services | Insurance Underwriting | Geographic, Sensor, Text
Telecom | Call Detail Records (CDRs) | Machine, Geographic
Telecom | Infrastructure Investment | Machine, Server Logs
Telecom | Real-time Bandwidth Allocation | Server Logs, Text, Social
Retail | 360° View of the Customer | Clickstream, Text
Retail | Localized, Personalized Promotions | Geographic
Retail | Website Optimization | Clickstream
Manufacturing | Supply Chain and Logistics | Sensor
Manufacturing | Assembly Line Quality Assurance | Sensor
Manufacturing | Crowdsourced Quality Assurance | Social
Healthcare | Use Genomic Data in Medical Trials | Structured
Healthcare | Monitor Patient Vitals in Real-Time | Sensor
Pharmaceuticals | Recruit and Retain Patients for Drug Trials | Social, Clickstream
Pharmaceuticals | Improve Prescription Adherence | Social, Unstructured, Geographic
Oil & Gas | Unify Exploration & Production Data | Sensor, Geographic & Unstructured
Oil & Gas | Monitor Rig Safety in Real-Time | Sensor, Unstructured
Government | ETL Offload in Response to Federal Budgetary Pressures | Structured
Government | Sentiment Analysis for Government Programs | Social
Page 17 Hortonworks © 2014
360° Customer View for Home Supply Retailer
Problem
Disjoint customer engagement across all channels
Data repositories on website traffic, POS transactions and in-home
services exist in separate silos
Unable to perform analytics on customer buying behavior
across all channels
Limited ability for targeted marketing to specific segments
Solution
Unified system of engagement via “golden record”
Golden record enables targeted marketing capabilities:
customized coupons, promotions and emails
Deep visibility into all customers and all market segments
Unlocks rich, informed cross-sell & up-sell opportunities
Creating Opportunity
Data: Clickstream,
Unstructured, Structured
Retail
Major home
improvement retailer
>$74B in revenue
>300K employees
>2,200 stores
Page 18 Hortonworks © 2014
Monetize Anonymous & Aggregate Banking Data
Problem
Unable to unlock valuable cross-sell banking data
Bank possesses data that indicates larger macro-economic
trends, which can be monetized in secondary markets
Data sets are isolated in legacy silos controlled by LOBs
Regulations and company policies protect customer privacy
IT challenged by joining data while guaranteeing anonymity
Solution
Create cross-LOB data lake of de-identified data
Mortgage bankers, consumer bankers, credit card group and
treasury bankers have access to the same cross-sell data
Single point of security & privacy for de-identification, masking,
encryption, authentication and access control
Interoperability with SAS, Red Hat & Splunk
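One way to picture the "single point" for de-identification and masking is a function that every record passes through before it reaches the shared lake. The sketch below is an assumption-laden illustration, not the bank's implementation: the key, field names and helper names are hypothetical, and keyed hashing (HMAC-SHA256 from the Python standard library) is swapped in as one common pseudonymization technique. On its own it does not guarantee anonymity.

```python
import hashlib
import hmac

# Hypothetical secret held only by the central security and privacy layer.
SECRET_KEY = b"replace-with-a-managed-secret"

def pseudonymize(customer_id):
    """Replace an identifier with a keyed hash so records can still be joined."""
    digest = hmac.new(SECRET_KEY, customer_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()

def mask_account(account_number):
    """Mask all but the last four characters of an account number."""
    return "*" * (len(account_number) - 4) + account_number[-4:]

def de_identify(record):
    """Return a copy of the record without the direct identifier."""
    return {
        "customer_token": pseudonymize(record["customer_id"]),
        "account": mask_account(record["account"]),
        "balance": record["balance"],
    }

# Records from two lines of business for the same (hypothetical) customer.
mortgage = de_identify({"customer_id": "C-1001", "account": "1234567890", "balance": 250000})
card = de_identify({"customer_id": "C-1001", "account": "9988776655", "balance": 1200})
```

Because both lines of business use the same key, `mortgage` and `card` carry the same `customer_token`, so cross-LOB joins remain possible without exposing the original customer ID.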
Creating Opportunity
Data: Structured,
Clickstream, Social &
Unstructured
Banking
One of the largest
US banks
Page 19 Hortonworks © 2014
Optimize High-Tech Manufacturing
Improving Efficiency
Data: Sensor
Problem
Ineffective root cause analysis on product defects
200 million digital storage devices manufactured yearly
>10K faulty devices returned by customers every month
Limited data available for root cause analysis means that
diagnosing problems is highly manual (physical inspections)
Subset of sensor data from QA testing retained 3-12 months
Solution
Created sensor data lake for 10x quality improvement
Repository holds 24 months of data for each device
Manufacturing dashboard allows >1,000 employees to search
data, with results returned in less than 1 second
Quality improved 10x: rate down to ~1K faulty devices / month
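The 10x figure follows directly from the numbers on the slide. The short calculation below reproduces it and also derives an approximate monthly return rate; it assumes production is spread evenly across the year, which the slide does not state, and treats ">10K" and "~1K" as exactly 10,000 and 1,000.

```python
# Figures taken from the slide.
DEVICES_PER_YEAR = 200_000_000
FAULTY_BEFORE_PER_MONTH = 10_000   # slide says ">10K", so this is a lower bound
FAULTY_AFTER_PER_MONTH = 1_000     # slide says "~1K"

# Assumption: production is spread evenly across the twelve months.
devices_per_month = DEVICES_PER_YEAR / 12

improvement = FAULTY_BEFORE_PER_MONTH / FAULTY_AFTER_PER_MONTH
rate_before = FAULTY_BEFORE_PER_MONTH / devices_per_month
rate_after = FAULTY_AFTER_PER_MONTH / devices_per_month

print(f"improvement: {improvement:.0f}x")
print(f"return rate before: {rate_before:.4%}, after: {rate_after:.4%}")
```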
Manufacturing
Digital Storage
Devices
>$15B in revenue
>85K employees
Page 20 Hortonworks © 2014
Think Pigabyte, Not Petabyte
Page 21 Hortonworks © 2014
Enabling Hadoop for the Enterprise Journey
1 Capabilities
Ensure enterprise capabilities are delivered in 100% open source to benefit all
2 Integration
Interoperable with existing data center investments
3 Skills
Leverage your existing skills: development, analytics, operations
Scale
Scope
More data and
analytic apps
New Analytic Apps
New types of data
LOB-driven
A Modern Data Architecture
	
   RDBMS
MPP
EDW
Governance & Integration
Security
Operations
Data Access
Data Management
Page 22 Hortonworks © 2014
Try Hadoop Today… Get Involved
Download the Hortonworks Sandbox
Learn Hadoop
Build Your Analytic App
Try Hadoop 2
San Jose, CA
June 3 - 5, 2014
REGISTER NOW
Amsterdam
April 2 - 3, 2014
REGISTER NOW
Page 23 Hortonworks © 2014
Questions?
@shaunconnolly

Mais conteúdo relacionado

Mais procurados

2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_securityAdam Muise
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Shravan (Sean) Pabba
 
Curb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure ClusterCurb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure Clusterahortonworks
 
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Kevin Minder
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big DataRommel Garcia
 
Securing the Hadoop Ecosystem
Securing the Hadoop EcosystemSecuring the Hadoop Ecosystem
Securing the Hadoop EcosystemDataWorks Summit
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayDataWorks Summit
 
Hadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyHadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyDataWorks Summit
 
Hadoop ClusterClient Security Using Kerberos
Hadoop ClusterClient Security Using KerberosHadoop ClusterClient Security Using Kerberos
Hadoop ClusterClient Security Using KerberosSarvesh Meena
 
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...Abhiraj Butala
 
Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010
Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010
Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010Cloudera, Inc.
 
Nl HUG 2016 Feb Hadoop security from the trenches
Nl HUG 2016 Feb Hadoop security from the trenchesNl HUG 2016 Feb Hadoop security from the trenches
Nl HUG 2016 Feb Hadoop security from the trenchesBolke de Bruin
 
Hadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyHadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyAnurag Shrivastava
 
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?DataWorks Summit
 
Hadoop Security: Overview
Hadoop Security: OverviewHadoop Security: Overview
Hadoop Security: OverviewCloudera, Inc.
 
The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014Cloudera, Inc.
 
TriHUG October: Apache Ranger
TriHUG October: Apache RangerTriHUG October: Apache Ranger
TriHUG October: Apache Rangertrihug
 

Mais procurados (20)

2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_security
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015
 
Curb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure ClusterCurb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure Cluster
 
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big Data
 
Securing the Hadoop Ecosystem
Securing the Hadoop EcosystemSecuring the Hadoop Ecosystem
Securing the Hadoop Ecosystem
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox Gateway
 
Hadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyHadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happy
 
An Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache KnoxAn Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache Knox
 
Hadoop ClusterClient Security Using Kerberos
Hadoop ClusterClient Security Using KerberosHadoop ClusterClient Security Using Kerberos
Hadoop ClusterClient Security Using Kerberos
 
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...
 
Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010
Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010
Hadoop Security, Cloudera - Todd Lipcon and Aaron Myers - Hadoop World 2010
 
Nl HUG 2016 Feb Hadoop security from the trenches
Nl HUG 2016 Feb Hadoop security from the trenchesNl HUG 2016 Feb Hadoop security from the trenches
Nl HUG 2016 Feb Hadoop security from the trenches
 
Hadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyHadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happy
 
Hadoop Security
Hadoop SecurityHadoop Security
Hadoop Security
 
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
 
Hadoop Security: Overview
Hadoop Security: OverviewHadoop Security: Overview
Hadoop Security: Overview
 
Apache Ranger
Apache RangerApache Ranger
Apache Ranger
 
The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014
 
TriHUG October: Apache Ranger
TriHUG October: Apache RangerTriHUG October: Apache Ranger
TriHUG October: Apache Ranger
 

Destaque

DataAnalysis_Yan_BookReviewCropSci2014
DataAnalysis_Yan_BookReviewCropSci2014DataAnalysis_Yan_BookReviewCropSci2014
DataAnalysis_Yan_BookReviewCropSci2014Manjit Kang
 
TDWI Solution Summit San Diego 2014 Advanced Analytics at Macys.com
TDWI Solution Summit San Diego 2014 Advanced Analytics at Macys.comTDWI Solution Summit San Diego 2014 Advanced Analytics at Macys.com
TDWI Solution Summit San Diego 2014 Advanced Analytics at Macys.comDaqing Zhao
 
Big data and the transport societal challenge - Maxime Flament
Big data and the transport societal challenge - Maxime FlamentBig data and the transport societal challenge - Maxime Flament
Big data and the transport societal challenge - Maxime FlamentBigData_Europe
 
Rapid JCR applications development with Sling
Rapid JCR applications development with SlingRapid JCR applications development with Sling
Rapid JCR applications development with SlingBertrand Delacretaz
 
Ast 0060878 wayne-eckerson_research_report_big_data_analytics
Ast 0060878 wayne-eckerson_research_report_big_data_analyticsAst 0060878 wayne-eckerson_research_report_big_data_analytics
Ast 0060878 wayne-eckerson_research_report_big_data_analyticsAccenture
 
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...Enrico Palumbo
 
Big Data Analytics: Architectural Perspective
Big Data Analytics: Architectural PerspectiveBig Data Analytics: Architectural Perspective
Big Data Analytics: Architectural PerspectiveSumit Kalra
 
A big-data architecture for real-time analytics
A big-data architecture for real-time analyticsA big-data architecture for real-time analytics
A big-data architecture for real-time analyticsramikaurraminder
 
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service Stefan Schwarz
 
Big Data Analytics for Real Time Systems
Big Data Analytics for Real Time SystemsBig Data Analytics for Real Time Systems
Big Data Analytics for Real Time SystemsKamalika Dutta
 
Architecture for Real-Time and Batch Big Data Analytics
Architecture for Real-Time and Batch Big Data AnalyticsArchitecture for Real-Time and Batch Big Data Analytics
Architecture for Real-Time and Batch Big Data AnalyticsNir Rubinstein
 
Agile data science
Agile data scienceAgile data science
Agile data scienceJoel Horwitz
 
A technical Introduction to Big Data Analytics
A technical Introduction to Big Data AnalyticsA technical Introduction to Big Data Analytics
A technical Introduction to Big Data AnalyticsPethuru Raj PhD
 
Agile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric ApproachAgile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric ApproachSoftServe
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks
 
Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...
Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...
Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...Thoughtworks
 
Building Big Data Analytics Center Of Excellence
Building Big Data Analytics Center Of Excellence Building Big Data Analytics Center Of Excellence
Building Big Data Analytics Center Of Excellence Dr. Mohan K. Bavirisetty
 
Hortonworks Technical Workshop: HDP everywhere - cloud considerations using...
Hortonworks Technical Workshop:   HDP everywhere - cloud considerations using...Hortonworks Technical Workshop:   HDP everywhere - cloud considerations using...
Hortonworks Technical Workshop: HDP everywhere - cloud considerations using...Hortonworks
 
Business Process Maturity and Centers of Excellence
Business Process Maturity and Centers of ExcellenceBusiness Process Maturity and Centers of Excellence
Business Process Maturity and Centers of ExcellenceSandy Kemsley
 

Destaque (20)

DataAnalysis_Yan_BookReviewCropSci2014
DataAnalysis_Yan_BookReviewCropSci2014DataAnalysis_Yan_BookReviewCropSci2014
DataAnalysis_Yan_BookReviewCropSci2014
 
TDWI Solution Summit San Diego 2014 Advanced Analytics at Macys.com
TDWI Solution Summit San Diego 2014 Advanced Analytics at Macys.comTDWI Solution Summit San Diego 2014 Advanced Analytics at Macys.com
TDWI Solution Summit San Diego 2014 Advanced Analytics at Macys.com
 
Big data and the transport societal challenge - Maxime Flament
Big data and the transport societal challenge - Maxime FlamentBig data and the transport societal challenge - Maxime Flament
Big data and the transport societal challenge - Maxime Flament
 
Rapid JCR applications development with Sling
Rapid JCR applications development with SlingRapid JCR applications development with Sling
Rapid JCR applications development with Sling
 
Using MapReduce for Large–scale Medical Image Analysis
Using MapReduce for Large–scale Medical Image AnalysisUsing MapReduce for Large–scale Medical Image Analysis
Using MapReduce for Large–scale Medical Image Analysis
 
Ast 0060878 wayne-eckerson_research_report_big_data_analytics
Ast 0060878 wayne-eckerson_research_report_big_data_analyticsAst 0060878 wayne-eckerson_research_report_big_data_analytics
Ast 0060878 wayne-eckerson_research_report_big_data_analytics
 
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
Innovation Diffusion: a (Big) Data-driven approach to the study of the geogra...
 
Big Data Analytics: Architectural Perspective
Big Data Analytics: Architectural PerspectiveBig Data Analytics: Architectural Perspective
Big Data Analytics: Architectural Perspective
 
A big-data architecture for real-time analytics
A big-data architecture for real-time analyticsA big-data architecture for real-time analytics
A big-data architecture for real-time analytics
 
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
PARTNERS 2013 - Dr. Stefan Schwarz - Big Data Analytics as a Service
 
Big Data Analytics for Real Time Systems
Big Data Analytics for Real Time SystemsBig Data Analytics for Real Time Systems
Big Data Analytics for Real Time Systems
 
Architecture for Real-Time and Batch Big Data Analytics
Architecture for Real-Time and Batch Big Data AnalyticsArchitecture for Real-Time and Batch Big Data Analytics
Architecture for Real-Time and Batch Big Data Analytics
 
Agile data science
Agile data scienceAgile data science
Agile data science
 
A technical Introduction to Big Data Analytics
A technical Introduction to Big Data AnalyticsA technical Introduction to Big Data Analytics
A technical Introduction to Big Data Analytics
 
Agile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric ApproachAgile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric Approach
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?
 
Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...
Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...
Big Data Agile Analytics by Ken Collier - Director Agile Analytics, Thoughtwo...
 
Building Big Data Analytics Center Of Excellence
Building Big Data Analytics Center Of Excellence Building Big Data Analytics Center Of Excellence
Building Big Data Analytics Center Of Excellence
 
Hortonworks Technical Workshop: HDP everywhere - cloud considerations using...
Hortonworks Technical Workshop:   HDP everywhere - cloud considerations using...Hortonworks Technical Workshop:   HDP everywhere - cloud considerations using...
Hortonworks Technical Workshop: HDP everywhere - cloud considerations using...
 
Business Process Maturity and Centers of Excellence
Business Process Maturity and Centers of ExcellenceBusiness Process Maturity and Centers of Excellence
Business Process Maturity and Centers of Excellence
 

Semelhante a Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data Analytics

Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataHortonworks
 
IoT Crash Course Hadoop Summit SJ
IoT Crash Course Hadoop Summit SJIoT Crash Course Hadoop Summit SJ
IoT Crash Course Hadoop Summit SJDaniel Madrigal
 
Enterprise Apache Hadoop: State of the Union
Enterprise Apache Hadoop: State of the UnionEnterprise Apache Hadoop: State of the Union
Enterprise Apache Hadoop: State of the UnionHortonworks
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to HadoopPOSSCON
 
Distilling Hadoop Patterns of Use and How You Can Use Them for Your Big Data Analytics

  • 1. Page 1 Hortonworks © 2014 Distilling Hadoop Patterns of Use Shaun Connolly, Hortonworks @shaunconnolly March 25, 2014
  • 2. Our Mission: Enable your Modern Data Architecture by Delivering Enterprise Apache Hadoop. Our Commitment: Open Leadership (drive innovation in the open exclusively via the Apache community-driven open source process); Enterprise Rigor (engineer, test and certify Apache Hadoop with the enterprise in mind); Ecosystem Endorsement (focus on deep integration with existing data center technologies and skills). Headquarters: Palo Alto, CA. Employees: 300+ and growing. Reseller Partners.
  • 3. Data Continues to Grow Sharply. 2020: Digital universe = 40 Zettabytes. 2012: Digital universe = 20 Zettabytes. 1 Zettabyte (ZB) = 1 billion Terabytes (TB). 2014: 31% of enterprises managing more than 1 Petabyte. Social Networks; Machine Generated; Documents, Emails; OLTP, ERP, CRM Systems; Geolocation Data; Sensor Data; Web Logs, Click Streams. 85% of growth from new types of data, with machine-generated data increasing 15x. Sources: IDC and IDG Enterprise
  • 4. Cameras and microphones widely deployed; New routes to market via intelligent objects; Content and services via connected products; Everything has a URL; Remote sensing of objects and environment; Augmented reality; Situational decision support; Building and infrastructure management. Over 50% of Internet connections are things. 2011: 15+ billion permanent, 50+ billion intermittent. 2020: 30+ billion permanent, >200 billion intermittent. Source: Gartner Keynote at Hadoop Summit 2013
  • 5. Harnessing Big Data is transformational to business models. Enables the move from post-transaction, reactive analysis of subsets of data stored in silos to a world of pre-transaction, interactive insights across all data that impacts both the top and bottom lines.
  • 6. Modern Data Architecture with Hadoop. APPLICATIONS: Statistical Analysis; BI / Reporting, Ad Hoc Analysis; Interactive Web & Mobile Applications; Enterprise Applications. DATA SYSTEMS: Repositories (RDBMS, EDW, MPP); ENTERPRISE HADOOP (Governance & Integration, Security, Operations, Data Access, Data Management). SOURCES: OLTP, ERP, CRM Systems; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor Data; Geolocation Data. OPERATIONS TOOLS: Provision, Manage & Monitor. DEV & DATA TOOLS: Build & Test.
  • 7. MDA Unlocks New Approach to Insight. Current Approach (SQL; Single Query Engine; Repeatable Linear Process): Apply schema on write; Dependent on IT. Steps: Determine list of questions; Design solutions; Collect structured data; Ask questions from list; Detect additional questions. Augment with Hadoop (Enterprise Hadoop; Multiple Query Engines; Iterative Process: Explore, Transform, Analyze): Apply schema on read; Support range of access patterns to data stored in HDFS (Batch, Interactive, Real-time, Streaming).
  • 8. Schema-on-Write vs. Schema-on-Read. Standard Digital Camera: Zoom & focus first; Capture limited set of pixels; Crop around the focused area. Lytro Lightfield Camera: Capture entire lightfield; Infinite zoom & focus; Crop any captured areas.
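The schema-on-write versus schema-on-read contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Hadoop code: the event records, the field names, and both helper functions are hypothetical and do not come from the deck.

```python
import json

# Hypothetical raw clickstream events, kept exactly as they arrived.
RAW_EVENTS = [
    '{"user": "u1", "page": "/home", "ms": 120}',
    '{"user": "u2", "page": "/cart", "ms": 340, "referrer": "ad-7"}',
    '{"user": "u1", "page": "/cart", "ms": 95}',
]


def schema_on_write(raw_lines, columns):
    """Parse once at load time and keep only the predeclared columns."""
    table = []
    for line in raw_lines:
        record = json.loads(line)
        table.append({c: record.get(c) for c in columns})
    return table


def schema_on_read(raw_lines, columns):
    """Keep the raw lines; project the requested columns at query time."""
    for line in raw_lines:
        record = json.loads(line)
        yield {c: record.get(c) for c in columns}


# Schema-on-write: the table was designed before anyone asked about referrers,
# so that field is gone once the data is loaded.
warehouse_table = schema_on_write(RAW_EVENTS, ["user", "page"])

# Schema-on-read: a later question can still pull a field nobody planned for.
referrers = [
    row["referrer"]
    for row in schema_on_read(RAW_EVENTS, ["referrer"])
    if row["referrer"] is not None
]
```

In the camera analogy, `warehouse_table` is the cropped photo, while `RAW_EVENTS` plays the role of the full lightfield that can be refocused for each new question.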
  • 9. MDA Uses Commodity Compute + Storage. Hadoop Enables Scalable Compute & Storage at a Compelling Cost Structure. [Chart: Fully Loaded Cost per Raw TB of Data (min – max cost), axis from $0 to $180,000; categories: Cloud Storage, Hadoop, NAS, Engineered System, EDW/MPP, SAN]
  • 10. MDA Optimizes Data Warehouse Analytics. Current Reality (Analytics 20%, ETL Process 30%, Operations 50%): EDW at capacity, some usage from low value workloads; Older transformed data archived, unavailable for ongoing exploration; Source data often discarded. Augment with Hadoop (Operations 50%, Analytics 50%; Hadoop handles parse, cleanse, apply structure, transform): Free up EDW resources from low value tasks; Keep 100% of source data and historical data for ongoing exploration; Mine data for value after loading it because of schema-on-read.
  • 11. Integrating with Existing Investments. APPLICATIONS: BusinessObjects BI. DATA SYSTEM: RDBMS, EDW, MPP, HANA. SOURCES: Emerging Sources (Sensor, Sentiment, Geo, Unstructured); Existing Sources (CRM, ERP, Clickstream, Logs). OPERATIONAL TOOLS. DEV & DATA TOOLS. INFRASTRUCTURE.
  • 12. Powering the Modern Data Architecture. Multi-Use Data Platform: Store all data in one place, process in many ways. Enables deep insight across a large, broad, diverse set of data at efficient scale. Batch, Interactive, Real-time, Streaming. Data Lake that contains ALL data; raw sources and any processed data over extended periods of time. YARN: Data Operating System.
  • 13. How Hadoop? “Hadoop can be used to create a ‘data lake’ – an integrated repository of data from internal and external data sources... Data combined from multiple silos can help your organization find answers to complex questions that no one has previously dared ask or known how to ask.” -- Forrester
  • 14. The Common Journey with Hadoop [Diagram labeled SCALE and SCOPE. Labels: More data and analytic apps; New Analytic Apps; New types of data; LOB-driven; A Modern Data Architecture; RDBMS, MPP, EDW; Governance & Integration, Security, Operations, Data Access, Data Management.]
  • 15. Unlock Value in New Types of Data
      1. Social: Understand how people are feeling and interacting – right now
      2. Clickstream: Capture and analyze website visitors’ data trails and optimize your website
      3. Sensor/Machine: Discover patterns in data streaming from remote sensors and machines
      4. Geographic: Analyze location-based data to manage operations where they occur
      5. Server Logs: Diagnose process failures and prevent security breaches
      6. Unstructured (text, video, pictures, etc.): Understand patterns in files across millions of web pages, emails, and documents
      Value + Online archive: Data that was once purged or moved to tape can be stored in Hadoop to discover long-term trends and previously hidden value
  • 16. 20 Business Applications of Hadoop
      Industry | Use Case | Type of Data
      Financial Services | New Account Risk Screens | Text, Server Logs
      Financial Services | Trading Risk | Server Logs
      Financial Services | Insurance Underwriting | Geographic, Sensor, Text
      Telecom | Call Detail Records (CDRs) | Machine, Geographic
      Telecom | Infrastructure Investment | Machine, Server Logs
      Telecom | Real-time Bandwidth Allocation | Server Logs, Text, Social
      Retail | 360° View of the Customer | Clickstream, Text
      Retail | Localized, Personalized Promotions | Geographic
      Retail | Website Optimization | Clickstream
      Manufacturing | Supply Chain and Logistics | Sensor
      Manufacturing | Assembly Line Quality Assurance | Sensor
      Manufacturing | Crowdsourced Quality Assurance | Social
      Healthcare | Use Genomic Data in Medical Trials | Structured
      Healthcare | Monitor Patient Vitals in Real-Time | Sensor
      Pharmaceuticals | Recruit and Retain Patients for Drug Trials | Social, Clickstream
      Pharmaceuticals | Improve Prescription Adherence | Social, Unstructured, Geographic
      Oil & Gas | Unify Exploration & Production Data | Sensor, Geographic & Unstructured
      Oil & Gas | Monitor Rig Safety in Real-Time | Sensor, Unstructured
      Government | ETL Offload in Response to Federal Budgetary Pressures | Structured
      Government | Sentiment Analysis for Government Programs | Social
  • 17. 360° Customer View for Home Supply Retailer (Creating Opportunity; Data: Clickstream, Unstructured, Structured)
      Retail: major home improvement retailer; >$74B in revenue, >300K employees, >2,200 stores
      Problem: Disjoint customer engagement across all channels
      - Data repositories on website traffic, POS transactions and in-home services exist in separate silos
      - Unable to perform analytics on customer buying behavior across all channels
      - Limited ability for targeted marketing to specific segments
      Solution: Unified system of engagement via “golden record”
      - Golden record enables targeted marketing capabilities: customized coupons, promotions and emails
      - Deep visibility into all customers and all market segments
      - Unlocks rich, informed cross-sell & up-sell opportunities
  • 18. Monetize Anonymous & Aggregate Banking Data (Creating Opportunity; Data: Structured, Clickstream, Social & Unstructured)
      Banking: one of the largest US banks
      Problem: Unable to unlock valuable cross-sell banking data
      - Bank possesses data that indicates larger macro-economic trends, which can be monetized in secondary markets
      - Data sets are isolated in legacy silos controlled by LOBs
      - Regulations and company policies protect customer privacy
      - IT challenged by joining data while guaranteeing anonymity
      Solution: Create cross-LOB data lake of de-identified data
      - Mortgage bankers, consumer bankers, credit card group and treasury bankers have access to the same cross-sell data
      - Single point of security & privacy for de-identification, masking, encryption, authentication and access control
      - Interoperability with SAS, Red Hat & Splunk
  • 19. Optimize High-Tech Manufacturing (Improving Efficiency; Data: Sensor)
      Manufacturing: digital storage devices; >$15B in revenue, >85K employees
      Problem: Ineffective root cause analysis on product defects
      - 200 million digital storage devices manufactured yearly
      - >10K faulty devices returned by customers every month
      - Limited data available for root cause analysis means that diagnosing problems is highly manual (physical inspections)
      - Subset of sensor data from QA testing retained 3-12 months
      Solution: Created sensor data lake for 10x quality improvement
      - Repository holds 24 months of data for each device
      - Manufacturing dashboard allows >1,000 employees to search data, with results returned in less than 1 second
      - Quality improved 10x: rate down to ~1K faulty devices / month
  • 20. Think Pigabyte, Not Petabyte
  • 21. Enabling Hadoop for the Enterprise Journey
      1. Capabilities: Ensure enterprise capabilities are delivered in 100% open source to benefit all
      2. Integration: Interoperable with existing data center investments
      3. Skills: Leverage your existing skills: development, analytics, operations
      [Diagram labeled Scale and Scope. Labels: More data and analytic apps; New Analytic Apps; New types of data; LOB-driven; A Modern Data Architecture; RDBMS, MPP, EDW; Governance & Integration, Security, Operations, Data Access, Data Management.]
  • 22. Try Hadoop Today… Get Involved
      - Download the Hortonworks Sandbox
      - Learn Hadoop
      - Build Your Analytic App
      - Try Hadoop 2
      Events: San Jose, CA, June 3-5, 2014; Amsterdam, April 2-3, 2014
  • 23. Questions? @shaunconnolly
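Slide 10 credits schema-on-read for the ability to "mine data for value after loading it". The idea can be sketched in a few lines of plain Python. This is only an illustration, not Hadoop or Hortonworks code: the sample records, the `read_with_schema` helper and the field names are all hypothetical. Raw lines are kept exactly as they arrived, and each reader applies its own schema at query time.

```python
import json

# Raw records are stored exactly as they arrived; nothing is parsed at load time.
# (Hypothetical sample data, for illustration only.)
RAW_EVENTS = [
    '{"user": "u1", "page": "/home", "ms": 120}',
    '{"user": "u2", "page": "/cart", "ms": "340", "ref": "email"}',
    'not valid json',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema (field name -> converter) while reading, not while loading."""
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed lines stay in storage; this reader just skips them
        if not isinstance(record, dict):
            continue  # only JSON objects carry named fields
        row = {}
        for field, convert in schema.items():
            value = record.get(field)
            # Fields missing from a record become None instead of failing the read.
            row[field] = convert(value) if value is not None else None
        yield row


# Two different questions asked of the same raw data, each with its own schema.
latency_view = list(read_with_schema(RAW_EVENTS, {"user": str, "ms": int}))
referrer_view = list(read_with_schema(RAW_EVENTS, {"user": str, "ref": str}))
```

Because no structure is enforced when the data is stored, the `ref` field that only some records carry remains available to a later question, which is the property the slide contrasts with discarding source data before it reaches the EDW.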