Ask Bigger Questions with Cloudera and Apache Hadoop
Graham Gear
graham@cloudera.com
June 2013

Data Has Changed in the Last 30 Years
[Chart: data growth, 1980 to 2012. Labels: end-user applications, the Internet, mobile devices, sophisticated machines. Structured data: 10%. Unstructured data: 90%.]

Data Management Strategies Have Stayed the Same
• Raw data on SAN, NAS and tape
• Data moved from storage to compute
• Relational models with predesigned schemas

Too Much Data, Too Many Sources
• Can't ingest fast enough
• Costs too much to store
• Exists in different places
• Archived data is lost

Can't Use It The Way You Want To
• Analysis and processing takes too long
• Data exists in silos
• Can't ask new questions
• Can't analyze unstructured data

Transform The Way You Think About Data

Cloudera
Ask Bigger Questions

• When customer x visits my store, what can I recommend based on their recent web behavior across our various brand websites?
• What is the best location in North America to efficiently produce both tomato plants and corn?
• What does every fraudulent activity in the last 2 years have in common that will help us identify and proactively prevent the next incident?
• Are hotel room sales at Christmas slow because of inventory or competitive pricing?

• What did customer x view on their last website visit?
• What makes tomato plants more fruitful than others?
• What incidents of fraud did we detect last year?
• What search terms are used most often when looking for hotels in NYC?

The Cloudera Approach
Meet enterprise demands with a new way to think about data.

THE CLOUDERA WAY: single data platform to support BI, Reporting & App Serving
SIMPLIFIED, UNIFIED, EFFICIENT
• Bulk of data stored on scalable low cost platform
• Perform end-to-end workflows
• Specialized systems reserved for specialized workloads
• Provides data access across departments or LOB

THE OLD WAY: multiple platforms for multiple workloads
COMPLEX, FRAGMENTED, COSTLY
• Data silos by department or LOB
• Lots of data stored in expensive specialized systems
• Analysts pull select data into EDW
• No one has a complete view

Cloudera Enterprise: The Platform for Big Data
A Revolutionary Solution Built on Apache Hadoop
• BRINGS STORAGE & COMPUTE TOGETHER
• WORKS WITH EVERY TYPE OF DATA
• CHANGES THE ECONOMICS OF DATA MANAGEMENT
Stages: INGEST | STORE | EXPLORE | PROCESS | ANALYZE | SERVE
Components: CDH | CLOUDERA MANAGER | CLOUDERA NAVIGATOR | CLOUDERA SUPPORT

Cloudera Enterprise
Includes Advanced System Management & Support for the Core CDH Projects

CDH: 100% OPEN SOURCE HADOOP DISTRIBUTION
• CORE PROJECTS: HDFS, MAPREDUCE, FLUME, HCATALOG, HIVE, HUE, MAHOUT, OOZIE, PIG, SQOOP, WHIRR, ZOOKEEPER
• PREMIUM PROJECTS: HBASE, IMPALA, SEARCH (BETA)
• CONNECTORS: MICROSTRATEGY, NETEZZA, ORACLE, QLIKVIEW, TABLEAU, TERADATA

CLOUDERA MANAGER: END-TO-END SYSTEM MANAGEMENT
• DEPLOYMENT, MONITORING, API, SNMP, CONFIG ROLLBACKS, PHONE HOME
• SERVICE MGMT, DIAGNOSTICS, ROLLING UPGRADES, LDAP, REPORTING, BACKUP/DR

CLOUDERA NAVIGATOR: END-TO-END DATA MANAGEMENT
• ACCESS MGMT, DATA AUDIT

CLOUDERA SUPPORT: BEST-IN-CLASS TECHNICAL SUPPORT, COMMUNITY ADVOCACY & INDEMNIFICATION

Legend: CORE HADOOP PROJECTS | CLOUDERA MANAGER | CLOUDERA NAVIGATOR | HBASE | IMPALA | SEARCH

RTD Subscription
Includes Support & Indemnity for Apache HBase
[Diagram: same Cloudera Enterprise product diagram as on the "Cloudera Enterprise" slide.]

RTQ Subscription
Includes Support & Indemnity for Cloudera Impala
[Diagram: same Cloudera Enterprise product diagram as on the "Cloudera Enterprise" slide.]

RTS Subscription
Includes Support & Indemnity for Cloudera Search
[Diagram: same Cloudera Enterprise product diagram as on the "Cloudera Enterprise" slide.]

BDR Subscription
Includes Centralized Management For Disaster Recovery Workflows
[Diagram: same Cloudera Enterprise product diagram as on the "Cloudera Enterprise" slide.]

Navigator Subscription
Enables Cloudera Navigator for Automated Data Management
[Diagram: same Cloudera Enterprise product diagram as on the "Cloudera Enterprise" slide.]

Customer Case Studies

A multinational bank saves millions by optimizing DW for analytics & reducing data storage costs by 99%.
Ask Bigger Questions: How can we optimize our data warehouse investment?

Cloudera optimizes the EDW, saves millions
Multinational bank saves millions by optimizing existing DW for analytics & reducing data storage costs by 99%.
The Challenge:
• Teradata EDW at capacity: ETL processes consume 7 days; takes 5 weeks to make historical data available for analysis
• Performance issues in business critical apps; little room for discovery, analytics, ROI from opportunities
The Solution:
• Cloudera Enterprise offloads data storage, processing & some analytics from EDW
• Teradata can focus on operational functions & analytics

A semiconductor manufacturer uses predictive analytics to take preventative action on chips likely to fail.
Ask Bigger Questions: Which semiconductor chips will fail?

Cloudera enables better predictions
Semiconductor manufacturer can prevent chip failure with more accurate predictive yield models.
The Challenge:
• Want to capture more granular and historical data for more accurate predictive yield modeling
• Storing 9 months' data on Oracle is expensive
The Solution:
• Dell | Cloudera solution for Apache Hadoop
• 53 nodes; plan to store up to 10 years (~10PB)
• Capturing & processing data from each phase of manufacturing process
CONFIDENTIAL - RESTRICTED

The quant risk LOB within a multinational bank saves millions through better risk exposure analysis & fraud prevention.
Ask Bigger Questions: How can we prevent fraud?

Cloudera delivers savings through fraud prevention
Quant risk LOB in multinational bank saves millions through better risk exposure analysis & fraud prevention.
The Challenge:
• Fraud detection is a cumbersome, multi-step analytic process requiring data sampling
• 2B transactions/month necessitate constant revisions to risk profiles
• Highly tuned 100TB Teradata DW drives over-budget capital reserves & lower investment returns
The Solution:
• Cloudera Enterprise data factory for fraud prevention, credit & operational risk analysis
• Look at every incidence of fraud for 5 years for each person
• Reduced costs; expensive CPU no longer consumed by data processing

BlackBerry eliminates data sampling & simplifies data processing for better, more comprehensive analysis.
Ask Bigger Questions: How do we retain customers in a competitive market?

Cloudera delivers ROI through storage alone
BlackBerry can analyze all their data vs. relying on 1% sample for better network capacity trending & management.
The Challenge:
• BlackBerry Services generates 0.5PB (50-60TB compressed) data per day
• RDBMS is expensive: limited to 1% data sampling for analytics
The Solution:
• Cloudera Enterprise manages global data set of ~100PB
• Collecting device content, machine-generated log data, audit details
• 90% ETL code base reduction
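
As a rough check on the daily volumes quoted in the challenge above, the short Python sketch below works out the implied compression ratio and the compressed volume accumulated per year. It is not part of the original deck. It assumes decimal units (1 PB = 1000 TB) and that both figures are per-day totals; the function and constant names are illustrative only.

```python
# Figures quoted on the slide; decimal units are an assumption (1 PB = 1000 TB).
RAW_TB_PER_DAY = 0.5 * 1000        # "0.5PB" of raw data per day
COMPRESSED_TB_PER_DAY = (50, 60)   # "50-60TB compressed" per day


def compression_ratio(raw_tb: float, compressed_tb: float) -> float:
    """Return how many times smaller the compressed data is than the raw data."""
    return raw_tb / compressed_tb


def yearly_volume_pb(tb_per_day: float, days: int = 365) -> float:
    """Return the volume accumulated over `days` days, expressed in PB."""
    return tb_per_day * days / 1000


if __name__ == "__main__":
    for compressed in COMPRESSED_TB_PER_DAY:
        ratio = compression_ratio(RAW_TB_PER_DAY, compressed)
        per_year = yearly_volume_pb(compressed)
        print(f"{compressed} TB/day compressed: "
              f"ratio {ratio:.1f}x, {per_year:.2f} PB/year")
```

Under these assumptions the compressed stream is roughly 8 to 10 times smaller than the raw one. The sketch does not account for storage replication or retention policy, which the slide does not state.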
  
A global retailer's customers benefit from more personalized communications and offers based on interactions across all channels.
Ask Bigger Questions: How can we offer customers the best experience?

Cloudera optimizes the DW for improved ROI
Global retailer's customers benefit from more personalized communications based on interactions across all channels.
The Challenge:
• Need to correlate online/offline data across disparate, costly legacy DWs
• Takes up to 4 weeks to get data from one group, which inhibits productivity
The Solution:
• Cloudera Enterprise with Impala: 1PB over 250 nodes
• Consolidated platform for Big Data with single environment for query and machine learning

Any Questions, Big or Small?

Achieving Business Value by Fusing Hadoop and Corporate Data
 
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
 

Ask Bigger Questions with Cloudera and Apache Hadoop - Big Data Day Paris 2013

  • 1. Ask Bigger Questions with Cloudera and Apache Hadoop. Graham Gear, graham@cloudera.com, June 2013
  • 2. Data Has Changed in the Last 30 Years
      [Chart: data growth, 1980 to 2012. Labels: end-user applications, the internet, mobile devices, sophisticated machines. Structured data: 10%; unstructured data: 90%.]
  • 3. Data Management Strategies Have Stayed the Same
      • Raw data on SAN, NAS and tape
      • Data moved from storage to compute
      • Relational models with predesigned schemas
  • 4-7. Too Much Data, Too Many Sources
      • Can’t ingest fast enough
      • Costs too much to store
      • Exists in different places
      • Archived data is lost
  • 8-11. Can’t Use It The Way You Want To
      • Analysis and processing takes too long
      • Data exists in silos
      • Can’t ask new questions
      • Can’t analyze unstructured data
  • 12. Transform The Way You Think About Data. Cloudera
  • 13. Ask Bigger Questions
      • When customer x visits my store what can I recommend based on their recent web behavior across our various brand websites?
      • What is the best location in North America to efficiently produce both tomato plants and corn?
      • What does every fraudulent activity in the last 2 years have in common that will help us identify and proactively prevent the next incident?
      • Are hotel room sales at Christmas slow because of inventory or competitive pricing?
      • What did customer x view on their last website visit?
      • What makes tomato plants more fruitful than others?
      • What incidents of fraud did we detect last year?
      • What search terms are used most often when looking for hotels in NYC?
  • 14. The Cloudera Approach
      Meet enterprise demands with a new way to think about data.
      The old way: multiple platforms for multiple workloads. Complex, fragmented, costly.
        • Data silos by department or LOB
        • Lots of data stored in expensive specialized systems
        • Analysts pull select data into EDW
        • No one has a complete view
      The Cloudera way: single data platform to support BI, Reporting & App Serving. Simplified, unified, efficient.
        • Bulk of data stored on scalable low cost platform
        • Perform end-to-end workflows
        • Specialized systems reserved for specialized workloads
        • Provides data access across departments or LOB
  • 15. Cloudera Enterprise: The Platform for Big Data
      A Revolutionary Solution Built on Apache Hadoop
      Stages: Ingest, Store, Explore, Process, Analyze, Serve
      Components: CDH, Cloudera Manager, Cloudera Navigator, Cloudera Support
        • Brings storage & compute together
        • Works with every type of data
        • Changes the economics of data management
  • 16. Cloudera Enterprise
      Includes Advanced System Management & Support for the Core CDH Projects
      CDH (100% open source Hadoop distribution)
        • Core projects: HDFS, MapReduce, Flume, HCatalog, Hive, Hue, Mahout, Oozie, Pig, Sqoop, Whirr, ZooKeeper
        • Premium projects: HBase, Impala, Search (beta)
        • Connectors: MicroStrategy, Netezza, Oracle, QlikView, Tableau, Teradata
      Cloudera Manager (end-to-end system management)
        • Deployment, monitoring, API, SNMP, config rollbacks, phone home, service mgmt, diagnostics, rolling upgrades, LDAP, reporting, backup/DR
      Cloudera Navigator (end-to-end data management)
        • Access mgmt, data audit
      Cloudera Support
        • Best-in-class technical support, community advocacy & indemnification
  • 17. RTD Subscription: Includes Support & Indemnity for Apache HBase
  • 18. RTQ Subscription: Includes Support & Indemnity for Cloudera Impala
  • 19. RTS Subscription: Includes Support & Indemnity for Cloudera Search
  • 20. BDR Subscription: Includes Centralized Management For Disaster Recovery Workflows
  • 21. Navigator Subscription: Enables Cloudera Navigator for Automated Data Management
  • 23. A multinational bank saves millions by optimizing DW for analytics & reducing data storage costs by 99%. Ask Bigger Questions: How can we optimize our data warehouse investment?
  • 24. Cloudera optimizes the EDW, saves millions. The Challenge: • Teradata EDW at capacity: ETL processes consume 7 days; takes 5 weeks to make historical data available for analysis • Performance issues in business-critical apps; little room for discovery, analytics, ROI from opportunities. Multinational bank saves millions by optimizing existing DW for analytics & reducing data storage costs by 99%. The Solution: • Cloudera Enterprise offloads data storage, processing & some analytics from EDW • Teradata can focus on operational functions & analytics
  • 25. A semiconductor manufacturer uses predictive analytics to take preventative action on chips likely to fail. Ask Bigger Questions: Which semiconductor chips will fail?
  • 26. Cloudera enables better predictions. The Challenge: • Want to capture more granular and historical data for more accurate predictive yield modeling • Storing 9 months’ data on Oracle is expensive. Semiconductor manufacturer can prevent chip failure with more accurate predictive yield models. The Solution: • Dell | Cloudera solution for Apache Hadoop • 53 nodes; plan to store up to 10 years (~10PB) • Capturing & processing data from each phase of manufacturing process. CONFIDENTIAL - RESTRICTED
  • 27. The quant risk LOB within a multinational bank saves millions through better risk exposure analysis & fraud prevention. Ask Bigger Questions: How can we prevent fraud?
  • 28. Cloudera delivers savings through fraud prevention. The Challenge: • Fraud detection is a cumbersome, multi-step analytic process requiring data sampling • 2B transactions/month necessitate constant revisions to risk profiles • Highly tuned 100TB Teradata DW drives over-budget capital reserves & lower investment returns. Quant risk LOB in multinational bank saves millions through better risk exposure analysis & fraud prevention. The Solution: • Cloudera Enterprise data factory for fraud prevention, credit & operational risk analysis • Look at every incidence of fraud for 5 years for each person • Reduced costs; expensive CPU no longer consumed by data processing
  • 29. BlackBerry eliminates data sampling & simplifies data processing for better, more comprehensive analysis. Ask Bigger Questions: How do we retain customers in a competitive market?
  • 30. Cloudera delivers ROI through storage alone. The Challenge: • BlackBerry Services generates 0.5PB (50-60TB compressed) of data per day • RDBMS is expensive; analytics limited to a 1% data sample. BlackBerry can analyze all of its data, rather than relying on a 1% sample, for better network capacity trending & management. The Solution: • Cloudera Enterprise manages global data set of ~100PB • Collecting device content, machine-generated log data, audit details • 90% ETL code base reduction
  • 31. A global retailer’s customers benefit from more personalized communications and offers based on interactions across all channels. Ask Bigger Questions: How can we offer customers the best experience?
  • 32. Cloudera optimizes the DW for improved ROI. The Challenge: • Need to correlate online/offline data across disparate, costly legacy DWs • Takes up to 4 weeks to get data from one group, which inhibits productivity. Global retailer’s customers benefit from more personalized communications based on interactions across all channels. The Solution: • Cloudera Enterprise with Impala: 1PB over 250 nodes • Consolidated platform for Big Data with single environment for query and machine learning
  • 33. Any Questions, Big or Small?