SlideShare a Scribd company logo
1 of 4
Download to read offline
CUSTOMER CASE STUDY




                                        Experian: Transforming the
                                        Marketing Landscape with Cloudera


                                        Company Overview
KEY HIGHLIGHTS                          With 15,000+ employees and annual revenues exceeding $4 billion (USD), Experian is a
                                        global leader in credit reporting and marketing services. The company is comprised of four
INDUSTRY                                main business units: Credit Information Services, Decision Analytics, Business Information
Digital Marketing                       Services, and Marketing Services.

LOCATION                                Experian Marketing Services (EMS) helps marketers connect with customers through
Schaumburg, IL, USA                     relevant communications across a variety of channels, driven by advanced analytics on
                                        an extensive database of geographic, demographic, and lifestyle data.
BUSINESS APPLICATIONS SUPPORTED
> 	Matching engine for CCIR engine,
                                        Business Challenges Before Cloudera
  facilitating a holistic and current
  view of consumers                     EMS has built its business on the effective collection, analysis and use of data. As
                                        Jeff Hassemer, VP of product strategy for EMS explained, “Experian has handled large
IMPACT                                  amounts of data for a very long time: who consumers are, how they’re connected, how
> 	50x performance gains                they interact. We’ve done this over billions and quadrillions of records over time. But
> 	Processing 500% more                 with the proliferation of channels and information that are now flowing into client
  matches per day                       organizations — social media likes, web interactions, email responses — that data has
> 	Deployment in < 6 months             gotten so large that it’s maxed the capacity of older systems. We needed to leap forward
                                        in our processing ability. We wanted to process data orders of magnitude faster so we
                                        could react to tomorrow’s consumer.”

                                        In the past, it was normal to send customer database updates to clients once monthly
                                        for campaign adjustments, allowing Experian to process large volumes of data through
                                        a number of diverse platforms, mostly mainframe based. “We weren’t required to provide
                                        data in real time. We weren’t required to provide the level of volume in terms of the
                                        growth rates we’ve seen from our storage and our data. It’s been a total paradigm shift
                                        that compelled us to look at other solutions,” explained Emad Georgy, CTO for Experian
                                        Marketing Services.

                                        Today’s consumers leave a digital trail of behaviors and preferences for marketers to
                                        leverage so they can enhance the customer experience. Experian’s clients have started
                                        asking for more frequent updates on consumers’ latest purchasing behaviors, online
                                        browsing patterns and social media activity so they can respond in real time. “We serve
                                        many of the top retail companies in the world, and they’re increasingly looking for a single,
                                        integrated view of their customer,” noted Georgy. “If a customer is walking into a store
                                        in Burlington, MA, is that same customer now liking the company on Facebook? Are they
                                        tweeting? We’re looking for an integrated view of who that person is so we can determine
                                        how to message them in the right way.”
CUSTOMER CASE STUDY | 2




                                          But the data exhaust from these digital channels is massive and requires a technological
KEY HIGHLIGHTS                            infrastructure that can accommodate rapid processing, large-scale storage, and flexible
                                          analysis of multi-structured data. Experian’s mainframes were hitting the tipping point
TECHNOLOGIES IN USE                       in terms of performance, flexibility and scalability. Given the need for immediacy of
> 	Hadoop Platform: Cloudera Enterprise   information and customization of data in real time for clients, EMS set an internal goal
> 	Hadoop Components: HBase, Hive,        to process more than 100 million records of data per hour. That translates to 28,000
  Hue, MapReduce, Pig                     records per second.
> 	Servers: HP DL380
> 	Data Warehouse: IBM DB2                “Instead of trying to fit a square peg in a round hole, we went out and decided to look for
> 	Data Marts: Oracle, SQL Server,        new architectures that can handle the new volumes of data that we manage,” said Joe
  Sybase IQ                               McCullough, IT business analyst at EMS. The team identified about 30 criteria for the new
                                          platform, ranging from depth and breadth of offering to support capabilities to price to
BIG DATA SCALE                            unique distribution features. They prioritized two criteria above the rest:
> 	5B rows of data, growing tenfold
> 	Processing 100M records per hour       • Both batch and real-time data processing capabilities
> 	35 CDH nodes today
                                          • Scalability to accommodate large and growing data volumes
ADVICE TO NEW HADOOP USERS                “We compared Hadoop as well as HBase to a number of other options in the industry,”
> Involve expert architects early on
                                          said Georgy. “The North America Experian Marketing Services group has organically
>  ave patience  get trained on
  H
                                          led the evaluation of NoSQL technologies within Experian.” Hadoop and HBase quickly
  Hadoop/HBase before building
                                          surfaced as a natural fit for Experian’s needs. EMS engineers downloaded raw Apache
  applications
                                          Hadoop, but quickly saw the gaps that could be filled by a commercial distribution.
                                          EMS critiqued several distributions and “found that, by and far, Cloudera was in the lead.
                                          We went with Cloudera for a number of reasons, primarily being the strength of the
                                          distribution and the features that CDH gives us.”

                                          EMS’ enterprise-level Hadoop needs, such as meeting client SLAs and having 24x7
                                          reliability, led the organization to invest in Cloudera Enterprise, which is comprised of
                                          three things: Cloudera’s open source Hadoop stack (CDH), a powerful management
                                          toolkit (Cloudera Manager), and expert technical support.


                                          Use Case
                                          A few months of exploring Hadoop translated into a production version of Experian’s
                                          Cross-Channel Identity Resolution (CCIR) engine: a linkage engine that is used to keep
                                          a persistent repository of client touch points. CCIR runs on HBase, resolving needs for
                                          persistency, redundancy, and the ability to automatically redistribute data. HBase offers
                                          a shared architecture that is distributed, fault tolerant, and optimized for storage. And
                                          most importantly, HBase enables both batch and real-time data processing.

                                          Experian feeds data into the CDH-powered CCIR engine using custom extract, transform,
                                          load (ETL) scripts from in-house mainframes and relational databases including IBM DB2,
                                          Oracle, SQL Server, and Sybase IQ. EMS’ HBase system currently spans five billion rows
                                          of data, “and we expect that number to grow tenfold in the near future,” said Paul Perry,
                                          EMS’ director of software. Experian also uses Hive and Pig, primarily for QA and
                                          development purposes. EMS currently has 35 Hadoop nodes across its production
                                          and development clusters.
CUSTOMER CASE STUDY | 3




                                         Impact: Operational Efficiency
 eploying Cloudera allows us to
 D
                                         Hadoop is delivering operational efficiency to Experian by accelerating processing
 process orders of magnitude more
                                         performance by 50x. And the cost of their new infrastructure is only a fraction of the
 information through our systems,
                                         legacy environment. Georgy said, “We’ve been very happy with the implementation of
 and that technological capability in
                                         Hadoop. We’re less than six months in and are already closing the gap on our 100 million
 combination with Experian’s expertise
                                         record per hour goal.” In comparison, EMS used to process 50 million matches per day.
 in bringing together data assets is
 driving new, real insights into         That translates to a 500% improvement.
 tomorrow’s marketing environments.
                                         Further, Cloudera Enterprise allows Experian to get maximum operational efficiency
JEFF HASSEMER, VP PRODUCT STRATEGY,      out of their Hadoop clusters. “Cloudera Enterprise gives us an easy way to manage
EXPERIAN MARKETING SERVICES              multiple clusters, and because the use cases for clients vary so much, we have to do a lot
                                         of tweaking on the platform to get the performance we need. The ability to store different
                                         configuration settings and actually version those settings is huge for us,” said Perry.

                                         McCullough added, “Not only has Cloudera Manager simplified our process, but it’s
                                         made it possible at all. Without a Linux background, I would not have been able to deploy
                                         Hadoop across a cluster and configure it and have anything up and running in nearly
                                         the timeframe that we had.”

                                         Cloudera Manager delivers the following operational benefits to Experian:

                                         • Monitors services running on cluster

                                         • Reports when servers are unhealthy, services have stopped, and/or nodes are bad

                                         • Automates distribution across the cluster easily

                                         • Monitors CPU usage across various applications and data storage availability

                                         • Provides a single portal to see into all cluster details

                                         Perry summarized the project by saying, “We haven’t done anything this cool for a long
                                         time. Our developers are excited about it. I’m excited about it. Senior management is
                                         excited about it. And the experience we’ve had with Cloudera — I don’t think it could be
                                         better. It’s been great working with those guys and the architects that they’ve given us
                                         access to are extremely knowledgeable. The only regret I have is that we didn’t bring
                                         them in sooner.”


                                         Impact: Driving Competitive Advantage
                                         “Our Hadoop infrastructure has become a real transformational change. Deploying
                                         Cloudera allows us to process orders of magnitude more information through our
                                         systems, and that technological capability in combination with Experian’s expertise in
                                         bringing together data assets is driving new, real insights into tomorrow’s marketing
                                         environments,” stated Hassemer. “Nobody is doing what we’re doing with Hadoop today,
                                         especially at this order of magnitude. Ours is the first data management platform of its
                                         kind that accepts data, links information together across an entire marketing ecosystem,
                                         and puts it into a usable format for a solid customer experience.”
CUSTOMER CASE STUDY | 4




                                                         The performance gains offered by Cloudera give Experian new insights and more flexible
 ne of the key enablers that investing in
 O                                                       ways of understanding consumers. EMS no longer relies on a postal address to identify
 Hadoop with Cloudera will empower is                    a consumer; they can now match social media IDs, email addresses, web cookies, phone
 allowing our clients to become obsessed                 numbers and more. With the broader match set that Hadoop enables, EMS’ clients have
 with creating the perfect customer                      a more accurate, current view of who their customers are across multiple channels so
 experience. This tool allows us to                      they can have better, more informed interactions.
 operate at the speed of business, at the
 speed where those interactions occur, so                “A consumer might give their email and sign up for a program with one particular brand,”
 our clients can meet their customers with               explained Hassemer. “The brand doesn’t know much about the consumer at that point,
 the right message at the right time.                   thus information sent to the consumer is very bland, not very enticing. The consumer’s
                                                         time is wasted because the emails they receive aren’t relevant. By offering a holistic
JEFF HASSEMER, VP PRODUCT STRATEGY,
                                                         view of the customer in real time, we can increase the relevance of offers that go to the
EXPERIAN MARKETING SERVICES
                                                         consumer so they say, ‘This is why I signed up.’ And they’re going to interact with that
                                                         brand a lot more. One of the key enablers that investing in Hadoop with Cloudera will
                                                         empower is allowing our clients to become obsessed with creating the perfect customer
                                                         experience. This tool allows us to operate at the speed of business, at the speed where
                                                         those interactions occur, so our clients can meet their customers with the right
                                                         message at the right time.”




                           Cloudera, Inc. 220 Portage Avenue, Palo Alto, CA 94306 USA | 1-888-789-1488 or 1-650-362-0488 | cloudera.com
                           ©2012 Cloudera, Inc. All rights reserved. Cloudera and the Cloudera logo are trademarks or registered trademarks of Cloudera Inc. in the USA and other countries.
                           All other trademarks are the property of their respective companies. Information is subject to change without notice.

More Related Content

What's hot

The Evolution of Platforms - Drew Kurth and Matt Comstock
The Evolution of Platforms - Drew Kurth and Matt ComstockThe Evolution of Platforms - Drew Kurth and Matt Comstock
The Evolution of Platforms - Drew Kurth and Matt Comstock
Razorfish
 
Workbooks Lyonsdown casestudy
Workbooks Lyonsdown casestudyWorkbooks Lyonsdown casestudy
Workbooks Lyonsdown casestudy
Paul Tompsett
 
2013 Oracle Publishing Media Kit Final
2013 Oracle Publishing Media Kit Final2013 Oracle Publishing Media Kit Final
2013 Oracle Publishing Media Kit Final
ShaunMehr
 
Module 2 Orchestration and Interaction Final
Module 2 Orchestration and Interaction FinalModule 2 Orchestration and Interaction Final
Module 2 Orchestration and Interaction Final
Vivastream
 
Disney Parks: Creating Guest Magic Through Analytics
Disney Parks: Creating Guest Magic Through AnalyticsDisney Parks: Creating Guest Magic Through Analytics
Disney Parks: Creating Guest Magic Through Analytics
Juan Gorricho
 

What's hot (19)

Getting Past the Hype about Customer Data Platforms - David Raab
Getting Past the Hype about Customer Data Platforms - David RaabGetting Past the Hype about Customer Data Platforms - David Raab
Getting Past the Hype about Customer Data Platforms - David Raab
 
SQL Server Data Mining - Taking your Application Design to the Next Level
SQL Server Data Mining - Taking your Application Design to the Next LevelSQL Server Data Mining - Taking your Application Design to the Next Level
SQL Server Data Mining - Taking your Application Design to the Next Level
 
Customer prosys
Customer prosysCustomer prosys
Customer prosys
 
ASI Financials 09 Brochure
ASI Financials 09 BrochureASI Financials 09 Brochure
ASI Financials 09 Brochure
 
Microsoft Data Mining 2012
Microsoft Data Mining 2012Microsoft Data Mining 2012
Microsoft Data Mining 2012
 
IBM Watson Explorer for inbound call centers
IBM Watson Explorer for inbound call centersIBM Watson Explorer for inbound call centers
IBM Watson Explorer for inbound call centers
 
Article Services Guide from Reprints Desk
Article Services Guide from Reprints DeskArticle Services Guide from Reprints Desk
Article Services Guide from Reprints Desk
 
The Evolution of Platforms - Drew Kurth and Matt Comstock
The Evolution of Platforms - Drew Kurth and Matt ComstockThe Evolution of Platforms - Drew Kurth and Matt Comstock
The Evolution of Platforms - Drew Kurth and Matt Comstock
 
Workbooks Lyonsdown casestudy
Workbooks Lyonsdown casestudyWorkbooks Lyonsdown casestudy
Workbooks Lyonsdown casestudy
 
SQL Server: Data Mining
SQL Server: Data MiningSQL Server: Data Mining
SQL Server: Data Mining
 
The Big Picture: Big Data for the New Wave of Analytics
The Big Picture: Big Data for the New Wave of AnalyticsThe Big Picture: Big Data for the New Wave of Analytics
The Big Picture: Big Data for the New Wave of Analytics
 
2013 Oracle Publishing Media Kit Final
2013 Oracle Publishing Media Kit Final2013 Oracle Publishing Media Kit Final
2013 Oracle Publishing Media Kit Final
 
Big Data Marketing in the AWS Cloud: Improving Cross-Media Effectiveness - We...
Big Data Marketing in the AWS Cloud: Improving Cross-Media Effectiveness - We...Big Data Marketing in the AWS Cloud: Improving Cross-Media Effectiveness - We...
Big Data Marketing in the AWS Cloud: Improving Cross-Media Effectiveness - We...
 
Module 2 Orchestration and Interaction Final
Module 2 Orchestration and Interaction FinalModule 2 Orchestration and Interaction Final
Module 2 Orchestration and Interaction Final
 
Oracle Publishing Group Media Kit 2013
Oracle Publishing Group Media Kit 2013Oracle Publishing Group Media Kit 2013
Oracle Publishing Group Media Kit 2013
 
Disney Parks: Creating Guest Magic Through Analytics
Disney Parks: Creating Guest Magic Through AnalyticsDisney Parks: Creating Guest Magic Through Analytics
Disney Parks: Creating Guest Magic Through Analytics
 
Big Data Analytics in a Heterogeneous World - Joydeep Das of Sybase
Big Data Analytics in a Heterogeneous World - Joydeep Das of SybaseBig Data Analytics in a Heterogeneous World - Joydeep Das of Sybase
Big Data Analytics in a Heterogeneous World - Joydeep Das of Sybase
 
Daum Communications Case Study
Daum Communications Case StudyDaum Communications Case Study
Daum Communications Case Study
 
Envision IT Seminar Presentation - Microsoft Office 365
Envision IT Seminar Presentation - Microsoft Office 365 Envision IT Seminar Presentation - Microsoft Office 365
Envision IT Seminar Presentation - Microsoft Office 365
 

Viewers also liked

1 o que realmente importa
1   o que realmente importa1   o que realmente importa
1 o que realmente importa
Edmila Coelho
 
NPM.2011-12.AnnualReport
NPM.2011-12.AnnualReportNPM.2011-12.AnnualReport
NPM.2011-12.AnnualReport
Rishi Moudgil
 
Guthrie Munion - Profile (2015)
Guthrie Munion - Profile (2015)Guthrie Munion - Profile (2015)
Guthrie Munion - Profile (2015)
Guthrie Munion
 
4. Exclusividade 2.0 - Diego Simon - VivaReal - Distrito Federal - Brasília
4. Exclusividade 2.0 - Diego Simon - VivaReal - Distrito Federal - Brasília4. Exclusividade 2.0 - Diego Simon - VivaReal - Distrito Federal - Brasília
4. Exclusividade 2.0 - Diego Simon - VivaReal - Distrito Federal - Brasília
Guru do Corretor
 
Apresentação futelazer 2013 5
Apresentação futelazer 2013 5Apresentação futelazer 2013 5
Apresentação futelazer 2013 5
Rose Paola
 
Guia básico como-pequenas-empresas-podem-fazer-grandes-negócios-com-o-facebook
Guia básico como-pequenas-empresas-podem-fazer-grandes-negócios-com-o-facebookGuia básico como-pequenas-empresas-podem-fazer-grandes-negócios-com-o-facebook
Guia básico como-pequenas-empresas-podem-fazer-grandes-negócios-com-o-facebook
Guilherme Nascimento
 

Viewers also liked (20)

MRI Presentation
MRI PresentationMRI Presentation
MRI Presentation
 
1 o que realmente importa
1   o que realmente importa1   o que realmente importa
1 o que realmente importa
 
Presentation1
Presentation1Presentation1
Presentation1
 
uMobi: Concept presentation
uMobi: Concept presentationuMobi: Concept presentation
uMobi: Concept presentation
 
4.Mercado Imobiliário na Internet - Exclusividade 2 - Diego Simon - VivaReal ...
4.Mercado Imobiliário na Internet - Exclusividade 2 - Diego Simon - VivaReal ...4.Mercado Imobiliário na Internet - Exclusividade 2 - Diego Simon - VivaReal ...
4.Mercado Imobiliário na Internet - Exclusividade 2 - Diego Simon - VivaReal ...
 
NPM.2011-12.AnnualReport
NPM.2011-12.AnnualReportNPM.2011-12.AnnualReport
NPM.2011-12.AnnualReport
 
How To Write Compelling Subject Lines
How To Write Compelling Subject LinesHow To Write Compelling Subject Lines
How To Write Compelling Subject Lines
 
Question 6 Media Eval
Question 6 Media EvalQuestion 6 Media Eval
Question 6 Media Eval
 
Guthrie Munion - Profile (2015)
Guthrie Munion - Profile (2015)Guthrie Munion - Profile (2015)
Guthrie Munion - Profile (2015)
 
Paper Aeroplane Pitch
Paper Aeroplane PitchPaper Aeroplane Pitch
Paper Aeroplane Pitch
 
4. Exclusividade 2.0 - Diego Simon - VivaReal - Distrito Federal - Brasília
4. Exclusividade 2.0 - Diego Simon - VivaReal - Distrito Federal - Brasília4. Exclusividade 2.0 - Diego Simon - VivaReal - Distrito Federal - Brasília
4. Exclusividade 2.0 - Diego Simon - VivaReal - Distrito Federal - Brasília
 
3sixty email design by tammi st. angelo
3sixty email design by tammi st. angelo3sixty email design by tammi st. angelo
3sixty email design by tammi st. angelo
 
Smart income
Smart incomeSmart income
Smart income
 
Apresentação futelazer 2013 5
Apresentação futelazer 2013 5Apresentação futelazer 2013 5
Apresentação futelazer 2013 5
 
Incentivo fiscal icms_apresentacao7
Incentivo fiscal icms_apresentacao7Incentivo fiscal icms_apresentacao7
Incentivo fiscal icms_apresentacao7
 
Muzik juara
Muzik juaraMuzik juara
Muzik juara
 
Design for Your Subscribers
Design for Your SubscribersDesign for Your Subscribers
Design for Your Subscribers
 
Re-engagement Strategies Author Todd Wilson
Re-engagement Strategies Author Todd Wilson  Re-engagement Strategies Author Todd Wilson
Re-engagement Strategies Author Todd Wilson
 
Guia básico como-pequenas-empresas-podem-fazer-grandes-negócios-com-o-facebook
Guia básico como-pequenas-empresas-podem-fazer-grandes-negócios-com-o-facebookGuia básico como-pequenas-empresas-podem-fazer-grandes-negócios-com-o-facebook
Guia básico como-pequenas-empresas-podem-fazer-grandes-negócios-com-o-facebook
 
Pengembangan softskill-mahasiswa
Pengembangan softskill-mahasiswaPengembangan softskill-mahasiswa
Pengembangan softskill-mahasiswa
 

Similar to Cloudera case study_experian

mapr_case_study_experian
mapr_case_study_experianmapr_case_study_experian
mapr_case_study_experian
Erni Susanti
 
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRobertsWP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
Jane Roberts
 
Keyrus US Information
Keyrus US InformationKeyrus US Information
Keyrus US Information
Julian Tong
 

Similar to Cloudera case study_experian (20)

mapr_case_study_experian
mapr_case_study_experianmapr_case_study_experian
mapr_case_study_experian
 
IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data
 
See who is using MemSQL
See who is using MemSQLSee who is using MemSQL
See who is using MemSQL
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8
 
Hadoop in the Cloud
Hadoop in the CloudHadoop in the Cloud
Hadoop in the Cloud
 
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRobertsWP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
 
DX2000 from NEC lets you put big data to work
DX2000 from NEC lets you put big data to workDX2000 from NEC lets you put big data to work
DX2000 from NEC lets you put big data to work
 
Hooduku - Cloud and Big data practice
Hooduku - Cloud and Big data practiceHooduku - Cloud and Big data practice
Hooduku - Cloud and Big data practice
 
Building a Data Lake on AWS
Building a Data Lake on AWSBuilding a Data Lake on AWS
Building a Data Lake on AWS
 
Big data journey to the cloud maz chaudhri 5.30.18
Big data journey to the cloud   maz chaudhri 5.30.18Big data journey to the cloud   maz chaudhri 5.30.18
Big data journey to the cloud maz chaudhri 5.30.18
 
Hadoop and-cisco-ucs
Hadoop and-cisco-ucsHadoop and-cisco-ucs
Hadoop and-cisco-ucs
 
Accelerating Data Warehouse Modernization
Accelerating Data Warehouse ModernizationAccelerating Data Warehouse Modernization
Accelerating Data Warehouse Modernization
 
Keyrus US Information
Keyrus US InformationKeyrus US Information
Keyrus US Information
 
Keyrus US Information
Keyrus US InformationKeyrus US Information
Keyrus US Information
 
How Analytics Optimize Migration to Amazon Web Services, Microsoft Azure and ...
How Analytics Optimize Migration to Amazon Web Services, Microsoft Azure and ...How Analytics Optimize Migration to Amazon Web Services, Microsoft Azure and ...
How Analytics Optimize Migration to Amazon Web Services, Microsoft Azure and ...
 
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLTBig Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
 
Big data solutions on cloud – the way forward
Big data solutions on cloud – the way forwardBig data solutions on cloud – the way forward
Big data solutions on cloud – the way forward
 
FEALTY TECHNOLOGIES Portfolio - LATEST.pptx
FEALTY TECHNOLOGIES Portfolio - LATEST.pptxFEALTY TECHNOLOGIES Portfolio - LATEST.pptx
FEALTY TECHNOLOGIES Portfolio - LATEST.pptx
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
 
Hadoop Demo eConvergence
Hadoop Demo eConvergenceHadoop Demo eConvergence
Hadoop Demo eConvergence
 

Cloudera case study_experian

  • 1. CUSTOMER CASE STUDY Experian: Transforming the Marketing Landscape with Cloudera Company Overview KEY HIGHLIGHTS With 15,000+ employees and annual revenues exceeding $4 billion (USD), Experian is a global leader in credit reporting and marketing services. The company is comprised of four INDUSTRY main business units: Credit Information Services, Decision Analytics, Business Information Digital Marketing Services, and Marketing Services. LOCATION Experian Marketing Services (EMS) helps marketers connect with customers through Schaumburg, IL, USA relevant communications across a variety of channels, driven by advanced analytics on an extensive database of geographic, demographic, and lifestyle data. BUSINESS APPLICATIONS SUPPORTED > Matching engine for CCIR engine, Business Challenges Before Cloudera facilitating a holistic and current view of consumers EMS has built its business on the effective collection, analysis and use of data. As Jeff Hassemer, VP of product strategy for EMS explained, “Experian has handled large IMPACT amounts of data for a very long time: who consumers are, how they’re connected, how > 50x performance gains they interact. We’ve done this over billions and quadrillions of records over time. But > Processing 500% more with the proliferation of channels and information that are now flowing into client matches per day organizations — social media likes, web interactions, email responses — that data has > Deployment in < 6 months gotten so large that it’s maxed the capacity of older systems. We needed to leap forward in our processing ability. We wanted to process data orders of magnitude faster so we could react to tomorrow’s consumer.” In the past, it was normal to send customer database updates to clients once monthly for campaign adjustments, allowing Experian to process large volumes of data through a number of diverse platforms, mostly mainframe based. “We weren’t required to provide data in real time. We weren’t required to provide the level of volume in terms of the growth rates we’ve seen from our storage and our data. It’s been a total paradigm shift that compelled us to look at other solutions,” explained Emad Georgy, CTO for Experian Marketing Services. Today’s consumers leave a digital trail of behaviors and preferences for marketers to leverage so they can enhance the customer experience. Experian’s clients have started asking for more frequent updates on consumers’ latest purchasing behaviors, online browsing patterns and social media activity so they can respond in real time. “We serve many of the top retail companies in the world, and they’re increasingly looking for a single, integrated view of their customer,” noted Georgy. “If a customer is walking into a store in Burlington, MA, is that same customer now liking the company on Facebook? Are they tweeting? We’re looking for an integrated view of who that person is so we can determine how to message them in the right way.”
  • 2. CUSTOMER CASE STUDY | 2 But the data exhaust from these digital channels is massive and requires a technological KEY HIGHLIGHTS infrastructure that can accommodate rapid processing, large-scale storage, and flexible analysis of multi-structured data. Experian’s mainframes were hitting the tipping point TECHNOLOGIES IN USE in terms of performance, flexibility and scalability. Given the need for immediacy of > Hadoop Platform: Cloudera Enterprise information and customization of data in real time for clients, EMS set an internal goal > Hadoop Components: HBase, Hive, to process more than 100 million records of data per hour. That translates to 28,000 Hue, MapReduce, Pig records per second. > Servers: HP DL380 > Data Warehouse: IBM DB2 “Instead of trying to fit a square peg in a round hole, we went out and decided to look for > Data Marts: Oracle, SQL Server, new architectures that can handle the new volumes of data that we manage,” said Joe Sybase IQ McCullough, IT business analyst at EMS. The team identified about 30 criteria for the new platform, ranging from depth and breadth of offering to support capabilities to price to BIG DATA SCALE unique distribution features. They prioritized two criteria above the rest: > 5B rows of data, growing tenfold > Processing 100M records per hour • Both batch and real-time data processing capabilities > 35 CDH nodes today • Scalability to accommodate large and growing data volumes ADVICE TO NEW HADOOP USERS “We compared Hadoop as well as HBase to a number of other options in the industry,” > Involve expert architects early on said Georgy. “The North America Experian Marketing Services group has organically > ave patience get trained on H led the evaluation of NoSQL technologies within Experian.” Hadoop and HBase quickly Hadoop/HBase before building surfaced as a natural fit for Experian’s needs. EMS engineers downloaded raw Apache applications Hadoop, but quickly saw the gaps that could be filled by a commercial distribution. EMS critiqued several distributions and “found that, by and far, Cloudera was in the lead. We went with Cloudera for a number of reasons, primarily being the strength of the distribution and the features that CDH gives us.” EMS’ enterprise-level Hadoop needs, such as meeting client SLAs and having 24x7 reliability, led the organization to invest in Cloudera Enterprise, which is comprised of three things: Cloudera’s open source Hadoop stack (CDH), a powerful management toolkit (Cloudera Manager), and expert technical support. Use Case A few months of exploring Hadoop translated into a production version of Experian’s Cross-Channel Identity Resolution (CCIR) engine: a linkage engine that is used to keep a persistent repository of client touch points. CCIR runs on HBase, resolving needs for persistency, redundancy, and the ability to automatically redistribute data. HBase offers a shared architecture that is distributed, fault tolerant, and optimized for storage. And most importantly, HBase enables both batch and real-time data processing. Experian feeds data into the CDH-powered CCIR engine using custom extract, transform, load (ETL) scripts from in-house mainframes and relational databases including IBM DB2, Oracle, SQL Server, and Sybase IQ. EMS’ HBase system currently spans five billion rows of data, “and we expect that number to grow tenfold in the near future,” said Paul Perry, EMS’ director of software. Experian also uses Hive and Pig, primarily for QA and development purposes. EMS currently has 35 Hadoop nodes across its production and development clusters.
  • 3. CUSTOMER CASE STUDY | 3 Impact: Operational Efficiency eploying Cloudera allows us to D Hadoop is delivering operational efficiency to Experian by accelerating processing process orders of magnitude more performance by 50x. And the cost of their new infrastructure is only a fraction of the information through our systems, legacy environment. Georgy said, “We’ve been very happy with the implementation of and that technological capability in Hadoop. We’re less than six months in and are already closing the gap on our 100 million combination with Experian’s expertise record per hour goal.” In comparison, EMS used to process 50 million matches per day. in bringing together data assets is driving new, real insights into That translates to a 500% improvement. tomorrow’s marketing environments. Further, Cloudera Enterprise allows Experian to get maximum operational efficiency JEFF HASSEMER, VP PRODUCT STRATEGY, out of their Hadoop clusters. “Cloudera Enterprise gives us an easy way to manage EXPERIAN MARKETING SERVICES multiple clusters, and because the use cases for clients vary so much, we have to do a lot of tweaking on the platform to get the performance we need. The ability to store different configuration settings and actually version those settings is huge for us,” said Perry. McCullough added, “Not only has Cloudera Manager simplified our process, but it’s made it possible at all. Without a Linux background, I would not have been able to deploy Hadoop across a cluster and configure it and have anything up and running in nearly the timeframe that we had.” Cloudera Manager delivers the following operational benefits to Experian: • Monitors services running on cluster • Reports when servers are unhealthy, services have stopped, and/or nodes are bad • Automates distribution across the cluster easily • Monitors CPU usage across various applications and data storage availability • Provides a single portal to see into all cluster details Perry summarized the project by saying, “We haven’t done anything this cool for a long time. Our developers are excited about it. I’m excited about it. Senior management is excited about it. And the experience we’ve had with Cloudera — I don’t think it could be better. It’s been great working with those guys and the architects that they’ve given us access to are extremely knowledgeable. The only regret I have is that we didn’t bring them in sooner.” Impact: Driving Competitive Advantage “Our Hadoop infrastructure has become a real transformational change. Deploying Cloudera allows us to process orders of magnitude more information through our systems, and that technological capability in combination with Experian’s expertise in bringing together data assets is driving new, real insights into tomorrow’s marketing environments,” stated Hassemer. “Nobody is doing what we’re doing with Hadoop today, especially at this order of magnitude. Ours is the first data management platform of its kind that accepts data, links information together across an entire marketing ecosystem, and puts it into a usable format for a solid customer experience.”
  • 4. CUSTOMER CASE STUDY | 4 The performance gains offered by Cloudera give Experian new insights and more flexible ne of the key enablers that investing in O ways of understanding consumers. EMS no longer relies on a postal address to identify Hadoop with Cloudera will empower is a consumer; they can now match social media IDs, email addresses, web cookies, phone allowing our clients to become obsessed numbers and more. With the broader match set that Hadoop enables, EMS’ clients have with creating the perfect customer a more accurate, current view of who their customers are across multiple channels so experience. This tool allows us to they can have better, more informed interactions. operate at the speed of business, at the speed where those interactions occur, so “A consumer might give their email and sign up for a program with one particular brand,” our clients can meet their customers with explained Hassemer. “The brand doesn’t know much about the consumer at that point, the right message at the right time. thus information sent to the consumer is very bland, not very enticing. The consumer’s time is wasted because the emails they receive aren’t relevant. By offering a holistic JEFF HASSEMER, VP PRODUCT STRATEGY, view of the customer in real time, we can increase the relevance of offers that go to the EXPERIAN MARKETING SERVICES consumer so they say, ‘This is why I signed up.’ And they’re going to interact with that brand a lot more. One of the key enablers that investing in Hadoop with Cloudera will empower is allowing our clients to become obsessed with creating the perfect customer experience. This tool allows us to operate at the speed of business, at the speed where those interactions occur, so our clients can meet their customers with the right message at the right time.” Cloudera, Inc. 220 Portage Avenue, Palo Alto, CA 94306 USA | 1-888-789-1488 or 1-650-362-0488 | cloudera.com ©2012 Cloudera, Inc. All rights reserved. Cloudera and the Cloudera logo are trademarks or registered trademarks of Cloudera Inc. in the USA and other countries. All other trademarks are the property of their respective companies. Information is subject to change without notice.