Razorfish: Use of EMR for Marketing Segmentation




© 2009 Razorfish. All rights reserved.
Agenda
• Who we are.
• Razorfish, ATLAS, Microsoft
• ATLAS: What is it? Problems
• AWS – EMR – Why move?
• EMR Solution Outline
• Benefits gained, Opportunities




© 2012 Razorfish. All rights reserved.
Who we are
– Razorfish London is a full-service digital agency.
– Founded in London in 1996
– We are now 250 people strong and experts at creative, design, social
  media, digital media, analytics, technology, service operations and
  user experience.
– We are part of one of the world's largest interactive agency networks
  with more than 2,800 people.
– According to LinkedIn, Razorfish is the 31st most desirable employer in
  the world (even beating Starbucks).
– For the last three years we’ve been the only agency recognised by
  Forrester Research as a ‘leader’ in both the Media & Interactive
  Marketing and Experience Design & Technology categories.
– We are Adobe’s ‘Digital Marketing Global Partner of the Year, 2012’
– We are No. 4 in the last Ad Age ‘Agency A-List’ - the highest ranked
  digital agency.




RF – Atlas - Microsoft
• Razorfish developed the ATLAS ad serving engine
• Atlas was separated from Razorfish, but the two kept a
symbiotic relationship
• Google bought DoubleClick
• Microsoft bought the aQuantive group
• Microsoft incorporated Atlas into MS Advertising
and Publishing
• Microsoft sold Razorfish to Publicis Groupe
• RF continues to have a strong relationship with
Atlas, but has gone on to develop Razorfish Edge and
Insight On Demand (IoD), which use Atlas data
extensively.

Atlas
• Razorfish developed the ATLAS ad serving
engine
• Single cookie & Atlas tags
• 90% of browsers
• Clickstream analysis of current and historical
log file data. Users are placed into
buckets (segmented)
• Segmentation is used to serve ads and cross-sell

Problem
45 Terabytes of raw clickstream (log) data


Business logic and metrics against loosely structured data

   • ROI
   • Custom ROI based on complex, client-specific business rules
   • Rich Media and Analytics


Custom user profiling


Custom analysis of web surfing activity


Targeting


Problem
• Giant datasets
• Building infrastructure requires large,
continuous investment
• Building for peak/holiday traffic
• Data mining apps / physical DBs at or
near limit
• Client expectations/data volumes
increasing

Previously (2009)
• Custom Distributed Log Processing Engine
  • Sorted by cookie_id, then by time
  • Need to segment granularly across a larger number of segments (customer or prospect)
• SQL
  • 60 SQL Server boxes
  • Shared resources (contention issues)
  • In a DR configuration
• OLAP
  • In-house, constrained

By the end of 2009 (Christmas holiday season), RF needed $500k to keep up with data
processing needs.
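
The "sorted by cookie_id, then by time" ordering mentioned above can be illustrated with a minimal Python sketch. The record layout (cookie_id, timestamp, url) and the sample values are assumptions made for illustration; the deck does not describe the actual log format or engine.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical log records: (cookie_id, unix_timestamp, url)
records = [
    ("c2", 1700000300, "/games"),
    ("c1", 1700000200, "/sports"),
    ("c1", 1700000100, "/home"),
    ("c2", 1700000050, "/tv"),
]


def history_by_cookie(rows):
    """Sort by cookie_id, then by time, and yield each cookie's ordered URL history."""
    ordered = sorted(rows, key=itemgetter(0, 1))
    # groupby only merges adjacent rows, which is why the sort must come first
    for cookie_id, group in groupby(ordered, key=itemgetter(0)):
        yield cookie_id, [url for _, _, url in group]


history = dict(history_by_cookie(records))
print(history)
```

Once each cookie's events are contiguous and in time order, per-user segmentation logic can run over one cookie at a time.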




AWS + EMR
•   Efficient: Elastic infrastructure from AWS allows capacity to be provisioned as
    needed based on load, reducing cost and the risk of processing delays.
•   Configuration: Amazon Elastic MapReduce and Cascading let Razorfish focus on
    application development without having to worry about time-consuming set-up,
    management, or tuning of clusters or the compute capacity upon which they sit.
•   Ease of integration: Amazon Elastic MapReduce with Cascading allows data
    processing in the cloud without any changes to the underlying algorithms.
•   Flexible: EMR with Cascading is flexible enough to allow “agile” implementation
    and unit testing of sophisticated algorithms.
•   Adaptable: Cascading simplifies the integration of Hadoop with external ad
    systems.
•   Scalable: AWS infrastructure helps Razorfish reliably store and process huge
    (petabyte-scale) data sets.




AWS + EMR


AWS
• S3 storage: 45 TB of log data

EMR
• Measurement of customer value
• Measurement of customer affinity
• Joining 2.8 billion transactions against known site categorization information
• The join is unbalanced, so there is a hit to the reducers

Segmentation
• Actionable
• Rules flexible / customizable
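
Why an unbalanced join hits the reducers can be seen in a small Python sketch. It assumes plain hash partitioning of join keys across reducers; the site names, record counts, and reducer count are invented for illustration and do not come from the deck.

```python
import zlib
from collections import Counter


def reducer_loads(keys, num_reducers):
    """Count how many records each reducer receives under hash partitioning."""
    loads = Counter()
    for key in keys:
        # crc32 is stable across runs, unlike the built-in hash() on strings
        loads[zlib.crc32(key.encode("utf-8")) % num_reducers] += 1
    return [loads.get(i, 0) for i in range(num_reducers)]


# Hypothetical join keys: one very popular site plus a long tail of small ones
keys = ["popular-site.example"] * 9000 + [f"site-{i}.example" for i in range(1000)]
loads = reducer_loads(keys, num_reducers=10)
print(loads, "max/mean =", max(loads) / (sum(loads) / len(loads)))
```

Every record for the popular key lands on the same reducer, so that reducer finishes long after the others and sets the running time of the whole join.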




We import a lot of Atlas Data
[Diagram: 24 servers → Cloud Storage]
Upload 200+ GB of data per day
( ½ trillion ICA records )
We filter out the relevant cookies
[Diagram: Cloud Storage → Elastic MapReduce]
A 100-machine cluster is created on demand. We filter for only the
transactions that we need to process (more than 3.5 billion).
( about 71 million unique cookies a day )
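
The filtering step can be sketched as a map-style pass over raw log lines. This is a minimal Python illustration only: the tab-separated layout, the field names, and the "relevant advertiser" criterion are assumptions, since the deck does not say how relevance is decided.

```python
# Hypothetical set of advertiser IDs whose transactions we need to process
RELEVANT_ADVERTISERS = {"adv-42"}


def filter_transactions(lines, relevant=RELEVANT_ADVERTISERS):
    """Yield (cookie_id, advertiser_id, url) for relevant advertisers, skipping malformed lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # malformed record: ignore rather than fail the whole job
        cookie_id, advertiser_id, url = fields
        if advertiser_id in relevant:
            yield cookie_id, advertiser_id, url


raw = ["c1\tadv-42\t/tv\n", "c2\tadv-7\t/news\n", "broken line\n", "c3\tadv-42\t/games\n"]
kept = list(filter_transactions(raw))
unique_cookies = {cookie_id for cookie_id, _, _ in kept}
print(kept, unique_cookies)
```

Because each line is judged independently, this kind of filter parallelizes across as many machines as the cluster provides.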
Filter by behavior
[Diagram: Filtered Transactions → SKU Table]
Generate list of products that have been seen
( Match these cookies to 100,000s of SKUs )
Match to their affinity
[Diagram: Filtered Transactions + 70 million placements → Sport Enthusiast]
Join transactions to site genre information
Determine profile information by the types of sites the user has visited
( Cookies are matched to 3.5 billion ICA records )
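
The join of transactions to site genre information can be sketched in Python as a lookup against a small categorization table followed by a per-cookie count. The site names, the genre labels, and the min_visits threshold are assumptions for illustration; the deck does not give the real rules.

```python
from collections import Counter, defaultdict

# Hypothetical site categorization (the "site genre information" side of the join)
SITE_GENRE = {"espn.example": "Sport", "ign.example": "Gaming", "bbc.example": "News"}


def affinity_by_cookie(transactions, site_genre=SITE_GENRE, min_visits=2):
    """Label each cookie with its most visited genre, if visited at least min_visits times."""
    counts = defaultdict(Counter)
    for cookie_id, site in transactions:
        genre = site_genre.get(site)  # inner join: uncategorized sites are dropped
        if genre is not None:
            counts[cookie_id][genre] += 1
    result = {}
    for cookie_id, genres in counts.items():
        genre, visits = genres.most_common(1)[0]
        if visits >= min_visits:
            result[cookie_id] = f"{genre} Enthusiast"
    return result


transactions = [
    ("c1", "espn.example"), ("c1", "espn.example"), ("c1", "ign.example"),
    ("c2", "bbc.example"), ("c2", "unknown.example"),
]
affinities = affinity_by_cookie(transactions)
print(affinities)
```

Here the categorization table is small enough to hold in memory; at the scale described in the deck the same join would run as a distributed job.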
…and run custom business rules
[Diagram: Filtered Transactions + SKU Table → In-market Gamer]
Join site behavior to product info
Determine the types of products the user is interested in from what they have done on the site
( super-computing power determines some key categories )
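
One way to picture custom business rules is as data applied to the result of the SKU join. The Python sketch below is an assumption-heavy illustration: the Rule fields, the SKU table contents, and the view threshold are all hypothetical and are not taken from the deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    segment: str    # label assigned when the rule fires
    category: str   # product category the rule looks at
    min_views: int  # how many viewed SKUs in that category are required


# Hypothetical SKU table: SKU -> product category
SKU_TABLE = {"sku-100": "Video Games", "sku-101": "Video Games", "sku-200": "Home Theater"}
# Hypothetical client-specific rule set
RULES = [Rule("In-market Gamer", "Video Games", min_views=2)]


def apply_rules(viewed_skus, sku_table=SKU_TABLE, rules=RULES):
    """Return the segments whose category view threshold is met."""
    views = {}
    for sku in viewed_skus:
        category = sku_table.get(sku)
        if category is not None:
            views[category] = views.get(category, 0) + 1
    return [rule.segment for rule in rules if views.get(rule.category, 0) >= rule.min_views]


print(apply_rules(["sku-100", "sku-101", "sku-200"]))
```

Keeping rules as data rather than code is one way to make them flexible and customizable per client without touching the processing job.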
We bring it all together
[Diagram: category (In-market Gamer) + affinity (Sport Enthusiast) + generation (Purchaser Home Theater)]
( 1 of N “Personalization” segments )
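
Bringing the three signals together amounts to a join on the cookie. The Python sketch below assumes each earlier step produced a dict keyed by cookie_id and that a cookie needs all three signals to receive a combined segment; both are assumptions for illustration.

```python
def combine_segments(category, affinity, purchase):
    """Join three per-cookie dicts on cookie_id, keeping cookies present in all of them."""
    cookies = category.keys() & affinity.keys() & purchase.keys()
    return {c: (category[c], affinity[c], purchase[c]) for c in sorted(cookies)}


# Hypothetical outputs of the earlier steps
category = {"c1": "In-market Gamer", "c2": "In-market Gamer"}
affinity = {"c1": "Sport Enthusiast"}
purchase = {"c1": "Purchaser Home Theater", "c3": "Purchaser Home Theater"}

segments = combine_segments(category, affinity, purchase)
print(segments)
```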
Drive a personalized message
[Diagram: "User recently purchased a home theater system and is now looking for sports games" → Target Ad]
( 1.7 million per day )
Each and every day




This all happens in about 8 hours every day




                  ( not bad )
AWS + EMR
– Perfect clarity of cost
– No upfront infrastructure investment
– No client processing contention
– We couldn’t have done it.
– Without EMR/Hadoop the process took 3 days and relied
  heavily on manual processes. Now it takes 5-8 hours.
– Elasticity to complete a job faster if it’s worth the cost.
– We can meet our SLAs
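
As a rough check on the figures above, the improvement from 3 days to 5-8 hours can be expressed as a speedup factor. The sketch assumes "3 days" means 72 wall-clock hours, which the deck does not state explicitly.

```python
def speedup(before_hours, after_hours):
    """Ratio of the old running time to the new one."""
    return before_hours / after_hours


before = 3 * 24  # assumption: "3 days" taken as 72 wall-clock hours
slowest, fastest = speedup(before, 8), speedup(before, 5)
print(f"between {slowest:.1f}x and {fastest:.1f}x faster")
```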




Expanding Data Landscape
• EMR allows us to deal with the ever expanding number of
  channels and user interactions with sites and data:
• Clickstream data available from tools like Atlas and
  DoubleClick, which have cookied over 90% of the Internet
• Digital experience tracked through tools like Omniture,
  Webtrends and Google Analytics
• Other channel data across touchpoints (email, call center,
  mobile)
• Client Data
• Transactional data
• Survey-based (Nielsen’s)
• Social data available through open APIs (hosepipes)


Thank you



     •Mandhir Gidda





Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
Things you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceThings you didn't know you can use in your Salesforce
Things you didn't know you can use in your Salesforce
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptx
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 

Use of EMR for Marketing Segmentation

  • 1. Razorfish: Use of EMR for Marketing Segmentation © 2009 Razorfish. All rights reserved.
  • 2. Agenda • Who we are • Razorfish, ATLAS, Microsoft • ATLAS: what is it? Problems • AWS – EMR: why move? • EMR solution outline • Benefits gained, opportunities
  • 3. Who we are – Razorfish London is a full-service digital agency. – Founded in London in 1996. – We are now 250 people strong and experts at creative, design, social media, digital media, analytics, technology, service operations and user experience. – We are part of one of the world's largest interactive agency networks, with more than 2,800 people. – According to LinkedIn, Razorfish is the 31st most desirable employer in the world (even beating Starbucks). – For the last three years we’ve been the only agency recognised by Forrester Research as a ‘leader’ in both the Media & Interactive Marketing and Experience Design & Technology categories. – We are Adobe’s ‘Digital Marketing Global Partner of the Year, 2012’. – We are No. 4 in the last Ad Age ‘Agency A-List’, the highest-ranked digital agency.
  • 4. RF – Atlas – Microsoft • Razorfish developed the ATLAS ad serving engine • Atlas was separated from Razorfish, but the two kept a symbiotic relationship • Google bought DoubleClick • Microsoft bought aQuantive Group • Microsoft incorporated Atlas into MS Advertising and Publishing • Microsoft sold Razorfish to Publicis Groupe • RF continue to have a strong relationship with Atlas, but have gone on to develop Razorfish Edge and Insight On Demand (IoD), which use Atlas data extensively.
  • 5. Atlas • Razorfish developed the ATLAS ad serving engine • Single cookie & Atlas tags • 90% of browsers • Clickstream analysis of current and historical log file data; users are placed into buckets (segmented) • Segmentation is used to serve ads and cross-sell
  • 6. Problem • 45 terabytes of raw clickstream (log) data • Business logic and metrics against loosely structured data • ROI • Custom ROI based on complex, client-specific business rules • Rich media and analytics • Custom user profiling • Custom analysis of web surfing activity • Targeting
  • 7. Problem • Giant datasets • Building infrastructure requires large, continuous investment • Building for peak/holiday traffic • Data mining apps / physical DBs at or near their limit • Client expectations and data volumes increasing
  • 8. Previously 2009 • Custom distributed log processing engine: sorted by cookie_id and time; need to segment granularly across a larger number of segments (Cust || Prospect) • SQL: 60 SQL Server boxes; shared resources (contention issues); in a DR configuration • OLAP: in-house, constrained • By the end of 2009 (Christmas holiday season), RF needed $500k to keep up with data processing needs.
  • 9. AWS + EMR • Efficient: Elastic infrastructure from AWS allows capacity to be provisioned as needed based on load, reducing cost and the risk of processing delays. • Configuration: Amazon Elastic MapReduce and Cascading let Razorfish focus on application development without having to worry about time-consuming set-up, management, or tuning of clusters or the compute capacity upon which they sit. • Ease of integration: Amazon Elastic MapReduce with Cascading allows data processing in the cloud without any changes to the underlying algorithms. • Flexible: EMR with Cascading is flexible enough to allow “agile” implementation and unit testing of sophisticated algorithms. • Adaptable: Cascading simplifies the integration of Hadoop with external ad systems. • Scalable: AWS infrastructure helps Razorfish reliably store and process huge (petabyte-scale) data sets.
  • 10. AWS + EMR • AWS • EMR • Segmentation • S3 storage: 45 TB of logs • Measurement of customer value • Actionable data • Measurement of customer affinity • Rules flexible / customizable • Joining 2.8 billion transactions against known site categorization information • Unbalanced, so there is a hit to the reducers
  • 11. We import a lot of Atlas data • 24 servers upload to cloud storage • 200+ GB of data per day (½ trillion ICA records)
  • 12. We filter out the relevant cookies • Cloud storage • Elastic MapReduce • 100-machine cluster, created on demand • We filter for only the transactions that we need to process (more than 3.5 billion; about 71 million unique cookies a day)
  • 13. Filter by behavior • Filtered transactions • SKU table • Generate a list of products that have been seen (match these cookies to 100,000s of SKUs)
  • 14. Match to their affinity • Join transactions to site genre information • Sport Enthusiast • Filtered transactions • 70 million placements • Determine profile information by the types of sites the user has visited (cookies are matched to 3.5 billion ICA records)
  • 15. …and run custom business rules • Join site behavior to product info • SKU table • In market Gamer • Filtered transactions • Determine the types of products the user is interested in from what they have done on the site (super-computing power determines some key categories)
  • 16. We bring it all together • Category affinity generation • In market Gamer + Sport Enthusiast + Purchaser Home Theater (1 of N “Personalization” segments)
  • 17. Drive a personalized message • User recently purchased a home theater system and is now looking for sports games • Target ad (1.7 million per day)
  • 18. Each and every day • This all happens in about 8 hours every day (not bad)
  • 19. AWS + EMR – Perfect clarity of cost – No upfront infrastructure investment – No client processing contention – We couldn’t have done it otherwise – Without EMR/Hadoop the process takes 3 days and relies heavily on manual processes; now it takes 5–8 hrs – Elasticity to complete a job faster if it’s worth the cost – We can meet our SLAs
  • 20. Expanding Data Landscape • EMR allows us to deal with the ever-expanding number of channels and user interactions with sites and data: • Clickstream data available from tools like Atlas and DoubleClick, who have cookied over 90% of the Internet • Digital experience tracked through tools like Omniture, Webtrends and Google Analytics • Other channel data across touchpoints (email, call center, mobile) • Client data • Transactional data • Survey-based data (Nielsen’s) • Social data available through open APIs (hosepipes)
  • 21. Thank you • Mandhir Gidda
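
The flow walked through in slides 12 to 16 (filter to the relevant cookies, match sites to a genre, apply product rules against a SKU table, then combine the labels per cookie) can be sketched in a few lines of Python. This is only an in-memory illustration, not the Razorfish implementation, which the slides say ran on Elastic MapReduce with Cascading. Every field name, site name, SKU and the view/purchase rule below is a hypothetical stand-in; only the segment labels come from the slides.

```python
from collections import defaultdict

# Hypothetical lookup tables; real Atlas data and schemas are not shown in the deck.
SITE_GENRE = {"sports.example": "Sport Enthusiast"}
SKU_TABLE = {"GAME-001": "Gamer", "HT-100": "Home Theater"}
TRACKED_COOKIES = {"c1", "c2"}

# Hypothetical clickstream records (one dict per transaction).
SAMPLE_TRANSACTIONS = [
    {"cookie": "c1", "site": "sports.example", "sku": "GAME-001", "action": "view"},
    {"cookie": "c1", "site": "shop.example", "sku": "HT-100", "action": "purchase"},
    {"cookie": "c2", "site": "news.example", "sku": None, "action": "view"},
    {"cookie": "c3", "site": "sports.example", "sku": "GAME-001", "action": "view"},
]


def filter_relevant(transactions, cookies):
    """Slide 12: keep only transactions for the cookies we need to process."""
    return [t for t in transactions if t["cookie"] in cookies]


def affinity_segments(transactions, site_genre):
    """Slide 14: join transactions to site genre information."""
    segments = defaultdict(set)
    for t in transactions:
        genre = site_genre.get(t["site"])
        if genre is not None:
            segments[t["cookie"]].add(genre)
    return segments


def product_segments(transactions, sku_table):
    """Slides 13 and 15: join behavior to the SKU table and apply a simple rule.

    Assumed rule: a purchase yields "Purchaser <category>",
    any other action on a known SKU yields "In market <category>".
    """
    segments = defaultdict(set)
    for t in transactions:
        category = sku_table.get(t.get("sku"))
        if category is None:
            continue
        prefix = "Purchaser" if t["action"] == "purchase" else "In market"
        segments[t["cookie"]].add(f"{prefix} {category}")
    return segments


def combine(*segment_maps):
    """Slide 16: bring the labels together into one sorted list per cookie."""
    combined = defaultdict(set)
    for segment_map in segment_maps:
        for cookie, labels in segment_map.items():
            combined[cookie] |= labels
    return {cookie: sorted(labels) for cookie, labels in combined.items()}


def build_segments(transactions, cookies, site_genre, sku_table):
    relevant = filter_relevant(transactions, cookies)
    return combine(
        affinity_segments(relevant, site_genre),
        product_segments(relevant, sku_table),
    )


segments = build_segments(SAMPLE_TRANSACTIONS, TRACKED_COOKIES, SITE_GENRE, SKU_TABLE)
for cookie, labels in sorted(segments.items()):
    print(cookie, " + ".join(labels))
```

At production scale each of these functions would correspond to a filter or join step in a distributed job rather than a Python loop; the sketch only shows how the per-cookie labels accumulate.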