Data mining for
causal inference
AMIT SHARMA
Postdoctoral Researcher, Microsoft Research
(Joint work with JAKE HOFMAN and DUNCAN
WATTS, Microsoft Research)
http://www.amitsharma.in
@amt_shrma
My research
Analyzing the effect of online systems
◦ Recommender systems [WWW ’13, EC ’15, CSCW ’15]
◦ Social news feeds [CSCW ’16]
◦ Web search
Methodological
◦ Threats to large-scale observational studies [WWW ’16b]
◦ Mining for natural experiments [EC ’15]
◦ New identification strategies suited for fine-grained data
◦ Testing assumptions for validity of an instrumental variable
◦ Gaps between prediction and understanding [WWW ’16a, ICWSM ’16]
How much do they
change user behavior?
Naively, up to 30% of traffic
comes from recommendations
“Burton Snowboard, a sports retailer, reported
that personalized product recommendations
have driven nearly 25% of total sales since it
began offering them in 2008. Prior to this,
Burton’s customer recommendations consisted
of items from its list of top-selling products.”
Example: product browsing on
Amazon.com
Counterfactual browsing: no
recommendations
Problem: Correlated demand may
drive page visits, even without
recommendations
The problem of correlated
demand
[Diagram: Demand for winter accessories; Visits to winter hat; Rec. visits to winter gloves]
Goal: Estimate the causal
effect
[Figure: observed click-throughs (causal + convenience) compared with click-throughs without the recommender (convenience + ?)]
Ideal experiment: A/B Test
Treatment (A) Control (B)
But, experiments:
may be costly
hamper user experience
require full access to the system
Using natural variations to
simulate an experiment
Studying sudden spikes,
“shocks” to demand for a book
[Carmi et al. 2012]
The same author’s recommended
book may also have a shock
Past work
Uses statistical models to control for confounds
Carmi et al. [2012], Oestreicher and Sundararajan [2012] and Lin [2013]
construct “complementary sets” of similar, non-recommended
products.
Garfinkel et al. [2006] and Broder et al. [2015] compare to
model-predicted clicks without recommendations.
But,
1. These assumptions are hard to verify.
2. Finding examples of valid shocks requires ingenuity
and restricts researchers to very specific categories
This talk: Using data mining for
natural experiments
I. Data-driven instrumental variables
“Shock-IV” method: Mining for sudden spikes (“shocks”) in data
II. General data-driven identification strategy for
time series data
“Split-door” criterion: Generalizing the idea of shocks
Throughout, we will use Amazon’s recommendation system as an
example.
I. Shock-IV: Mining
for valid natural
experiments
Distinguishing between
recommendation and direct traffic
All visits to a product
◦ Recommender visits
◦ Direct visits (proxy for unobserved demand)
  ◦ Search visits
  ◦ Direct browsing
The Shock-IV strategy:
Searching for valid shocks
The Shock-IV strategy: Filtering
out invalid shocks
Why does it work? Shock as an
instrumental variable
[Diagram: Sudden shock; Demand; Focal visits (X); Rec. visits (Y); Direct visits (Y)]
Computing the causal estimate
$\rho = \dfrac{\text{increase in recommendation clicks}}{\text{increase in visits to focal product}}$
where $\rho$ is the causal CTR.
*Same as the Wald estimator for instrumental variables
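The ratio above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `causal_ctr`, its arguments, and the example counts are all hypothetical, and it assumes "increase" means the count during the shock minus a pre-shock baseline.

```python
def causal_ctr(focal_before, focal_during, rec_before, rec_during):
    """Wald-style ratio: extra recommendation clicks per extra focal visit.

    All four arguments are visit counts (hypothetical names):
    focal_* are visits to the focal product, rec_* are click-throughs
    from the focal product to the recommended product.
    """
    delta_x = focal_during - focal_before  # increase in visits to focal product
    delta_y = rec_during - rec_before      # increase in recommendation clicks
    if delta_x <= 0:
        raise ValueError("focal visits did not increase, so there is no shock")
    return delta_y / delta_x


# Made-up counts: 200 extra focal visits produce 10 extra click-throughs.
rho = causal_ctr(focal_before=20, focal_during=220, rec_before=4, rec_during=14)
print(rho)
```

The estimate is only meaningful for shocks that pass the validity checks described on the following slides.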
Application to Amazon.com,
using Bing toolbar logs
Sept 2013-May 2014
Recreating sequence of page visits by a user
Timestamp | URL | User action
2014-01-20 09:04:10 | http://www.amazon.com/s/ref=nb_sb_noss_1?field-keywords=George%20saunders | User searches for George Saunders
2014-01-20 09:04:15 | http://www.amazon.com/dp/0812984250/ref=sr_1_1 | User clicks on the first search result
2014-01-20 09:05:01 | http://www.amazon.com/dp/1573225797/ref=pd_sim_b_2 | User clicks on the second recommendation
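A rough sketch of how such log lines might be labeled, using only the three URL shapes shown on the slide. The rule set is an assumption generalized from those examples (`/s/` is a search page, `ref=sr_...` a search-result click, `ref=pd_sim_...` a recommendation click); the function name and the label strings are hypothetical and real logs contain many more `ref` codes.

```python
import re
from urllib.parse import urlparse

ASIN_RE = re.compile(r"/dp/([A-Z0-9]{10})")  # product id in /dp/<id>/ paths
REF_RE = re.compile(r"ref=([A-Za-z0-9_]+)")  # referral tag, e.g. ref=sr_1_1


def classify_visit(url):
    """Return (visit_type, product_id) for one logged URL.

    The prefixes below are assumptions taken from the slide's examples only.
    """
    path = urlparse(url).path
    if path.startswith("/s/"):
        return ("search_page", None)
    asin_match = ASIN_RE.search(path)
    if asin_match is None:
        return ("other", None)
    asin = asin_match.group(1)
    ref_match = REF_RE.search(url)
    ref = ref_match.group(1) if ref_match else ""
    if ref.startswith("sr_"):
        return ("search_visit", asin)
    if ref.startswith("pd_sim_"):
        return ("recommendation_visit", asin)
    return ("direct_visit", asin)


log = [
    "http://www.amazon.com/s/ref=nb_sb_noss_1?field-keywords=George%20saunders",
    "http://www.amazon.com/dp/0812984250/ref=sr_1_1",
    "http://www.amazon.com/dp/1573225797/ref=pd_sim_b_2",
]
labels = [classify_visit(u) for u in log]
print(labels)
```

Counting `recommendation_visit` rows per product per day gives the recommendation click-throughs; the remaining product visits serve as the direct-traffic series.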
I. Weekly and seasonal patterns in
traffic, nearly tripling in holidays
II. 30% of all pageviews come
through recommendations
III. Books and eBooks are the
most popular categories by far
IV. Apparel and shoes see a
substantially higher fraction of
visits through recommendations
Shock-IV: Finding shocks in
user visit data
We look for focal products with large and sudden
increases in views relative to typical traffic.
A shock must exceed:
◦ 5 times the median traffic
◦ 5 times the previous day's traffic and 5 times the mean of the
last 7 days
Shocked product has:
◦ Visits from at least 10 unique users during the shock
◦ Non-zero visits for at least five out of seven days before and after
the shock
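The criteria above could be checked per product roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: the median is taken over the whole series, "before and after" is read as at least five active days on each side, and the function name, parameter names, and sample data are hypothetical.

```python
from statistics import mean, median


def find_shocks(visits, users, factor=5, min_users=10, window=7, min_active=5):
    """Return indices of days that satisfy the shock criteria.

    visits[i]: visits to the focal product on day i
    users[i]:  unique visitors to the focal product on day i
    """
    overall_median = median(visits)  # assumption: median over the full series
    shocks = []
    for day in range(window, len(visits) - window):
        before = visits[day - window:day]
        after = visits[day + 1:day + 1 + window]
        v = visits[day]
        big_enough = (
            v > factor * overall_median
            and v > factor * visits[day - 1]
            and v > factor * mean(before)
        )
        enough_users = users[day] >= min_users
        # Assumption: require activity on both sides of the shock separately.
        active = (
            sum(1 for x in before if x > 0) >= min_active
            and sum(1 for x in after if x > 0) >= min_active
        )
        if big_enough and enough_users and active:
            shocks.append(day)
    return shocks


# Made-up daily counts with one spike on day 7.
visits = [4, 5, 3, 4, 6, 5, 4, 60, 8, 6, 5, 4, 5, 3, 4]
users = [3, 4, 2, 3, 5, 4, 3, 40, 6, 5, 4, 3, 4, 2, 3]
print(find_shocks(visits, users))
```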
Shock-IV: Ensuring exclusion
restriction
Recommended product (Y) should have constant
direct visits during the time of the shock.
(1-β): Ratio of the maximum 14-day variation in visits to a
recommended product to the size of the shock for the focal
product.
β = 1: Direct traffic to Y is stable relative to the shock to the
focal product.
β = 0: Direct traffic to Y varies no less than the shock to the
focal product.
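A small sketch of the β filter, with several assumptions made explicit: "maximum 14-day variation" is taken as the max minus the min of direct visits to Y over the window, the shock size is supplied by the caller, β is clamped at 0 when direct traffic varies more than the shock, and a shock is kept when β reaches the chosen threshold. All names and numbers are hypothetical.

```python
def exclusion_beta(direct_visits_y, shock_size):
    """beta = 1 - (variation in direct visits to Y) / (size of the shock to X).

    direct_visits_y: daily direct visits to the recommended product over
    the 14-day window around the shock.
    """
    if shock_size <= 0:
        raise ValueError("shock_size must be positive")
    # Assumption: variation is measured as max minus min over the window.
    variation = max(direct_visits_y) - min(direct_visits_y)
    return max(0.0, 1.0 - variation / shock_size)


def accept_shock(direct_visits_y, shock_size, threshold=0.7):
    """Keep the shock only if direct traffic to Y is stable enough."""
    return exclusion_beta(direct_visits_y, shock_size) >= threshold


# Made-up windows: the first is stable, the second spikes along with the shock.
stable = [10, 12, 11, 9, 10, 13, 12, 11, 10, 9, 12, 11, 10, 12]
spiky = [10, 12, 11, 9, 10, 13, 45, 11, 10, 9, 12, 11, 10, 12]
print(accept_shock(stable, shock_size=40), accept_shock(spiky, shock_size=40))
```

A higher threshold keeps fewer shocks but gives more confidence that demand for Y did not move together with the shock.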
How to choose 𝛽?
[Figure labels: Accept, Reject]
Select 𝛽 = 0.7
Using the method, obtain
>4000 natural experiments!
Estimating the causal
clickthrough rate (𝜌)
Causal click-through rate by
product category
Estimating fraction of observed
click-throughs that are causal
Compare the number of estimated causal clicks to
all observed recommendation clicks (non-shock
period).
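The comparison can be written out as a one-line calculation. This sketch assumes the estimated causal clicks are the causal CTR multiplied by focal-product visits in the non-shock period; that reading, the function name, and the numbers are illustrative assumptions rather than results from the talk.

```python
def causal_fraction(rho, focal_visits, observed_rec_clicks):
    """Share of observed recommendation clicks attributed to the recommender.

    rho:                  causal click-through rate estimated from shocks
    focal_visits:         visits to the focal product in the non-shock period
    observed_rec_clicks:  observed recommendation click-throughs, same period
    """
    if observed_rec_clicks <= 0:
        raise ValueError("need at least one observed recommendation click")
    estimated_causal_clicks = rho * focal_visits  # assumption, see lead-in
    return estimated_causal_clicks / observed_rec_clicks


# Made-up example: 1000 focal visits, 120 observed click-throughs, rho = 0.03.
fraction = causal_fraction(rho=0.03, focal_visits=1000, observed_rec_clicks=120)
print(fraction)
```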
Only a quarter of the observed
click-throughs are causal
At β = 0.7, only 25% of
recommendation traffic is
caused by the recommender.
Generalization?
Shocks may be due to
discounts or sales
Lower CTR may be due to
the holiday season
Local average treatment effect
(LATE), not fully generalizable
Shocked products are not a representative sample of
all products, nor are the users who participate in them.
• Fortunately, the Shock-IV method covers roughly one-fifth of
all products with at least 10 visits on any single day.
• Causal estimates are consistent with experimental
findings (e.g., Belluf et al. [2012])
Summary: Shock-IV method
I. Mining for instruments allows us to study a much larger
sample of natural experiments.
II. Fine-grained data allowed us to test for exclusion
restriction directly.
A simple, scalable method for causal inference.
◦ Can be used for improving recommender systems through causal metrics.
◦ Can be applied to other domains, such as online ads.
◦ Can be used for finding potential instruments.
II. Generalizing Shock-IV:
“Split-door” criterion
Let’s have a look at the model
again
[Diagram: Sudden shock; Demand; Focal visits (X); Rec. visits (Y); Direct visits (Y)]
[Figure: traffic for the focal product and the recommended product; both examples labeled Accept]
The split-door criterion
Instead of searching for shocks,
Check whether direct traffic for Y is
independent of visits to X.
[Diagram: Demand; Focal visits (X); Rec. visits (Y); Direct visits (YD)]
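One way to sketch the check is a permutation test on the correlation between visits to X and direct visits to Y. This is a stand-in, not the test used in the talk: zero correlation is weaker than independence, shuffling ignores autocorrelation in time series, and the function names, threshold, and data are hypothetical.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def split_door_accepts(focal_visits, direct_visits_y, alpha=0.05,
                       n_perm=2000, seed=0):
    """Accept the (X, Y) pair if dependence between X and Y_D is not detected."""
    rng = random.Random(seed)
    observed = abs(pearson(focal_visits, direct_visits_y))
    shuffled = list(direct_visits_y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(focal_visits, shuffled)) >= observed:
            extreme += 1
    p_value = (extreme + 1) / (n_perm + 1)
    return p_value > alpha  # failing to reject is evidence, not proof


# Made-up series: direct visits to Y that track X should be rejected.
x = [10, 12, 9, 30, 11, 10, 45, 12, 11, 10, 50, 9]
y_tracks_x = [5, 6, 4, 15, 6, 5, 22, 6, 5, 5, 25, 4]
print(split_door_accepts(x, y_tracks_x))
```

Pairs that pass can then be used to estimate the causal click-through rate from the recommendation visits alone.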
More formal: Why does it
work?
[Diagram: Demand; Focal visits (X); Rec. visits (Y); Direct visits (YD)]
Two possibilities, both remove
the effect of common demand
[Two diagrams, each with nodes: Demand; Focal visits (X); Rec. visits (Y); Dir. visits (YD)]
Sidenote: Split-door criterion
generalizes Shock-IV
By capturing shocks, we were essentially capturing a
notion of independence between X and 𝑌𝐷.
Split-door will admit all valid shocks, as well as other
variations.
Applying to logs from Amazon
recommendations
Summary: A general
identification criterion
Split-door criterion admits a broader sample of
natural experiments than shocks.
Automatically tests for valid identification. Can be
used whenever 𝑌𝐷 is separable.
Applications: Evaluate the relationship between
any two time series, e.g. social media and news, ads
and search.
Conclusion
The majority of traffic from recommendations may not be
causal, but simply convenience.
Two data-driven methods:
• Shock-IV: An IV-based method for mining
exclusion-valid instruments from observational
data
• Split-door: A general identification strategy for
time series data.
More generally, data mining can
augment causal inference
methods
Conventional approach:
1. Hypothesize about a natural variation
2. Argue why it resembles a randomized experiment
3. Compute causal effect
Data mining approach:
1. Develop tests for validity of natural variation
2. Mine for such valid variations in observational data
3. Compute causal effect
Thank you!
AMIT SHARMA
MICROSOFT RESEARCH
@amt_shrma http://www.amitsharma.in
Sharma, A., Hofman, J. M., & Watts, D. J. (2015). Estimating the causal impact of
recommendation systems from observational data. In Proceedings of the Sixteenth ACM
Conference on Economics and Computation.
Data mining for causal inference: Effect of recommendations on Amazon.com

  • 1. Data mining for causal inference. Amit Sharma, Postdoctoral Researcher, Microsoft Research (joint work with Jake Hofman and Duncan Watts, Microsoft Research). http://www.amitsharma.in, @amt_shrma
  • 2. My research. Analyzing the effect of online systems: ◦ Recommender systems [WWW ’13, EC ’15, CSCW ’15] ◦ Social news feeds [CSCW ’16] ◦ Web search. Methodological: ◦ Threats to large-scale observational studies [WWW ’16b] ◦ Mining for natural experiments [EC ’15] ◦ New identification strategies suited for fine-grained data ◦ Testing assumptions for validity of an instrumental variable ◦ Gaps between prediction and understanding [WWW ’16a, ICWSM ’16]
  • 4. How much do they change user behavior?
  • 5. Naively, up to 30% of traffic comes from recommendations
  • 6. Naively, up to 30% of traffic comes from recommendations. “Burton Snowboard, a sports retailer, reported that personalized product recommendations have driven nearly 25% of total sales since it began offering them in 2008. Prior to this, Burton’s customer recommendations consisted of items from its list of top-selling products.”
  • 10. Example: product browsing on Amazon.com
  • 13. Problem: Correlated demand may drive page visits, even without recommendations
  • 14. The problem of correlated demand. [Diagram nodes: Demand for winter accessories; Visits to winter hat; Rec. visits to winter gloves]
  • 15. Goal: Estimate the causal effect. [Figure: observed click-throughs split into "Causal" and "Convenience"; without the recommender: "Convenience" and "?"]
  • 16. Ideal experiment: A/B test with Treatment (A) and Control (B). But experiments may be costly, hamper user experience, and require full access to the system.
  • 18. Using natural variations to simulate an experiment
  • 19. Studying sudden spikes, “shocks” to demand for a book [Carmi et al. 2012]
  • 20. The same author’s recommended book may also have a shock
  • 21. Past work: uses statistical models to control for confounds. Carmi et al. [2012], Oestreicher and Sundararajan [2012] and Lin [2013] construct “complementary sets” of similar, non-recommended products. Garfinkel et al. [2006] and Broder et al. [2015] compare to model-predicted clicks without recommendations. But: 1. These assumptions are hard to verify. 2. Finding examples of valid shocks requires ingenuity and restricts researchers to very specific categories.
  • 22. This talk: Using data mining for natural experiments. I. Data-driven instrumental variables. “Shock-IV” method: mining for sudden spikes (“shocks”) in data. II. General data-driven identification strategy for time series data. “Split-door” criterion: generalizing the idea of shocks. Throughout, we will use Amazon’s recommendation system as an example.
  • 23. I. Shock-IV: Mining for valid natural experiments
  • 24. Distinguishing between recommendation and direct traffic. [Diagram: All visits to a product; Recommender visits; Direct visits (Search visits, Direct browsing); Proxy for unobserved demand]
  • 25. The Shock-IV strategy: Searching for valid shocks
  • 26. The Shock-IV strategy: Filtering out invalid shocks
  • 28. Why does it work? Shock as an instrumental variable. [Causal diagram nodes: Demand; Sudden Shock; Focal visits (X); Rec. visits (Y); Direct visits (Y)]
  • 29. Computing the causal estimate: $\text{Causal CTR} = \frac{\text{Increase in recommendation clicks}}{\text{Increase in visits to focal product}}$ (same as the Wald estimator for instrumental variables)
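The ratio on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name, the argument names, and the choice to measure each "increase" as shock-period count minus pre-shock count are all assumptions made for the example.

```python
def causal_ctr(rec_clicks_before, rec_clicks_shock,
               focal_visits_before, focal_visits_shock):
    """Wald-style ratio: extra recommendation clicks per extra focal visit.

    Assumption (not from the slides): each "increase" is the shock-period
    count minus the pre-shock count for the same product pair.
    """
    delta_y = rec_clicks_shock - rec_clicks_before      # increase in rec. clicks
    delta_x = focal_visits_shock - focal_visits_before  # increase in focal visits
    if delta_x <= 0:
        raise ValueError("focal visits must increase during the shock")
    return delta_y / delta_x


# Hypothetical counts: 300 extra focal visits produce 15 extra rec. clicks.
print(causal_ctr(10, 25, 100, 400))
```

The guard on `delta_x` reflects that the ratio is only meaningful when the shock actually raises visits to the focal product.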
  • 30. Application to Amazon.com, using Bing toolbar logs, Sept 2013-May 2014
  • 31-33. Recreating sequence of page visits by a user
      Timestamp | URL | User action
      2014-01-20 09:04:10 | http://www.amazon.com/s/ref=nb_sb_noss_1?field-keywords=George%20saunders | User searches for George Saunders
      2014-01-20 09:04:15 | http://www.amazon.com/dp/0812984250/ref=sr_1_1 | User clicks on the first search result
      2014-01-20 09:05:01 | http://www.amazon.com/dp/1573225797/ref=pd_sim_b_2 | User clicks on the second recommendation
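One way to turn such URLs into visit types is to read the `ref=` tag in the path. The sketch below is an assumption-laden illustration: the prefix rules (`sr_` for search results, `pd_sim` for recommendations, `/s/` for a search query page) are inferred only from the three example URLs on the slide, and the function name and labels are hypothetical.

```python
import re
from urllib.parse import urlparse

PRODUCT_RE = re.compile(r"/dp/([A-Z0-9]{10})")  # 10-character product id
REF_RE = re.compile(r"/ref=([^/]+)")            # referral tag in the path


def classify_visit(url):
    """Return (product_id or None, visit_type) for one logged URL.

    The prefix-to-type mapping is an assumption based on the slide's examples.
    """
    path = urlparse(url).path
    product = PRODUCT_RE.search(path)
    ref = REF_RE.search(path)
    ref_tag = ref.group(1) if ref else ""
    if product is None:
        return None, ("search_query" if path.startswith("/s/") else "other")
    if ref_tag.startswith("pd_sim"):
        kind = "recommendation"
    elif ref_tag.startswith("sr_"):
        kind = "search"
    else:
        kind = "direct"
    return product.group(1), kind


print(classify_visit("http://www.amazon.com/dp/1573225797/ref=pd_sim_b_2"))
```

Visits labeled "search" or "direct" would both count toward direct traffic in the sense used earlier in the talk; only "recommendation" visits count as recommender traffic.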
  • 34. I. Weekly and seasonal patterns in traffic, nearly tripling in holidays
  • 35. II. 30% of all pageviews come through recommendations
  • 36. III. Books and eBooks are the most popular categories by far
  • 37. IV. Apparel and shoes see a substantially higher fraction of visits through recommendations
  • 38. Shock-IV: Finding shocks in user visit data. We look for focal products with large and sudden increases in views relative to typical traffic. Size of shock exceeds: ◦ 5 times median traffic ◦ 5 times the previous day's traffic and 5 times the mean of the last 7 days. Shocked product has: ◦ Visits from at least 10 unique users during the shock ◦ Non-zero visits for at least five out of seven days before and after the shock
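The shock criteria above can be sketched as a filter over one product's daily series. This is a rough illustration under stated assumptions: the function and parameter names are hypothetical, "median traffic" is taken over the whole series, and a shock is treated as a single day; the slides do not pin down either choice.

```python
from statistics import mean, median


def find_shocks(visits, unique_users, factor=5, min_users=10,
                window=7, min_active_days=5):
    """Return day indices that satisfy the slide's shock criteria.

    visits[t] and unique_users[t] are daily counts for one focal product.
    Assumption: the median is computed over the full series.
    """
    typical = median(visits)
    shocks = []
    for t in range(window, len(visits) - window):
        before = visits[t - window:t]
        after = visits[t + 1:t + 1 + window]
        big_enough = (
            visits[t] > factor * typical            # 5x median traffic
            and visits[t] > factor * visits[t - 1]  # 5x previous day
            and visits[t] > factor * mean(before)   # 5x mean of last 7 days
        )
        active = (
            sum(v > 0 for v in before) >= min_active_days
            and sum(v > 0 for v in after) >= min_active_days
        )
        if big_enough and active and unique_users[t] >= min_users:
            shocks.append(t)
    return shocks


# Hypothetical series: flat traffic of 10 visits/day with one spike to 100.
daily = [10] * 7 + [100] + [10] * 7
users = [5] * 7 + [40] + [5] * 7
print(find_shocks(daily, users))
```

Days closer than `window` to either end of the series are skipped, since the before/after activity check cannot be evaluated there.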
  • 39. Shock-IV: Ensuring exclusion restriction. Recommended product (Y) should have constant direct visits during the time of the shock. (1-β): Ratio of maximum 14-day variation in visits to a recommended product to the size of the shock for the focal product. β = 1: Direct traffic to Y is stable relative to the shock to the focal product. β = 0: Direct traffic to Y varies no less than the shock to the focal product.
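The β measure can be sketched directly from its definition. The example below is illustrative only: it assumes "maximum variation" means the range (max minus min) of direct visits to Y over the 14-day window and that β is clipped at 0; both are interpretations rather than details given on the slide, and the function names are hypothetical.

```python
def exclusion_beta(direct_visits_y, shock_size):
    """beta = 1 - (variation in direct visits to Y) / (size of shock to X).

    direct_visits_y: daily direct visits to the recommended product over
    the 14-day window around the shock.
    Assumptions: variation is max minus min, and beta is clipped at 0.
    """
    if shock_size <= 0:
        raise ValueError("shock_size must be positive")
    variation = max(direct_visits_y) - min(direct_visits_y)
    return max(0.0, 1.0 - variation / shock_size)


def passes_exclusion(direct_visits_y, shock_size, threshold=0.7):
    """Keep the product pair only if beta reaches the chosen threshold."""
    return exclusion_beta(direct_visits_y, shock_size) >= threshold


# Hypothetical window: direct visits to Y move by 3 while X's shock is 100.
window = [20, 22, 19, 21, 20, 20, 21, 22, 19, 20, 21, 20, 22, 21]
print(exclusion_beta(window, 100), passes_exclusion(window, 100))
```

A higher threshold admits fewer but cleaner natural experiments; the default of 0.7 mirrors the value selected in the talk.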
  • 40. How to choose β? [Figure: accept and reject regions] Select β = 0.7
  • 41. Using the method, we obtain more than 4,000 natural experiments!
  • 43. Causal click-through rate by product category
  • 45. Estimating fraction of observed click-throughs that are causal. Compare the number of estimated causal clicks to all observed recommendation clicks (non-shock period).
  • 46. Only a quarter of the observed click-throughs are causal. At β = 0.7, only 25% of recommendation traffic is caused by the recommender.
  • 47. Generalization? Shocks may be due to discounts or sales. Lower CTR may be due to the holiday season.
  • 48. Local average treatment effect (LATE), not fully generalizable. Shocked products are not a representative sample of all products, nor are the users who participate in them. • Fortunately, the Shock-IV method covers roughly one-fifth of all products with at least 10 visits on any single day. • Causal estimates are consistent with experimental findings (e.g., Belluf et al. [2012])
  • 49. Summary: Shock-IV method. I. Mining for instruments allows us to study a much larger sample of natural experiments. II. Fine-grained data allowed us to test for exclusion restriction directly. A simple, scalable method for causal inference. ◦ Can be used for improving recommender systems through causal metrics. ◦ Can be applied to other domains, such as online ads. ◦ Can be used for finding potential instruments.
  • 52. Let’s have a look at the model again. [Causal diagram nodes: Demand; Sudden Shock; Focal visits (X); Rec. visits (Y); Direct visits (Y)]
  • 54. [Figure: Focal Product and Recommended Product panels, each marked "Accept"]
  • 55. The split-door criterion. Instead of searching for shocks, check whether direct traffic for Y is independent of visits to X. [Causal diagram nodes: Demand; Focal visits (X); Rec. visits (Y); Direct visits (YD)]
  • 56. More formal: Why does it work? [Causal diagram nodes: Demand; Focal visits (X); Rec. visits (Y); Direct visits (YD)]
  • 57. Two possibilities, both remove the effect of common demand. [Two causal diagrams over the same nodes: Demand; Focal visits (X); Rec. visits (Y); Dir. visits (YD)]
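The independence check at the heart of the criterion can be sketched as a test on two time series. The slides do not say which independence test is used, so the permutation test on Pearson correlation below is a stand-in chosen for simplicity; the function names, the 0.05 level, and the number of permutations are all assumptions. Failing to reject independence is only evidence for it, not proof, and shuffling ignores autocorrelation in the series.

```python
import random
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length series (0.0 if one is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def split_door_accepts(focal_visits, direct_visits_y,
                       alpha=0.05, n_perm=2000, seed=0):
    """Accept the (X, Y) pair when independence of X and YD is not rejected."""
    rng = random.Random(seed)
    observed = abs(pearson(focal_visits, direct_visits_y))
    shuffled = list(direct_visits_y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between X and YD
        if abs(pearson(focal_visits, shuffled)) >= observed - 1e-12:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return p_value > alpha


x = [1, 2, 3, 4, 5, 6, 7, 8]
print(split_door_accepts(x, [5, 3, 5, 3, 3, 5, 3, 5]))  # uncorrelated with x
print(split_door_accepts(x, [2 * v for v in x]))        # moves with x
```

In the second call direct visits to Y track visits to X, which is what common demand would produce, so the pair is rejected.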
  • 58. Sidenote: Split-door criterion generalizes Shock-IV. By capturing shocks, we were essentially capturing a notion of independence between X and YD. Split-door will admit all valid shocks, as well as other variations.
  • 59. Applying to logs from Amazon recommendations
  • 61. Summary: A general identification criterion. Split-door criterion admits a broader sample of natural experiments than shocks. Automatically tests for valid identification. Can be used whenever YD is separable. Applications: Evaluate the relationship between any two time series, e.g. social media and news, ads and search.
  • 62. Conclusion. The majority of traffic from recommendations may not be causal, but simply convenience. Two data-driven methods: • Shock-IV: An IV-based method for mining exclusion-valid instruments from observational data • Split-door: A general identification strategy for time series data.
  • 63. More generally, data mining can augment causal inference methods. Two workflows: (1) Hypothesize about a natural variation; argue why it resembles a randomized experiment; compute causal effect. (2) Develop tests for validity of natural variation; mine for such valid variations in observational data; compute causal effect.
  • 64. Thank you! Amit Sharma, Microsoft Research. @amt_shrma, http://www.amitsharma.in. Reference: Sharma, A., Hofman, J. M., & Watts, D. J. (2015). Estimating the causal impact of recommendation systems from observational data. In Proceedings of the Sixteenth ACM Conference on Economics and Computation.