Here, there, causality is everywhere
AMIT SHARMA, MICROSOFT RESEARCH
http://www.amitsharma.in
@amt_shrma
My route to causality
Building recommender systems in social networks
Conducting user experiments
Estimating impact of recommendations and social feeds
Causality is everywhere
Spans every branch of science.
◦ Economics
◦ Political science
◦ Study of human behavior
◦ Biology and medicine
◦ Computer science (?)
Spans centuries of thought.
◦ Aristotle: “To know, is to know the final cause.”
It took us until the 1930s to come up with the randomized experiment (Fisher).
Still early days for estimating causal effects from observational data.
Causality in economics
David Card. The causal effect of education on earnings (1999)
Conley and Heerwig. The Long-Term Effects of Military Conscription on Mortality: Estimates From the Vietnam-Era Draft Lottery (2012)
Causality in political science
Darrell West. Air Wars (2013)
Chattopadhyay and Duflo. Women as Policy Makers: Evidence from a Randomized Policy Experiment in India (2004)
Causality in human behavior
Thistlethwaite and Campbell. Effect of public recognition of scholastic achievement (1960)
Christakis and Fowler. The collective dynamics of smoking in a large social network (2008)
Causality in biology and medicine
Effect of Vitamin D deficiency on colon cancer
Effect of heart attack surgery on long-term health of patient
Causality in web applications
Sharma and Cosley. Distinguishing between personal preference and homophily in online activity feeds (2016).
Sharma, Hofman and Watts. Estimating the causal impact of recommender systems (2015).
Counterfactual reasoning
Correlation question: How well can X predict Y?
◦ Machine learning, Statistical estimation.
Interventionist question: If X is changed to X’, what will be the value of Y?
◦ Experiments, Reinforcement learning, Contextual bandits.
Counterfactual question: If X would have been X’, what would be the value of Y?
◦ Today’s focus.
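The counterfactual question can be made concrete on a toy structural model. The sketch below is illustrative only: the linear equation Y = 2X + U_Y, its coefficient, and the observed values are assumptions invented for this example and do not come from the talk.

```python
# Toy structural causal model (all numbers are assumptions for illustration).
#   Y = SLOPE * X + U_Y, where U_Y is an unobserved background factor.
SLOPE = 2.0


def outcome(x, u_y):
    """Structural equation for Y."""
    return SLOPE * x + u_y


def counterfactual_y(x_obs, y_obs, x_new):
    """Answer 'what would Y have been had X been x_new?' for one observed unit."""
    # Step 1 (abduction): recover this unit's background factor from the observation.
    u_y = y_obs - SLOPE * x_obs
    # Step 2 (action): replace X with the hypothetical value x_new.
    # Step 3 (prediction): recompute Y while keeping the same background factor.
    return outcome(x_new, u_y)


# A unit observed with X = 1 and Y = 5: what if X had been 0?
print(counterfactual_y(x_obs=1.0, y_obs=5.0, x_new=0.0))  # 3.0
```

A purely predictive model would only need the joint distribution of X and Y; the counterfactual answer additionally relies on the assumed structural equation.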
Estimating causal effects from observational data
Why is causal inference hard?
◦ Simpson’s paradox
The language of graphical models
◦ Backdoor criterion
◦ Frontdoor criterion
Common approaches for causal inference
◦ Conditioning
◦ Mechanism-based
◦ Natural Experiments
Example: Estimating causal impact of recommender systems
Estimating the effectiveness of kidney stone treatment
               Treatment A      Treatment B
Small stones   93% (81/87)      87% (234/270)
Large stones   73% (192/263)    69% (55/80)
Both           78% (273/350)    83% (289/350)
Julious and Mullee. Confounding and Simpson’s Paradox (1994).
http://en.wikipedia.org/wiki/Simpson's_paradox
Two treatments for kidney stones
Treatment A : 78% effective
Treatment B : 83% effective
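The reversal in the kidney stone table can be checked directly from the counts. A minimal sketch, using only the numbers quoted in the table; the variable and function names are chosen here for illustration.

```python
# (successes, patients) from the kidney stone table.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, total):
    return successes / total


def pooled_rate(treatment):
    """Success rate for a treatment when stone size is ignored."""
    successes = sum(s for s, _ in data[treatment].values())
    patients = sum(n for _, n in data[treatment].values())
    return rate(successes, patients)


# Within each stone size, compare the two treatments.
for size in ("small", "large"):
    a = rate(*data["A"][size])
    b = rate(*data["B"][size])
    print(f"{size}: A={a:.2f} B={b:.2f} A better: {a > b}")

# Pooled over stone sizes, the comparison flips.
a_all, b_all = pooled_rate("A"), pooled_rate("B")
print(f"pooled: A={a_all:.2f} B={b_all:.2f} A better: {a_all > b_all}")
```

Treatment A wins inside both strata but loses in the pooled comparison, because A was given far more often to the harder large-stone cases.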
Estimating ad placement on a search engine
Suppose we would like to optimize the set of ads shown for a query, rather than optimize each ad individually.
Click probability estimates: q1 (1st ad), q2 (2nd ad)
Does q2 depend on q1?
Confounders in ad placement
Let us define two groups with 2000 queries each:
◦ High q1: (149/2000) CTR on second ad
◦ Low q1: (124/2000) CTR on second ad
           Low q1             High q1
Low q2     5.1% (92/1823)     4.8% (71/1500)
High q2    18.1% (32/176)     15.6% (78/500)
Both       6.2% (124/2000)    7.5% (149/2000)
Bottou et al. Counterfactual reasoning and learning systems (2013).
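One way to see where the confounding comes from is to look at how the two q1 groups are composed. The sketch below uses the counts exactly as they appear in the table; the function names are chosen here for illustration.

```python
# (clicks on the second ad, queries), copied as they appear in the table.
table = {
    "low_q1": {"low_q2": (92, 1823), "high_q2": (32, 176)},
    "high_q1": {"low_q2": (71, 1500), "high_q2": (78, 500)},
}


def share_high_q2(group):
    """Fraction of a q1 group's queries that fall in the high-q2 stratum."""
    cells = table[group]
    total = sum(queries for _, queries in cells.values())
    return cells["high_q2"][1] / total


def ctr(group, stratum):
    """Click-through rate on the second ad within one cell of the table."""
    clicks, queries = table[group][stratum]
    return clicks / queries


for group in table:
    print(f"{group}: share of high-q2 queries = {share_high_q2(group):.1%}")
for stratum in ("low_q2", "high_q2"):
    print(f"{stratum}: low_q1 CTR = {ctr('low_q1', stratum):.3f}, "
          f"high_q1 CTR = {ctr('high_q1', stratum):.3f}")
```

The high-q1 group contains a much larger share of high-q2 queries, which lifts its pooled CTR even though its CTR is lower within each q2 stratum.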
Causal graphical models: a framework for causality
Structural equation modeling (SEM)
X = q1
Y = CTR on second ad
Which variables to condition on?
Observed variables
◦ Which observed variables?
◦ As we will see, conditioning on all variables may not be correct.
Known unknowns:
◦ Age, Past diseases, Food intake
Unknown unknowns:
◦ What else could impact recovery from kidney stones?
◦ Genetic markers?
Connections to Bayesian networks
Markov assumption: Probability of an effect is independent of
everything else given its direct causes.
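Under this Markov assumption, the joint distribution factorizes over the graph. Written out, with pa_i denoting the direct causes (parents) of variable X_i, a notation assumed here rather than taken from the slides:

```latex
P(x_1, \dots, x_n) = \prod_{i=1}^{n} P\left(x_i \mid pa_i\right)
```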
Two approaches:
◦ Backdoor criterion
◦ Frontdoor criterion
Graphical Models and common methods for causal estimation
Condition on observed covariates
• Stratification
• Matching
• Regression (?)
Mechanism-based strategies
• Path-based approaches
Natural experiments
• As-if experiments
• Instrumental Variables
• Regression discontinuity
I. Conditioning on observed covariates
Corresponds to Backdoor criterion.
a) Stratification
Condition on different levels of socio-economic status.
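No data is given for the school example, so the sketch below illustrates stratification on the kidney stone counts quoted earlier, with stone size playing the role of the conditioning variable. It assumes stone size is the only confounder, which is an assumption made for this illustration. The adjusted rate is the stratum-specific success rate averaged over the overall distribution of stone sizes.

```python
# (successes, patients) by treatment and stone size, from the kidney stone table.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def stratum_weights():
    """P(Z = z): share of all patients that fall in each stone-size stratum."""
    totals = {}
    for cells in data.values():
        for z, (_, n) in cells.items():
            totals[z] = totals.get(z, 0) + n
    grand_total = sum(totals.values())
    return {z: n / grand_total for z, n in totals.items()}


def adjusted_rate(treatment):
    """Backdoor adjustment: sum over z of P(success | treatment, z) * P(z)."""
    weights = stratum_weights()
    return sum((s / n) * weights[z] for z, (s, n) in data[treatment].items())


print(f"A: {adjusted_rate('A'):.3f}, B: {adjusted_rate('B'):.3f}")
```

After adjustment Treatment A comes out ahead, in agreement with the within-stratum comparisons.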
b) Matching
Socio-Economic status is a function of parents’ income,
locality and other observed indicators.
b) Matching
Model propensity to attend a particular school.
Pschool = f(PI, Loc, …)
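A minimal sketch of matching on a propensity score. Everything in it is hypothetical: the records are invented, a single discrete covariate (an income bracket) stands in for the full set of indicators, and the propensity is estimated as the treated share per bracket rather than by whatever model f is used in practice.

```python
from collections import defaultdict

# Hypothetical records: (income bracket, attended the school?, test score).
# Scores are built as 10 * bracket + 5 * attended, so the true effect is 5.
records = (
    [(0, 1, 5)] * 1 + [(0, 0, 0)] * 3 +
    [(1, 1, 15)] * 2 + [(1, 0, 10)] * 2 +
    [(2, 1, 25)] * 3 + [(2, 0, 20)] * 1
)


def estimate_propensity(rows):
    """Propensity per bracket = share of units in that bracket that were treated."""
    treated, total = defaultdict(int), defaultdict(int)
    for bracket, t, _ in rows:
        treated[bracket] += t
        total[bracket] += 1
    return {b: treated[b] / total[b] for b in total}


def matched_effect(rows):
    """Mean treated-minus-matched-control difference (matching with replacement)."""
    p = estimate_propensity(rows)
    controls = [(p[b], y) for b, t, y in rows if t == 0]
    diffs = []
    for b, t, y in rows:
        if t == 1:
            # Pick the control whose propensity score is closest to this unit's.
            _, y_control = min(controls, key=lambda c: abs(c[0] - p[b]))
            diffs.append(y - y_control)
    return sum(diffs) / len(diffs)


def naive_effect(rows):
    """Raw difference in mean outcome between treated and control units."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


print(f"naive: {naive_effect(records):.2f}, matched: {matched_effect(records):.2f}")
```

In this constructed data the naive comparison overstates the effect because higher brackets are both more likely to attend and score higher anyway; matching recovers the built-in effect of 5.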
c) Regression
Condition on observed covariates by adding them as independent variables in regression.
Works only if true causal relationship between variables is linear.
II. Mechanism-based strategies
Corresponds to Frontdoor criterion.
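For reference, the frontdoor adjustment formula from Pearl's framework can be written as follows, with X the cause, Y the outcome, and M a mediator that carries all of the effect of X on Y. The symbol M is chosen here; the slide does not name one.

```latex
P(y \mid do(x)) = \sum_{m} P(m \mid x) \sum_{x'} P(y \mid m, x') \, P(x')
```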
III. Natural Experiments
Look for experiments happening in the real world.
Promise greater generalizability than controlled lab experiments.
Require greater care to ensure validity of causal identification.
a) (As-if) random experiments
b) Regression discontinuity
c) Instrumental variables
Figure: Shock! Increase in traffic
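A minimal sketch of the instrumental-variable idea, using the Wald estimator for a binary instrument. The data-generating equations and all values are invented for illustration: Z is the instrument (for example, whether a shock occurred), U is an unobserved confounder, and the true effect of X on Y is fixed at 2 by construction.

```python
# Hypothetical units: Z is the instrument, U an unobserved confounder.
# Data are generated as X = Z + U and Y = 2 * X + 3 * U, so the true effect is 2.
units = [(z, u) for z in (0, 1) for u in (0, 1)]
rows = [(z, z + u, 2 * (z + u) + 3 * u) for z, u in units]  # (Z, X, Y)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def wald_estimate(rows):
    """IV (Wald) estimate: change in Y per unit change in X induced by Z."""
    y1 = mean(y for z, _, y in rows if z == 1)
    y0 = mean(y for z, _, y in rows if z == 0)
    x1 = mean(x for z, x, _ in rows if z == 1)
    x0 = mean(x for z, x, _ in rows if z == 0)
    return (y1 - y0) / (x1 - x0)


def naive_slope(rows):
    """Least-squares slope of Y on X, which ignores the confounder U."""
    mx = mean(x for _, x, _ in rows)
    my = mean(y for _, _, y in rows)
    cov = mean((x - mx) * (y - my) for _, x, y in rows)
    var = mean((x - mx) ** 2 for _, x, _ in rows)
    return cov / var


print(wald_estimate(rows), naive_slope(rows))  # 2.0 3.5
```

The estimate is only valid if the instrument affects Y solely through X and is independent of U, which holds here by construction.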
Summary: Two graphical criteria explain all of conventional approaches
A principled, succinct framework for causality.
Allows arbitrary functional forms for relationships between variables.
Leads to clear statements about causal assumptions.
If a causal effect can be identified, it can be derived using do-calculus (helpful for bigger graphs).
Product recommendations on Amazon
Do recommendations expose people to new products?
Do recommendations lead to more purchases?
Counterfactual reasoning
What would have happened if there were no recommendations?
X = Activity on current item that the user is viewing
Y = Activity on the recommended item
UX = Latent properties of X
UY = Latent properties of Y
Why is estimating effects of recommendations difficult using observational data?
If latent properties for X and Y are correlated, then observed changes in AY cannot be directly attributed to AX.
Figure: A causal graphical model for the impact of recommendations, with nodes AX, AY, UX and UY (ref. Pearl 09)
AX = Visits on a product X on Amazon
AY = Recommendation click-throughs from X to Y
UX = Consumer demand for X
UY = Consumer demand for Y
Example: Looking for a machine learning book
Observed clickthrough data due to recommendations do not tell the full story.
For example, let’s assume I just completed the Artificial Intelligence book by Russell and Norvig and now I want to learn more about machine learning.
Xi: Focal Product
Yj: Recommended Products
Link types: Causal link, Convenience link, Revisit link, Wasted link.
There could also be irrelevant links.
The Shock strategy (I.V.)
If direct visits to product Yj are nearly constant, then we can assume that the convenience clicks to Yj will be nearly constant.
Thus,
The Shock strategy
We cannot say much during normal traffic for a product. But if a product experiences a spike in visits and its recommended product does not, then we can demonstrate a method to compute the causal clickthrough rate.
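One plausible way to turn this argument into a number is sketched below. It is an assumption made for illustration, not a formula quoted from the slides: if convenience clicks to Yj stay constant during a shock to Xi, the extra recommendation click-throughs can be attributed to the extra visits, so their ratio serves as an estimate of the causal clickthrough rate. The function name and the numbers are hypothetical.

```python
def causal_ctr_estimate(visits_before, visits_during, clicks_before, clicks_during):
    """Extra recommendation click-throughs per extra visit to the focal product.

    Assumes convenience clicks are unchanged by the shock, so the whole
    change in click-throughs is attributed to the change in visits.
    """
    extra_visits = visits_during - visits_before
    if extra_visits <= 0:
        raise ValueError("no shock: visits to the focal product did not increase")
    return (clicks_during - clicks_before) / extra_visits


# Hypothetical numbers: visits to X jump from 100 to 600 during the shock,
# and click-throughs from X to Y rise from 10 to 30.
estimate = causal_ctr_estimate(100, 600, 10, 30)
observed_ctr = 30 / 600  # the raw click-through rate during the shock
print(f"causal estimate: {estimate:.3f}, observed CTR: {observed_ctr:.3f}")
```

In this made-up case the observed rate of 5% overstates the causal estimate of 4%, because part of the click-throughs would have happened anyway.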
Data description
Dataset: Anonymized Amazon URL log data from Bing toolbar for opted-in users.
Nine months (Sept. 1 2013 to May 31 2014).
URL structure allows us to determine:
◦ Type of page visited (product, search, cart, bestsellers, wishlist)
◦ Type of referral to a product (recommendation, search, none, others)
After filtering out bots, sellers, authors, publishers and unpopular products (<5 visits):
◦ Number of products = 1.38 M
◦ Number of users = 2.1M
◦ 60 product categories (such as Books, Toys, Electronics)
Implementing the strategy: The shock criteria
Large: Visits during a shock must exceed 5 times the median traffic for a product.
Sudden: Visits during a shock must be 5 times the last day’s traffic and 5 times the last week’s traffic.
Sane: Visits from at least 10 unique users and on 5 different days before and after a shock.
4776 shocks to 4126 products
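The "large" and "sudden" criteria can be sketched as a check over a product's daily visit counts. This is a rough illustration under stated assumptions: "last week's traffic" is read as the mean daily visits over the previous 7 days, the median is taken over the whole series, and the "sane" criterion is omitted because it needs per-user data. The function name and the sample series are hypothetical.

```python
from statistics import median


def is_shock(daily_visits, day, factor=5):
    """Check the 'large' and 'sudden' criteria for one product on one day.

    daily_visits: list of visit counts, one entry per day.
    Assumption: 'last week's traffic' means the mean daily visits over the
    7 days before `day`; the slide does not pin this down.
    """
    if day < 7:
        return False  # not enough history to evaluate the weekly criterion
    today = daily_visits[day]
    # Large: exceeds `factor` times the product's median daily traffic.
    large = today > factor * median(daily_visits)
    # Sudden: at least `factor` times both yesterday's and last week's traffic.
    last_week_mean = sum(daily_visits[day - 7:day]) / 7
    sudden = (today >= factor * daily_visits[day - 1]
              and today >= factor * last_week_mean)
    return large and sudden


visits = [10, 12, 9, 11, 10, 13, 10, 80, 12, 11]
print([d for d in range(len(visits)) if is_shock(visits, d)])  # [7]
```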
Implementing the strategy: The shock criteria
Additionally, we want direct visits to Yj to be constant. The maximum change in direct visits to Yj should not be bigger than the size of the shock.
When beta=1, ideally causal. When beta=0, all bets are off.
Figure panels: Good shock; Bad shock (filtered out at beta=0.7)
Results: Fraction of causal clickthroughs by category
Majority of the clickthroughs are due to convenience.
Within any category, 5% or lower is a more accurate estimate of clickthroughs caused by recommendations.
Robustness checks
Shocks may not be representative.
◦ The distributions of users, popularity, and the affinity between users and products do not differ much (except that shocked products are, on average, more popular).
Shocks may be caused by deals which make the focal product more attractive.
◦ Verification using referrals from log data (e.g. bookbub.com) and manual inspection of past prices (from camelcamelcamel.com).
Shocks may be a property of the weird holiday season.
◦ They occur throughout the data, although with more frequency during the holidays.
Graphical models form a succinct, sound and complete framework for reasoning about causality.
They can also be practical.
THANK YOU!

More Related Content

What's hot

Measures of association
Measures of associationMeasures of association
Measures of associationIAU Dent
 
09 Inference for Networks – Exponential Random Graph Models (2017)
09 Inference for Networks – Exponential Random Graph Models (2017)09 Inference for Networks – Exponential Random Graph Models (2017)
09 Inference for Networks – Exponential Random Graph Models (2017)Duke Network Analysis Center
 
CounterFactual Explanations.pdf
CounterFactual Explanations.pdfCounterFactual Explanations.pdf
CounterFactual Explanations.pdfBong-Ho Lee
 
Explainability and bias in AI
Explainability and bias in AIExplainability and bias in AI
Explainability and bias in AIBill Liu
 
Machine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series PredictionMachine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series PredictionGianluca Bontempi
 
7. logistics regression using spss
7. logistics regression using spss7. logistics regression using spss
7. logistics regression using spssDr Nisha Arora
 
Why start using uplift models for more efficient marketing campaigns
Why start using uplift models for more efficient marketing campaignsWhy start using uplift models for more efficient marketing campaigns
Why start using uplift models for more efficient marketing campaignsData Con LA
 
Fairness and Privacy in AI/ML Systems
Fairness and Privacy in AI/ML SystemsFairness and Privacy in AI/ML Systems
Fairness and Privacy in AI/ML SystemsKrishnaram Kenthapadi
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsJustin Basilico
 
Machine Learning Interpretability
Machine Learning InterpretabilityMachine Learning Interpretability
Machine Learning Interpretabilityinovex GmbH
 
Logistic regression
Logistic regressionLogistic regression
Logistic regressionDrZahid Khan
 
20140602 statistical power - husnul and nur
20140602   statistical power - husnul and nur20140602   statistical power - husnul and nur
20140602 statistical power - husnul and nurMuhammad Khuluq
 
Algorithmic Impact Assessment: Fairness, Robustness and Explainability in Aut...
Algorithmic Impact Assessment: Fairness, Robustness and Explainability in Aut...Algorithmic Impact Assessment: Fairness, Robustness and Explainability in Aut...
Algorithmic Impact Assessment: Fairness, Robustness and Explainability in Aut...Adriano Soares Koshiyama
 
Machine Learning for Survival Analysis
Machine Learning for Survival AnalysisMachine Learning for Survival Analysis
Machine Learning for Survival AnalysisChandan Reddy
 
Machine Learning Interpretability / Explainability
Machine Learning Interpretability / ExplainabilityMachine Learning Interpretability / Explainability
Machine Learning Interpretability / ExplainabilityRaouf KESKES
 
Structural equation-models-introduction-kimmo-vehkalahti-2013
Structural equation-models-introduction-kimmo-vehkalahti-2013Structural equation-models-introduction-kimmo-vehkalahti-2013
Structural equation-models-introduction-kimmo-vehkalahti-2013Kimmo Vehkalahti
 
How to correctly estimate the effect of online advertisement(About Double Mac...
How to correctly estimate the effect of online advertisement(About Double Mac...How to correctly estimate the effect of online advertisement(About Double Mac...
How to correctly estimate the effect of online advertisement(About Double Mac...Yusuke Kaneko
 

What's hot (20)

Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Measures of association
Measures of associationMeasures of association
Measures of association
 
09 Inference for Networks – Exponential Random Graph Models (2017)
09 Inference for Networks – Exponential Random Graph Models (2017)09 Inference for Networks – Exponential Random Graph Models (2017)
09 Inference for Networks – Exponential Random Graph Models (2017)
 
CounterFactual Explanations.pdf
CounterFactual Explanations.pdfCounterFactual Explanations.pdf
CounterFactual Explanations.pdf
 
Explainability and bias in AI
Explainability and bias in AIExplainability and bias in AI
Explainability and bias in AI
 
Machine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series PredictionMachine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series Prediction
 
7. logistics regression using spss
7. logistics regression using spss7. logistics regression using spss
7. logistics regression using spss
 
Why start using uplift models for more efficient marketing campaigns
Why start using uplift models for more efficient marketing campaignsWhy start using uplift models for more efficient marketing campaigns
Why start using uplift models for more efficient marketing campaigns
 
Fairness and Privacy in AI/ML Systems
Fairness and Privacy in AI/ML SystemsFairness and Privacy in AI/ML Systems
Fairness and Privacy in AI/ML Systems
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender Systems
 
Machine Learning Interpretability
Machine Learning InterpretabilityMachine Learning Interpretability
Machine Learning Interpretability
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
20140602 statistical power - husnul and nur
20140602   statistical power - husnul and nur20140602   statistical power - husnul and nur
20140602 statistical power - husnul and nur
 
Algorithmic Impact Assessment: Fairness, Robustness and Explainability in Aut...
Algorithmic Impact Assessment: Fairness, Robustness and Explainability in Aut...Algorithmic Impact Assessment: Fairness, Robustness and Explainability in Aut...
Algorithmic Impact Assessment: Fairness, Robustness and Explainability in Aut...
 
Machine Learning for Survival Analysis
Machine Learning for Survival AnalysisMachine Learning for Survival Analysis
Machine Learning for Survival Analysis
 
Epidemiology Study Design
Epidemiology Study DesignEpidemiology Study Design
Epidemiology Study Design
 
Machine Learning Interpretability / Explainability
Machine Learning Interpretability / ExplainabilityMachine Learning Interpretability / Explainability
Machine Learning Interpretability / Explainability
 
Structural equation-models-introduction-kimmo-vehkalahti-2013
Structural equation-models-introduction-kimmo-vehkalahti-2013Structural equation-models-introduction-kimmo-vehkalahti-2013
Structural equation-models-introduction-kimmo-vehkalahti-2013
 
Regression
RegressionRegression
Regression
 
How to correctly estimate the effect of online advertisement(About Double Mac...
How to correctly estimate the effect of online advertisement(About Double Mac...How to correctly estimate the effect of online advertisement(About Double Mac...
How to correctly estimate the effect of online advertisement(About Double Mac...
 

Viewers also liked

Causal inference in online systems: Methods, pitfalls and best practices
Causal inference in online systems: Methods, pitfalls and best practicesCausal inference in online systems: Methods, pitfalls and best practices
Causal inference in online systems: Methods, pitfalls and best practicesAmit Sharma
 
Data mining for causal inference: Effect of recommendations on Amazon.com
Data mining for causal inference: Effect of recommendations on Amazon.comData mining for causal inference: Effect of recommendations on Amazon.com
Data mining for causal inference: Effect of recommendations on Amazon.comAmit Sharma
 
From prediction to causation: Causal inference in online systems
From prediction to causation: Causal inference in online systemsFrom prediction to causation: Causal inference in online systems
From prediction to causation: Causal inference in online systemsAmit Sharma
 
Agenda 29 de febrero al 04 de marzo (2)
Agenda 29 de febrero al 04 de marzo (2)Agenda 29 de febrero al 04 de marzo (2)
Agenda 29 de febrero al 04 de marzo (2)colegiommc
 
11 al 15 de julio
11 al 15 de julio11 al 15 de julio
11 al 15 de juliocolegiommc
 
Agenda 29 de febrero al 04 de marzo
Agenda 29 de febrero al 04 de marzoAgenda 29 de febrero al 04 de marzo
Agenda 29 de febrero al 04 de marzocolegiommc
 
типы химических связей
типы химических связейтипы химических связей
типы химических связейOlga Pishchik
 
Logistica elecciones 2014
Logistica elecciones 2014Logistica elecciones 2014
Logistica elecciones 2014colegiommc
 
Обзор периодической печати колледжа.
Обзор периодической печати колледжа.Обзор периодической печати колледжа.
Обзор периодической печати колледжа.Димка Куликов
 
бенефис почтенной книге
бенефис почтенной книгебенефис почтенной книге
бенефис почтенной книгеДимка Куликов
 
The role of social connections in shaping our preferences
The role of social connections in shaping our preferencesThe role of social connections in shaping our preferences
The role of social connections in shaping our preferencesAmit Sharma
 
Методическое пособие по всем видам работ.
Методическое пособие по всем видам работ. Методическое пособие по всем видам работ.
Методическое пособие по всем видам работ. Димка Куликов
 
Auditing search engines for differential satisfaction across demographics
Auditing search engines for differential satisfaction across demographicsAuditing search engines for differential satisfaction across demographics
Auditing search engines for differential satisfaction across demographicsAmit Sharma
 
фотоотчет о проведении акции молодежь против туберкулеза
фотоотчет о проведении акции молодежь против туберкулезафотоотчет о проведении акции молодежь против туберкулеза
фотоотчет о проведении акции молодежь против туберкулезаДимка Куликов
 
From Excel to PowerPoint - Logos and Icons
From Excel to PowerPoint - Logos and IconsFrom Excel to PowerPoint - Logos and Icons
From Excel to PowerPoint - Logos and IconsOfficeReports
 

Viewers also liked (20)

Causal inference in online systems: Methods, pitfalls and best practices
Causal inference in online systems: Methods, pitfalls and best practicesCausal inference in online systems: Methods, pitfalls and best practices
Causal inference in online systems: Methods, pitfalls and best practices
 
Data mining for causal inference: Effect of recommendations on Amazon.com
Data mining for causal inference: Effect of recommendations on Amazon.comData mining for causal inference: Effect of recommendations on Amazon.com
Data mining for causal inference: Effect of recommendations on Amazon.com
 
From prediction to causation: Causal inference in online systems
From prediction to causation: Causal inference in online systemsFrom prediction to causation: Causal inference in online systems
From prediction to causation: Causal inference in online systems
 
Semana 24
Semana 24Semana 24
Semana 24
 
Agenda 29 de febrero al 04 de marzo (2)
Agenda 29 de febrero al 04 de marzo (2)Agenda 29 de febrero al 04 de marzo (2)
Agenda 29 de febrero al 04 de marzo (2)
 
11 al 15 de julio
11 al 15 de julio11 al 15 de julio
11 al 15 de julio
 
Agenda 29 de febrero al 04 de marzo
Agenda 29 de febrero al 04 de marzoAgenda 29 de febrero al 04 de marzo
Agenda 29 de febrero al 04 de marzo
 
Semana 19
Semana 19Semana 19
Semana 19
 
типы химических связей
типы химических связейтипы химических связей
типы химических связей
 
Logistica elecciones 2014
Logistica elecciones 2014Logistica elecciones 2014
Logistica elecciones 2014
 
Semana 20 (1)
Semana 20 (1)Semana 20 (1)
Semana 20 (1)
 
Обзор периодической печати колледжа.
Обзор периодической печати колледжа.Обзор периодической печати колледжа.
Обзор периодической печати колледжа.
 
бенефис почтенной книге
бенефис почтенной книгебенефис почтенной книге
бенефис почтенной книге
 
гид2013
гид2013гид2013
гид2013
 
The role of social connections in shaping our preferences
The role of social connections in shaping our preferencesThe role of social connections in shaping our preferences
The role of social connections in shaping our preferences
 
тюз
тюзтюз
тюз
 
Методическое пособие по всем видам работ.
Методическое пособие по всем видам работ. Методическое пособие по всем видам работ.
Методическое пособие по всем видам работ.
 
Auditing search engines for differential satisfaction across demographics
Auditing search engines for differential satisfaction across demographicsAuditing search engines for differential satisfaction across demographics
Auditing search engines for differential satisfaction across demographics
 
фотоотчет о проведении акции молодежь против туберкулеза
фотоотчет о проведении акции молодежь против туберкулезафотоотчет о проведении акции молодежь против туберкулеза
фотоотчет о проведении акции молодежь против туберкулеза
 
From Excel to PowerPoint - Logos and Icons
From Excel to PowerPoint - Logos and IconsFrom Excel to PowerPoint - Logos and Icons
From Excel to PowerPoint - Logos and Icons
 

Similar to Causal inference in practice

What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldPyData
 
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...Shift Conference
 
Measuring effectiveness of machine learning systems
Measuring effectiveness of machine learning systemsMeasuring effectiveness of machine learning systems
Measuring effectiveness of machine learning systemsAmit Sharma
 
Time Series Analysis
Time Series AnalysisTime Series Analysis
Time Series AnalysisAmanda Reed
 
Causal inference in practice: Here, there, causality is everywhere
Causal inference in practice: Here, there, causality is everywhereCausal inference in practice: Here, there, causality is everywhere
Causal inference in practice: Here, there, causality is everywhereAmit Sharma
 
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docxIHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docxwilcockiris
 
Business Optimization via Causal Inference
Business Optimization via Causal InferenceBusiness Optimization via Causal Inference
Business Optimization via Causal InferenceHanan Shteingart
 
Increasing precision in survey experiments without introducing bias
Increasing precision in survey experiments without introducing biasIncreasing precision in survey experiments without introducing bias
Increasing precision in survey experiments without introducing biasWilte Zijlstra
 
Sensitivity Analysis
Sensitivity AnalysisSensitivity Analysis
Sensitivity AnalysisBeth Johnson
 
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docxfelicidaddinwoodie
 
Measuring Risk - What Doesn’t Work and What Does
Measuring Risk - What Doesn’t Work and What DoesMeasuring Risk - What Doesn’t Work and What Does
Measuring Risk - What Doesn’t Work and What DoesJody Keyser
 
Research Design and Validity
Research Design and ValidityResearch Design and Validity
Research Design and ValidityHora Tjitra
 
A cutting edge behavioural approach to achieving your contact centre’s object...
A cutting edge behavioural approach to achieving your contact centre’s object...A cutting edge behavioural approach to achieving your contact centre’s object...
A cutting edge behavioural approach to achieving your contact centre’s object...Contact Centre Management Group
 
The Impact of Computing Systems | Causal inference in practice
The Impact of Computing Systems | Causal inference in practiceThe Impact of Computing Systems | Causal inference in practice
The Impact of Computing Systems | Causal inference in practiceAmit Sharma
 
Possible Essay Questions On Romeo And Juliet
Possible Essay Questions On Romeo And JulietPossible Essay Questions On Romeo And Juliet
Possible Essay Questions On Romeo And JulietJamie Jackson
 
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Saurabh Mishra
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA
 
Fairness in Machine Learning and AI
Fairness in Machine Learning and AIFairness in Machine Learning and AI
Fairness in Machine Learning and AISeth Grimes
 

Similar to Causal inference in practice (20)

What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
 
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
 
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma...
 
Measuring effectiveness of machine learning systems
Measuring effectiveness of machine learning systemsMeasuring effectiveness of machine learning systems
Measuring effectiveness of machine learning systems
 
Time Series Analysis
Time Series AnalysisTime Series Analysis
Time Series Analysis
 
Causal inference in practice: Here, there, causality is everywhere
Causal inference in practice: Here, there, causality is everywhereCausal inference in practice: Here, there, causality is everywhere
Causal inference in practice: Here, there, causality is everywhere
 
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docxIHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
 
Business Optimization via Causal Inference
Business Optimization via Causal InferenceBusiness Optimization via Causal Inference
Business Optimization via Causal Inference
 
Increasing precision in survey experiments without introducing bias
Increasing precision in survey experiments without introducing biasIncreasing precision in survey experiments without introducing bias
Increasing precision in survey experiments without introducing bias
 
Sensitivity Analysis
Sensitivity AnalysisSensitivity Analysis
Sensitivity Analysis
 
Slalom
SlalomSlalom
Slalom
 
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
 
Measuring Risk - What Doesn’t Work and What Does
Measuring Risk - What Doesn’t Work and What DoesMeasuring Risk - What Doesn’t Work and What Does
Measuring Risk - What Doesn’t Work and What Does
 
Research Design and Validity
Research Design and ValidityResearch Design and Validity
Research Design and Validity
 
A cutting edge behavioural approach to achieving your contact centre’s object...
A cutting edge behavioural approach to achieving your contact centre’s object...A cutting edge behavioural approach to achieving your contact centre’s object...
A cutting edge behavioural approach to achieving your contact centre’s object...
 
The Impact of Computing Systems | Causal inference in practice
The Impact of Computing Systems | Causal inference in practiceThe Impact of Computing Systems | Causal inference in practice
The Impact of Computing Systems | Causal inference in practice
 
Possible Essay Questions On Romeo And Juliet
Possible Essay Questions On Romeo And JulietPossible Essay Questions On Romeo And Juliet
Possible Essay Questions On Romeo And Juliet
 
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI Ethics
 
Fairness in Machine Learning and AI
Fairness in Machine Learning and AIFairness in Machine Learning and AI
Fairness in Machine Learning and AI
 

Causal inference in practice

  • 1. Here, there, causality is everywhere AMIT SHARMA, MICROSOFT RESEARCH http://www.amitsharma.in @amt_shrma
  • 2. My route to causality Building recommender systems in social networks Conducting user experiments Estimating impact of recommendations and social feeds
  • 3. Causality is everywhere Spans every branch of science. ◦ Economics ◦ Political science ◦ Study of human behavior ◦ Biology and medicine ◦ Computer science (?) Spans centuries of thought. ◦ Aristotle: “To know, is to know the final cause.” Took us until 1930s to come up with the randomized experiment (Fisher). Still early days for estimating causal effects from observational data.
  • 4. Causality in economics David Card. The causal effect of education on earnings (1999) Conley and Heerwig. The Long-Term Effects of Military Conscription on Mortality: Estimates From the Vietnam-Era Draft Lottery (2012)
  • 5. Causality in political science Darrell West. Air Wars (2013) Chattopadhyay and Duflo. Women as Policy Makers: Evidence from a Randomized Policy Experiment in India (2004)
  • 6. Causality in human behavior Thistlethwaite and Campbell. Effect of public recognition of scholastic achievement (1960) Christakis and Fowler. The collective dynamics of smoking in a large social network (2008)
  • 7. Causality in biology and medicine Effect of Vitamin D deficiency on colon cancer Effect of heart attack surgery on long-term health of patient
  • 8. Causality in web applications Sharma and Cosley. Distinguishing between personal preference and homophily in online activity feeds (2016). Sharma, Hofman and Watts. Estimating the causal impact of recommender systems (2015).
  • 9. Counterfactual reasoning Correlation question: How well can X predict Y? ◦ Machine learning, Statistical estimation. Interventionist question: If X is changed to X’, what will be the value of Y? ◦ Experiments, Reinforcement learning, Contextual bandits. Counterfactual question: If X would have been X’, what would be the value of Y? ◦ Today’s focus.
  • 10. Estimating causal effects from observational data Why is causal inference hard? ◦ Simpson’s paradox The language of graphical models ◦ Backdoor criterion ◦ Frontdoor criterion Common approaches for causal inference ◦ Conditioning ◦ Mechanism-based ◦ Natural Experiments Example: Estimating causal impact of recommender systems
  • 11. Estimating the effectiveness of kidney stone treatment
                       Treatment A        Treatment B
        Small stones   93% (81/87)        87% (234/270)
        Large stones   73% (192/263)      69% (55/80)
        Both           78% (273/350)      83% (289/350)
      Julious and Mullee. Confounding and Simpson’s Paradox (1994). http://en.wikipedia.org/wiki/Simpson’s_paradox Two treatments for kidney stones Treatment A : 78% effective Treatment B : 83% effective
  • 12. Estimating ad placement on a search engine Suppose we would like to optimize the set of ads shown for a query, rather than optimize each ad individually. Click probability estimates: q1, q2 Does q2 depend on q1? Ad positions: 1st (q1), 2nd (q2)
  • 13. Confounders in ad placement Let us define two groups with 2000 queries each: ◦ High q1: (149/2000) CTR on second ad ◦ Low q1: (124/2000) CTR on second ad
                  Low q1             High q1
        Low q2    5.1% (92/1823)     4.8% (71/1500)
        High q2   18.1% (32/176)     15.6% (78/500)
        Both      6.2% (124/2000)    7.5% (149/2000)
      Bottou et al. Counterfactual reasoning and learning systems (2013).
  • 14. Causal graphical models: a framework for causality Structural equation modeling (SEM) X = q1 Y = CTR on second ad
  • 15. Which variables to condition on? Observed variables ◦ Which observed variables? ◦ As we will see, conditioning on all observed variables may not be correct. Known unknowns: ◦ Age, Past diseases, Food intake Unknown unknowns: ◦ What else could impact recovery from kidney stones? ◦ Genetic markers?
  • 16. Which variables to condition on?
  • 17. Connections to Bayesian networks Markov assumption: Probability of an effect is independent of everything else given its direct causes. Two approaches: --Backdoor criterion --Frontdoor criterion
  • 18. Graphical Models and common methods for causal estimation Condition on observed covariates • Stratification • Matching • Regression (?) Mechanism-based strategies • Path-based approaches Natural experiments • As-if experiments • Instrumental Variables • Regression discontinuity
  • 19. I. Conditioning on observed covariates Corresponds to Backdoor criterion.
  • 20. a) Stratification Condition on different levels of socio-economic status.
  • 21. b) Matching Socio-Economic status is a function of parents’ income, locality and other observed indicators.
  • 22. b) Matching Model propensity to attend a particular school. Pschool = f(PI, Loc, …)
  • 23. c) Regression Condition on observed covariates by adding them as independent variables in regression. Works only if true causal relationship between variables is linear.
  • 25. III. Natural Experiments Look for experiments happening in the real world. Promise greater generalizability than controlled lab experiments. Require greater care to ensure validity of causal identification.
  • 29. Summary: Two graphical criteria explain all of conventional approaches A principled, succinct framework for causality. Allows arbitrary functional forms for relationships between variables. Leads to clear statements about causal assumptions. If a causal effect can be identified, it can be derived using do-calculus (helpful for bigger graphs).
  • 30. Product recommendations on Amazon Do recommendations expose people to new products? Do recommendations lead to more purchases?
  • 31. Counterfactual reasoning What would have happened in case there were no recommendations?
  • 32. X = Activity on current item that the user is viewing Y = Activity on the recommended Item UX = Latent properties of X UY = Latent Properties of Y Why is estimating effects of recommendations difficult using observational data? If latent properties for X and Y are correlated, then observed changes in AY cannot be directly attributed to AX. Figure (nodes AX, AY, UX, UY): A causal graphical model for the impact of recommendations (ref. Pearl 09)
  • 33. AX = Visits on a product X on Amazon AY = Recommendation click-throughs from X to Y UX = Consumer demand for X UY = Consumer demand for Y If latent properties for X and Y are correlated, then observed changes in AY cannot be directly attributed to AX. Figure (nodes AX, AY, UX, UY): A causal graphical model for the impact of recommendations
  • 34. Example: Looking for a machine learning book Observed clickthrough data due to recommendations do not tell the full story. For example, let’s assume I just completed the Artificial Intelligence book by Russell and Norvig and now I want to learn more about machine learning.
  • 35.
  • 36. Xi: Focal Product Yj: Recommended Products
  • 37. Xi: Focal Product Yj: Recommended Products Link types: Causal Link, Convenience Link, Revisit Link, Wasted Link. There could also be irrelevant links.
  • 38. The Shock strategy (I.V.) If direct visits to product Yj are nearly constant, then we can assume that the convenience clicks to Yj will be nearly constant. Thus,
  • 39. The Shock strategy We cannot say much during normal traffic for a product. But if a product experiences a spike in visits and its recommended product does not, then we can demonstrate a method to compute the causal clickthrough rate.
  • 40. Data description Dataset: Anonymized Amazon URL log data from Bing toolbar for opted-in users. Nine months (Sept. 1 2013 to May 31 2014). URL structure allows us to determine: ◦ Type of page visited (product, search, cart, bestsellers, wishlist) ◦ Type of referral to a product (recommendation, search, none, others) After filtering out bots, sellers, authors, publishers and unpopular products (<5 visits): ◦ Number of products = 1.38 M ◦ Number of users = 2.1M ◦ 60 product categories (such as Books, Toys, Electronics)
  • 41. Implementing the strategy: The shock criteria Large: Visits during a shock must exceed 5 times the median traffic for a product Sudden: Visits during a shock must be 5 times the last day’s traffic and 5 times the last week’s traffic Sane: Visits from at least 10 unique users and on 5 different days before and after a shock 4776 shocks to 4126 products
  • 42. Implementing the strategy: The shock criteria Additionally, we want direct visits to Yj to be constant. The maximum change in direct visits to Yj should not be bigger than the size of the shock. When beta=1, ideally causal. When beta=0, all bets are off. Figure panels: Good shock; Bad shock (filtered out at beta=0.7)
  • 43. Results: Fraction of causal clickthroughs by category Majority of the clickthroughs are due to convenience. Within any category, 5% or lower is a more accurate estimate of clickthroughs caused by recommendations.
  • 44. Robustness checks Shocks may not be representative ◦ Distribution of users, popularity and the affinity between users and products does not see much difference (except that shocked products are, on average, more popular). Shocks may be caused by deals which make the focal product more attractive ◦ Verification using referrals from log data (e.g. bookbub.com) and manual inspection of past prices (from camelcamelcamel.com) Shocks may be a property of the weird holiday season. ◦ They occur throughout the data, although with more frequency during the holidays.
  • 45. Graphical models form a succinct, sound and complete framework for reasoning about causality. They can also be practical. THANK YOU! AMIT SHARMA, MICROSOFT RESEARCH http://www.amitsharma.in @amt_shrma
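
The kidney stone table on slide 11 and the stratification idea on slides 19 and 20 can be tied together with a small worked example. The sketch below is not from the talk: it assumes that stone size is the only confounder (so that conditioning on it satisfies the backdoor criterion), and the function names `naive_rate` and `adjusted_rate` are made up for illustration. It compares the pooled success rate with the stratified estimate, sum over z of P(success | treatment, z) P(z).

```python
# (successes, patients) per treatment and stone size, copied from slide 11.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def naive_rate(treatment):
    """Pooled success rate that ignores stone size."""
    successes = sum(s for s, _ in data[treatment].values())
    patients = sum(n for _, n in data[treatment].values())
    return successes / patients


def adjusted_rate(treatment):
    """Stratified estimate: sum_z P(success | treatment, z) * P(z).

    Assumption (not from the slides): stone size z is the only confounder.
    """
    total = sum(n for groups in data.values() for _, n in groups.values())
    rate = 0.0
    for size in ("small", "large"):
        # P(z): share of all patients who fall in this stone-size stratum.
        stratum_patients = sum(data[t][size][1] for t in data)
        successes, patients = data[treatment][size]
        rate += (successes / patients) * (stratum_patients / total)
    return rate


for t in ("A", "B"):
    print(t, round(naive_rate(t), 3), round(adjusted_rate(t), 3))
```

Under that assumption the ordering flips: Treatment B looks better in the pooled numbers, while Treatment A comes out ahead once each stratum is weighted by its share of all patients.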

Editor's Notes

  1. Similarly, you can think about personalized, adaptive books.
  2. We are looking for causal clickthroughs
  3. I just read AI by Russell. Now I search. So there could be convenient, revisits, causal and wasted links.
  4. I just read AI by Russell. Now I search. So there could be convenient, revisits, causal and wasted links.
  5. I just read AI by Russell. Now I search. So there could be convenient, revisits, causal and wasted links. In case of a book, think of the search results as the contents in a book in the normal order. And maybe we want to personalize that.
  6. Going back to the causal diagram.
  7. Talk about filtering …how there were som
  8. Change the figure.
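
The "Large" and "Sudden" shock criteria from slide 41 of the transcript can be written as a short check over a product's daily visit counts. This is only a sketch under stated assumptions, not the authors' implementation: `is_shock` is a hypothetical name, "last week's traffic" is read as the visit count seven days earlier, the median is taken over the whole series, and the "Sane" criterion (unique users and active days) is left out because it needs per-user data.

```python
from statistics import median


def is_shock(daily_visits, day, factor=5):
    """Return True if `day` passes the Large and Sudden criteria.

    Assumptions (not from the slides): "last week's traffic" means the
    count seven days earlier, and the median covers the whole series.
    """
    if day < 7:
        return False  # not enough history to compare against last week
    visits = daily_visits[day]
    # Large: visits must exceed `factor` times the median traffic.
    large = visits > factor * median(daily_visits)
    # Sudden: visits must be at least `factor` times yesterday's traffic
    # and at least `factor` times the traffic one week earlier.
    sudden = (visits >= factor * daily_visits[day - 1]
              and visits >= factor * daily_visits[day - 7])
    return large and sudden


spike = [10] * 14 + [60] + [10] * 5
print(is_shock(spike, 14))
```

A gradual ramp to the same peak fails the check, which matches the intent of requiring shocks to be sudden as well as large.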