Causal inference in practice

Here, there, causality is everywhere
AMIT SHARMA, MICROSOFT RESEARCH
http://www.amitsharma.in
@amt_shrma

My route to causality
Building recommender systems in social networks
Conducting user experiments
Estimating impact of recommendations and social feeds

Causality is everywhere
Spans every branch of science.
◦ Economics
◦ Political science
◦ Study of human behavior
◦ Biology and medicine
◦ Computer science (?)
Spans centuries of thought.
◦ Aristotle: “To know, is to know the final cause.”
Took us until the 1930s to come up with the randomized experiment (Fisher).
Still early days for estimating causal effects from observational data.

Causality in economics
David Card. The causal effect of education on earnings (1999)
Conley and Heerwig. The Long-Term Effects of Military Conscription on Mortality: Estimates From the Vietnam-Era Draft Lottery (2012)

Causality in political science
Darrell West. Air Wars (2013)
Chattopadhyay and Duflo. Women as Policy Makers: Evidence from a Randomized Policy Experiment in India (2004)

Causality in human behavior
Thistlethwaite and Campbell. Effect of public recognition of scholastic achievement (1960)
Christakis and Fowler. The collective dynamics of smoking in a large social network (2008)

Causality in biology and medicine
Effect of Vitamin D deficiency on colon cancer
Effect of heart attack surgery on long-term health of patient

Causality in web applications
Sharma and Cosley. Distinguishing between personal preference and homophily in online activity feeds (2016).
Sharma, Hofman and Watts. Estimating the causal impact of recommender systems (2015).

Counterfactual reasoning
Correlation question: How well can X predict Y?
◦ Machine learning, Statistical estimation.
Interventionist question: If X is changed to X’, what will be the value of Y?
◦ Experiments, Reinforcement learning, Contextual bandits.
Counterfactual question: If X had been X’, what would the value of Y have been?
◦ Today’s focus.

Estimating causal effects from observational data
Why is causal inference hard?
◦ Simpson’s paradox
The language of graphical models
◦ Backdoor criterion
◦ Frontdoor criterion
Common approaches for causal inference
◦ Conditioning
◦ Mechanism-based
◦ Natural Experiments
Example: Estimating causal impact of recommender systems

Estimating the effectiveness of kidney stone treatment
               Treatment A      Treatment B
Small stones   93% (81/87)      87% (234/270)
Large stones   73% (192/263)    69% (55/80)
Both           78% (273/350)    83% (289/350)
Julious and Mullee. Confounding and Simpson’s Paradox (1994).
http://en.wikipedia.org/wiki/Simpson’s_paradox
Two treatments for kidney stones
◦ Treatment A: 78% effective
◦ Treatment B: 83% effective
Estimating ad placement on a search engine
Suppose we would like to optimize the set of ads shown for a query, rather than optimize each ad individually.
Click probability estimates: q1 for the 1st ad, q2 for the 2nd ad.
Does q2 depend on q1?

Confounders in ad placement
Let us define two groups with 2000 queries each:
◦ High q1: (149/2000) CTR on second ad
◦ Low q1: (124/2000) CTR on second ad
           Low q1            High q1
Low q2     5.1% (92/1823)    4.8% (71/1500)
High q2    18.1% (32/176)    15.6% (78/500)
Both       6.2% (124/2000)   7.5% (149/2000)
Bottou et al. Counterfactual reasoning and learning systems (2013).
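One way to read the table is to compare the two q1 groups after giving them the same mix of low-q2 and high-q2 queries. The sketch below does this by reweighting each stratum with its share of all queries. It assumes that the q2 level is the only variable that needs adjusting for, which is an illustrative assumption rather than a claim from the slides. The low-q1 cell counts as printed sum to 1999, so the raw rate computed here differs negligibly from 124/2000.

```python
# (clicks on second ad, queries) by q1 group and q2 level, copied from the table.
clicks = {
    "low_q1":  {"low_q2": (92, 1823), "high_q2": (32, 176)},
    "high_q1": {"low_q2": (71, 1500), "high_q2": (78, 500)},
}

def raw_ctr(group):
    """CTR on the second ad, ignoring the q2 level."""
    c = sum(k for k, _ in clicks[group].values())
    n = sum(n for _, n in clicks[group].values())
    return c / n

# Share of all queries that fall in each q2 level (common weights for both groups).
total = sum(n for g in clicks.values() for _, n in g.values())
weight = {
    level: sum(clicks[g][level][1] for g in clicks) / total
    for level in ("low_q2", "high_q2")
}

def adjusted_ctr(group):
    """CTR standardized to the overall q2 mix (stratify, then reweight)."""
    return sum(weight[level] * k / n for level, (k, n) in clicks[group].items())

for group in clicks:
    print(f"{group}: raw={raw_ctr(group):.3f}  adjusted={adjusted_ctr(group):.3f}")
```

The raw comparison favours the high-q1 group, while the standardized comparison favours the low-q1 group, matching the within-stratum rows of the table.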

Causal graphical models: a framework for causality
Structural equation modeling (SEM)
X = q1
Y = CTR on second ad

Which variables to condition on?
Observed variables
◦ Which observed variables?
◦ As we will see, conditioning on all observed variables may not be correct.
Known unknowns:
◦ Age, Past diseases, Food intake
Unknown unknowns:
◦ What else could impact recovery from kidney stones?
◦ Genetic markers?

Connections to Bayesian networks
Markov assumption: An effect is independent of its non-effects given its direct causes.
Two approaches:
◦ Backdoor criterion
◦ Frontdoor criterion

Graphical Models and common methods for causal estimation
Condition on observed covariates
• Stratification
• Matching
• Regression (?)
Mechanism-based strategies
• Path-based approaches
Natural experiments
• As-if experiments
• Instrumental variables
• Regression discontinuity

I. Conditioning on observed covariates
Corresponds to Backdoor criterion.
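For reference, the standard backdoor adjustment formula is written out below. The symbols are generic notation rather than taken from the slides: X is the treatment, Y the outcome, and Z a set of observed covariates assumed to satisfy the backdoor criterion relative to (X, Y).

```latex
P(y \mid do(x)) = \sum_{z} P(y \mid x, z)\, P(z)
```

Stratification, matching and regression can each be read as a way of estimating the right-hand side from data.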

a) Stratification
Condition on different levels of socio-economic status.

b) Matching
Socio-Economic status is a function of parents’ income,
locality and other observed indicators.

b) Matching
Model propensity to attend a particular school.
Pschool = f(PI, Loc, …)

c) Regression
Condition on observed covariates by adding them as independent variables in regression.
Works only if the true causal relationship between variables is linear.

II. Mechanism-based strategies
Corresponds to Frontdoor criterion.
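For reference, the standard frontdoor adjustment formula is written out below. The symbols are generic notation rather than taken from the slides: X is the treatment, Y the outcome, and M an observed mediator assumed to satisfy the frontdoor criterion relative to (X, Y).

```latex
P(y \mid do(x)) = \sum_{m} P(m \mid x) \sum_{x'} P(y \mid m, x')\, P(x')
```

The formula identifies the effect through the mechanism M even when X and Y share an unobserved common cause.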

III. Natural Experiments
Look for experiments happening in the real world.
Promise greater generalizability than controlled lab experiments.
Require greater care to ensure validity of causal identification.

c) Instrumental variables
Shock!
Increase in
traffic
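The instrumental-variable idea can be illustrated on simulated data. The sketch below is a toy example with made-up coefficients, not the analysis from the talk: an instrument z (think of an external shock) shifts x, an unobserved confounder u affects both x and y, and the ratio of covariances recovers the effect of x on y where a plain regression slope does not.

```python
import random

random.seed(0)
n = 20000
true_effect = 2.0  # made-up causal effect of x on y

z = [random.gauss(0, 1) for _ in range(n)]  # instrument, independent of u
u = [random.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_effect * xi + 3.0 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

naive = cov(x, y) / cov(x, x)  # regression slope of y on x, biased by u
iv = cov(z, y) / cov(z, x)     # ratio (Wald) estimator using the instrument

print(f"naive slope: {naive:.2f}")
print(f"IV estimate: {iv:.2f} (true effect {true_effect})")
```

The estimator is only valid under the usual assumptions: z affects y solely through x and is unrelated to u. Both hold here by construction.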

Summary: Two graphical criteria explain all of the conventional approaches
A principled, succinct framework for causality.
Allows arbitrary functional forms for relationships between variables.
Leads to clear statements about causal assumptions.
If a causal effect can be identified, it can be derived using do-calculus (helpful for bigger graphs).

Product recommendations on Amazon
Do recommendations expose people to new products?
Do recommendations lead to more purchases?

Counterfactual reasoning
What would have happened if there were no recommendations?

Why is estimating effects of recommendations difficult using observational data?
AX = Activity on current item that the user is viewing
AY = Activity on the recommended item
UX = Latent properties of X
UY = Latent properties of Y
If latent properties for X and Y are correlated, then observed changes in AY cannot be directly attributed to AX.
[Figure: a causal graphical model for the impact of recommendations, with nodes AX, AY, UX and UY (ref. Pearl 09)]

AX = Visits on a product X on Amazon
AY = Recommendation click-throughs from X to Y
UX = Consumer demand for X
UY = Consumer demand for Y
If latent properties for X and Y are correlated, then observed changes in AY cannot be directly attributed to AX.
[Figure: a causal graphical model for the impact of recommendations, with nodes AX, AY, UX and UY]

Example: Looking for a machine learning book
Observed clickthrough data due to recommendations do not tell the full story.
For example, let’s assume I just completed the Artificial Intelligence book by Russell and Norvig and now I want to learn more about machine learning.

Xi: Focal Product
Yj: Recommended Products

Xi: Focal Product
Yj: Recommended Products
◦ Causal link
◦ Convenience link
◦ Revisit link
◦ Wasted link
There could also be irrelevant links.

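The Shock strategy (I.V.)
If direct visits to product Yj are nearly constant, then we can assume that the convenience clicks to Yj will be nearly constant.
Thus, any change in recommendation click-throughs to Yj during a shock to Xi can be attributed to the recommendation.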

The Shock strategy
We cannot say much during normal traffic for a product. But if a product experiences a spike in
visits and its recommended product does not, then we can demonstrate a method to compute
the causal clickthrough rate.
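One plausible reading of the shock strategy, sketched below, is to divide the extra recommendation click-throughs to Yj by the extra visits to Xi during the shock. The function name, its arguments and the example numbers are hypothetical, and the exact estimator used in the study may differ.

```python
def causal_ctr_estimate(visits_before, visits_during, clicks_before, clicks_during):
    """Extra click-throughs to Yj per extra visit to Xi during a shock.

    Assumes direct visits to Yj (and hence convenience clicks) stayed constant,
    so the change in click-throughs is attributed to the recommendation.
    """
    extra_visits = visits_during - visits_before
    if extra_visits <= 0:
        raise ValueError("no shock: visits to the focal product did not increase")
    extra_clicks = clicks_during - clicks_before
    return extra_clicks / extra_visits

# Made-up numbers: visits to Xi jump from 100 to 600 per day,
# click-throughs to Yj rise from 20 to 45 per day.
estimate = causal_ctr_estimate(100, 600, 20, 45)
observed = 45 / 600  # click-through rate observed during the shock

print(f"observed CTR during shock: {observed:.3f}")
print(f"estimated causal CTR:      {estimate:.3f}")
```

In this toy case the observed rate during the shock overstates the rate attributable to the recommendation, because the 20 baseline clicks are treated as convenience clicks.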

Data description
Dataset: Anonymized Amazon URL log data from Bing toolbar for opted-in users.
Nine months (Sept. 1 2013 to May 31 2014).
URL structure allows us to determine:
◦ Type of page visited (product, search, cart, bestsellers, wishlist)
◦ Type of referral to a product (recommendation, search, none, others)
After filtering out bots, sellers, authors, publishers and unpopular products (<5 visits):
◦ Number of products = 1.38M
◦ Number of users = 2.1M
◦ 60 product categories (such as Books, Toys, Electronics)

Implementing the strategy: The shock criteria
Large: Visits during a shock must exceed 5 times the median traffic for a product
Sudden: Visits during a shock must be 5 times the last day’s traffic and 5 times the last week’s traffic
Sane: Visits from at least 10 unique users and on 5 different days before and after a shock
4776 shocks to 4126 products
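The "large" and "sudden" criteria can be sketched as a check on a product's daily visit counts. The input format is hypothetical. The median is taken over the whole series supplied, and "last week's traffic" is read as the visits on the same day one week earlier; both are assumptions. The "sane" criterion is left out because it needs per-user logs.

```python
from statistics import median

def is_shock(daily_visits, day, factor=5):
    """Check the 'large' and 'sudden' criteria for one product on one day.

    daily_visits: visit counts, one entry per day (hypothetical input format).
    day: index of the candidate shock day.
    """
    if day < 7:
        return False  # need a full week of history for the 'sudden' check
    today = daily_visits[day]
    # Large: exceeds `factor` times the product's median daily traffic.
    large = today > factor * median(daily_visits)
    # Sudden: at least `factor` times yesterday's and last week's traffic.
    sudden = (today >= factor * daily_visits[day - 1]
              and today >= factor * daily_visits[day - 7])
    return large and sudden

# Made-up traffic with a spike on day 8.
visits = [10, 12, 9, 11, 10, 13, 10, 11, 80, 12]
print([d for d in range(len(visits)) if is_shock(visits, d)])
```

Only the spike on day 8 passes both checks in this toy series.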

Implementing the strategy: The shock criteria
Additionally, we want direct visits to Yj to be constant. The maximum change in direct visits to Yj should not be bigger than the size of the shock.
When beta=1, ideally causal. When beta=0, all bets are off.
[Figure: a good shock, and a bad shock (filtered out at beta=0.7)]

Results: Fraction of causal clickthroughs by category
Majority of the clickthroughs are due to convenience.
Within any category, 5% or lower is a more accurate estimate of clickthroughs caused by recommendations.

Robustness checks
Shocks may not be representative
◦ The distributions of users, popularity, and affinity between users and products do not differ much (except that shocked products are, on average, more popular).
Shocks may be caused by deals which make the focal product more attractive
◦ Verification using referrals from log data (e.g. bookbub.com) and manual inspection of past prices (from camelcamelcamel.com)
Shocks may be a property of the weird holiday season.
◦ They occur throughout the data, although with more frequency during the holidays.

Graphical models form a succinct, sound and complete framework for reasoning about causality.
They can also be practical.
THANK YOU!
AMIT SHARMA, MICROSOFT RESEARCH
http://www.amitsharma.in
@amt_shrma
