Bias in Recommendations
@ SIKS Course "Advances in Information Retrieval"
David Graus
✉ david.graus@fdmediagroep.nl
🐦 @dvdgrs
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
whoami
• 🎓 Academia
• BA Media Studies @ UvA (2008)
• MSc Media Technology @ Universiteit Leiden (2012)
• PhD Information Retrieval @ UvA (2017)
• 🏢 Industry
• Editor radio/online public broadcaster NTR (between BA & MSc)
• Research Intern @ Microsoft Research, US
• Data Scientist @ Company.info (FD Mediagroep)
• Lead Data Scientist @ FD SMART Journalism / BNR SMART Radio
In what is to follow…
• An introduction of FD Mediagroep
• Personalization & RecSys at FD Mediagroep
• Two flavors of bias in RecSys
• Model/Algorithmic bias
• Perceived bias in personalization
Part 1: Introduction
FD Mediagroup
The leading information provider in the financial economic domain in the Netherlands
AI @ FD Mediagroup
Team
Dung Bahadir Anca Philippe
Maya David Feng Li’ao
Klaus Oberon Manon Azamat
AI @ FDMG: Academia/Industry
SMART Radio
• (Transcribe)
• Segment
• Tag
• Serve
Transcribe
Segment
• Based on metadata, text, and audio.
Tag
• Simple multilabel text classifier
• Trained on transcripts of segments + associated tags from website
Serve
• iOS/Android app
Part 2: SMART Journalism
SMART Journalism
• Moonshot; personalized summarization
• How to get there:
• Content Understanding
• Content-based Recommender System; <user, article>
• Personalized snippet retrieval; <user, snippet-in-article>
• Snippet-to-summary abstractor (?)
Tech
[Diagram: the RecSys matches a User against candidate Articles, producing one matching score per article: 0.352, 0.795, 0.125, 0.643]
[Diagram: the User is represented by a Reader Profile and each Article by an Article Profile; the RecSys matches the two profiles]
Article Representation
Article → Article Profile
'Meer regelgeving cryptogeld noodzakelijk' ("More regulation of cryptocurrency needed")
• Tags: Blockchain, Cryptocurrency, Regelgeving (regulation)
• Section (Rubriek): Economie & Politiek (Economy & Politics)
• Stylometry: CharLen=2424, WordLen=486
• Entities: -
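User Profile
User → User Profile
The user reads 'Qualcomm krijgt bijna €1 mrd boete van Brussel' ("Qualcomm gets a fine of almost €1 billion from Brussels"):
• Tags: Boete (fine), Chips, EU, Mededinging (competition)
• Section (Rubriek): Ondernemen (Business)
• Stylometry: CharLen=3491, WordLen=635
• Entities: Qualcomm, Apple, NXP, Intel, Google
The article's attributes are copied into the user profile:
• Tags: Boete, Chips, EU, Mededinging
• Section (Rubriek): Ondernemen
• Stylometry: CharLen=3491, WordLen=635
• Entities: Qualcomm, Apple, NXP, Intel, Google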
The user then reads a second article, 'Topman van softwaremaker Salesforce kraakt grote techbedrijven' ("CEO of software maker Salesforce slams big tech companies"):
• Tags: Big Data, Blog, Davos, Google, Technologie
• Section (Rubriek): Davos
• Stylometry: CharLen=2856, WordLen=524
• Entities: Google, Apple, Microsoft, Salesforce
The user profile now aggregates both articles ('Qualcomm krijgt bijna €1 mrd boete van Brussel' and the Salesforce article):
• Tags: Boete, Chips, EU, Mededinging, Big Data, Blog, Davos, Google, Technologie
• Section (Rubriek): Ondernemen, Davos
• Stylometry: CharLen=3491, WordLen=635; CharLen=2856, WordLen=524
• Entities: Qualcomm, Apple (2), NXP, Intel, Google (2), Microsoft, Salesforce
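The profile build-up above can be sketched in a few lines of Python. This is a minimal illustration, not the FD Mediagroep implementation: the `Article` and `UserProfile` classes and their field names are hypothetical, and counting occurrences with `Counter` is an assumption based on the "Apple (2)" and "Google (2)" counts shown on the slide.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Article:
    title: str
    tags: list
    section: str
    char_len: int
    word_len: int
    entities: list


@dataclass
class UserProfile:
    # Counters record how often each attribute value occurred across reads.
    tags: Counter = field(default_factory=Counter)
    sections: Counter = field(default_factory=Counter)
    entities: Counter = field(default_factory=Counter)
    lengths: list = field(default_factory=list)  # one (char_len, word_len) per read

    def add_read(self, article: Article) -> None:
        """Fold the attributes of one read article into the profile."""
        self.tags.update(article.tags)
        self.sections.update([article.section])
        self.entities.update(article.entities)
        self.lengths.append((article.char_len, article.word_len))


qualcomm = Article(
    title="Qualcomm krijgt bijna €1 mrd boete van Brussel",
    tags=["Boete", "Chips", "EU", "Mededinging"],
    section="Ondernemen",
    char_len=3491,
    word_len=635,
    entities=["Qualcomm", "Apple", "NXP", "Intel", "Google"],
)
salesforce = Article(
    title="Topman van softwaremaker Salesforce kraakt grote techbedrijven",
    tags=["Big Data", "Blog", "Davos", "Google", "Technologie"],
    section="Davos",
    char_len=2856,
    word_len=524,
    entities=["Google", "Apple", "Microsoft", "Salesforce"],
)

profile = UserProfile()
profile.add_read(qualcomm)
profile.add_read(salesforce)
print(profile.entities["Apple"], profile.entities["Google"])  # 2 2
```

Keeping counts rather than plain sets is one possible design choice: it lets frequently read entities or tags weigh more heavily when the profile is later matched against article profiles.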
Model
• Content-based RecSys
• Ranking with point-wise learning to rank (LTR)
• Features: user, article, user-article features (~14k)
• Labels: implicit feedback
• Clicks (i.e., click = 1, non-click = 0)
• Trained nightly
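To make the point-wise setup concrete, here is a minimal sketch in plain Python. The slide does not name the learner, so logistic regression trained with SGD is an assumption, and the three features, the training rows, and the article ids are all hypothetical stand-ins for the roughly 14k real features. Each (user, article) pair is scored on its own from click labels, and candidates are then sorted by score.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_pointwise(examples, epochs=200, lr=0.5, seed=0):
    """Fit logistic regression on (features, click) pairs with plain SGD."""
    rng = random.Random(seed)
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)
        for features, click in data:
            pred = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            error = pred - click  # gradient of the log loss w.r.t. the logit
            bias -= lr * error
            weights = [w - lr * error * x for w, x in zip(weights, features)]
    return weights, bias


def score(model, features):
    """Predicted click probability for one (user, article) feature vector."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


# Hypothetical user-article features: [tag overlap, same section, entity overlap]
# Label: click = 1, non-click = 0
training = [
    ([0.9, 1.0, 0.8], 1),
    ([0.7, 1.0, 0.5], 1),
    ([0.8, 0.0, 0.6], 1),
    ([0.1, 0.0, 0.0], 0),
    ([0.2, 1.0, 0.1], 0),
    ([0.0, 0.0, 0.2], 0),
]
model = train_pointwise(training)

candidates = {
    "article_a": [0.85, 1.0, 0.7],
    "article_b": [0.05, 0.0, 0.1],
    "article_c": [0.5, 1.0, 0.3],
}
ranking = sorted(candidates, key=lambda a: score(model, candidates[a]), reverse=True)
print(ranking)
```

Because every pair is scored independently, retraining nightly only requires the latest click log; the ranking itself is a simple sort at serving time.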
Bias?
• “Disproportionate weight in favor of or against an idea or thing,
usually in a way that is closed-minded, prejudicial, or unfair.”
Bias in RecSys
“Algorithmic”
I. In Collaborative Filtering methods
II. In implicit feedback/clicks
Collaborative Filtering
Bias in CF
• It is more difficult to predict ratings of infrequently rated items in Collaborative Filtering
• Bias: disproportionate weight in favor of popular items
• “It is generally not useful to recommend very popular items as they are generally already known by the user” [2]
• “A market that suffers from popularity bias will lack opportunities to discover more obscure products and will be, by definition, dominated by a few large brands […]” [3]
• Solution: cluster long-tail items
[1] Park & Tuzhilin, The Long Tail of Recommender Systems and How to Leverage It (RecSys ’08)
[2] Meyer, F., Recommender systems in industrial contexts (2012)
[3] Abdollahpouri et al., The Unfairness of Popularity Bias in Recommendation (RMSE@RecSys ’19)
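One simple way to see the "disproportionate weight in favor of popular items" is to compare how much of the catalogue the popular head makes up with how many recommendation slots it receives. The sketch below is illustrative only: the 20% head cut-off, the function names, and the toy interaction log are assumptions, not something taken from the cited papers.

```python
from collections import Counter


def head_items(interactions, head_fraction=0.2):
    """Return the most-interacted-with items (the 'short head')."""
    counts = Counter(item for _, item in interactions)
    n_head = max(1, round(len(counts) * head_fraction))
    return {item for item, _ in counts.most_common(n_head)}


def head_share(recommendations, head):
    """Fraction of all recommended slots that go to head items."""
    slots = [item for recs in recommendations.values() for item in recs]
    return sum(item in head for item in slots) / len(slots)


# Hypothetical (user, item) interaction log: item "a" is far more popular.
interactions = [
    ("u1", "a"), ("u2", "a"), ("u3", "a"), ("u4", "a"),
    ("u1", "b"), ("u2", "b"),
    ("u3", "c"), ("u4", "d"), ("u1", "e"),
]
head = head_items(interactions)  # 5 distinct items, 20% of them -> {"a"}

# Hypothetical top-2 recommendation lists per user.
recommendations = {
    "u1": ["a", "c"],
    "u2": ["a", "d"],
    "u3": ["a", "b"],
    "u4": ["a", "e"],
}
print(head, head_share(recommendations, head))  # {'a'} 0.5
```

Here the head is 1 of 5 items (20% of the catalogue) but takes half of the recommendation slots, which is the kind of skew the slide describes.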
Bias in implicit feedback
• Popular items are overrepresented in implicit feedback
• Position/“trust” bias (see Joachims et al., 2005)
• An eye-tracking study plus a comparison with explicit feedback shows:
• Clicks reflect relevance judgments
• Highly ranked results receive more clicks
Joachims et al., Accurately Interpreting Clickthrough Data as Implicit Feedback (SIGIR ’05)
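A quick first check for position bias in a click log is the click-through rate per rank position. The sketch below uses a hypothetical impression log of (position, clicked) pairs; the function name and the data are assumptions for illustration.

```python
from collections import defaultdict


def ctr_by_position(impressions):
    """impressions: iterable of (position, clicked) pairs; returns {position: CTR}."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for position, was_clicked in impressions:
        shown[position] += 1
        clicked[position] += int(was_clicked)
    return {pos: clicked[pos] / shown[pos] for pos in sorted(shown)}


# Hypothetical impression log: four impressions at each of three positions.
log = [
    (1, True), (1, True), (1, False), (1, True),
    (2, True), (2, False), (2, False), (2, True),
    (3, False), (3, False), (3, True), (3, False),
]
ctr = ctr_by_position(log)
print(ctr)  # {1: 0.75, 2: 0.5, 3: 0.25}
```

A falling curve like this does not by itself separate position bias from relevance, since higher-ranked items also tend to be more relevant. That is why a comparison against explicit judgments, as in the study cited above, is needed before treating clicks as unbiased labels.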
Perceived Bias from RecSys
• A state of intellectual isolation that allegedly can result from personalized searches when a website algorithm selectively guesses what information a user would like to see based on information about the user.
• As a result, users become separated from information that disagrees with their viewpoints.
Measuring personalization
• On average, 11.7% of results show differences due to
personalization on Google.
• Varies widely by search query and by result ranking.
• Only found measurable personalization as a result of searching
with a logged in account and the IP address of the searching user.
35
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
Method
36[Hannák et al., 2013]
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
Method
1. 👤
36[Hannák et al., 2013]
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
Method
1. 👤
1. Get 200 volunteers with Google accounts
36[Hannák et al., 2013]
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
Method
1. 👤
1. Get 200 volunteers with Google accounts
2. Have them issue the same set of queries
36[Hannák et al., 2013]
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
Method
1. 👤
1. Get 200 volunteers with Google accounts
2. Have them issue the same set of queries
3. Compare results
36[Hannák et al., 2013]
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
Method
1. 👤
1. Get 200 volunteers with Google accounts
2. Have them issue the same set of queries
3. Compare results
2. 🤖
36[Hannák et al., 2013]
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
Method
1. 👤
1. Get 200 volunteers with Google accounts
2. Have them issue the same set of queries
3. Compare results
2. 🤖
1. Construct Google bot accounts
36[Hannák et al., 2013]
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
Method
1. 👤
1. Get 200 volunteers with Google accounts
2. Have them issue the same set of queries
3. Compare results
2. 🤖
1. Construct Google bot accounts
• Vary aspects such as location, demographics, click behavior, browsing + search
history, etc.
36[Hannák et al., 2013]
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
Method
1. 👤
1. Get 200 volunteers with Google accounts
2. Have them issue the same set of queries
3. Compare results
2. 🤖
1. Construct Google bot accounts
• Vary aspects such as location, demographics, click behavior, browsing + search
history, etc.
2. Have them issue the same set of queries
36[Hannák et al., 2013]
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
Method
1. 👤
  1. Get 200 volunteers with Google accounts
  2. Have them issue the same set of queries
  3. Compare results
2. 🤖
  1. Construct Google bot accounts
    • Vary aspects such as location, demographics, click behavior, browsing + search history, etc.
  2. Have them issue the same set of queries
  3. Compare results
[Hannák et al., 2013]
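The "compare results" step needs a measure of how much two result lists differ. The slides do not say which measures were used; two common choices, sketched here as an assumption, are the Jaccard index (same set of results, ignoring order) and an edit distance over the ranked lists (sensitive to order). The URLs are placeholders.

```python
def jaccard(results_a, results_b):
    """Overlap of two result lists, ignoring order (1.0 = same set of results)."""
    set_a, set_b = set(results_a), set(results_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def edit_distance(results_a, results_b):
    """Levenshtein distance between two ranked lists (order-sensitive)."""
    previous = list(range(len(results_b) + 1))
    for i, item_a in enumerate(results_a, start=1):
        current = [i]
        for j, item_b in enumerate(results_b, start=1):
            substitution = previous[j - 1] + (item_a != item_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


# Hypothetical result lists for the same query from two accounts.
control = ["url1", "url2", "url3", "url4"]
treatment = ["url1", "url3", "url2", "url5"]

print(jaccard(control, treatment))        # 0.6
print(edit_distance(control, treatment))  # 3
```

Using both measures together separates two effects: a low Jaccard index means different results were shown, while a high Jaccard index with a non-zero edit distance means the same results were only reordered.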
👤 Findings
• On average, 11.7% of results show differences due to personalization on Google.
• Top ranks tend to be less personalized than bottom ranks.
[Hannák et al., 2013]
👤 Findings
• ✅ Personalization based on location (e.g., company names)
• ❌ The least personalized results tend to be for factual and health-related queries.
[Hannák et al., 2013]
🤖 Findings
✅ Logged in vs. “cleared cookies” account
✅ Geolocation
❌ Gender
❌ Age
❌ Search history
❌ Click history
❌ Browsing history
[Hannák et al., 2013]
Diversity to pop the filter bubble
Method
• Split MovieLens users into two groups:
• “Followers”: users who rated movies they were recommended
• “Ignorers”: users who rated movies they were not recommended
• Compare between groups, over time:
• Diversity of recommendations
• Ratings of movies
[Nguyen et al., 2014]
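To compare the "diversity of recommendations" between groups, each recommendation list needs a diversity score. The slide does not specify the measure, so the sketch below assumes a generic one: the mean pairwise Jaccard distance between the tag sets of the recommended items. The movie ids and tags are hypothetical.

```python
from itertools import combinations


def jaccard_distance(tags_a, tags_b):
    """1 - Jaccard similarity of two tag sets (0.0 = identical, 1.0 = disjoint)."""
    union = tags_a | tags_b
    if not union:
        return 0.0
    return 1.0 - len(tags_a & tags_b) / len(union)


def intra_list_diversity(recommended, item_tags):
    """Mean pairwise tag distance within one recommendation list."""
    pairs = list(combinations(recommended, 2))
    if not pairs:
        return 0.0
    total = sum(jaccard_distance(item_tags[a], item_tags[b]) for a, b in pairs)
    return total / len(pairs)


# Hypothetical movies with hypothetical tags.
item_tags = {
    "m1": {"sci-fi", "space"},
    "m2": {"sci-fi", "space"},
    "m3": {"romance", "comedy"},
}
narrow = intra_list_diversity(["m1", "m2"], item_tags)  # identical tag sets
broad = intra_list_diversity(["m1", "m3"], item_tags)   # disjoint tag sets
print(narrow, broad)  # 0.0 1.0
```

Computing this score per user and per time window, then averaging within the follower and ignorer groups, would give the kind of over-time comparison the method describes.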
Findings
1. Diversity
• In both groups, diversity decreases over time.
• The effect is lessened for users who consume recommended items (followers)
2. Ratings
• Slight decrease in average ratings for ignorers (3.74 to 3.55).
• Stable average ratings for followers (~3.68).
[Nguyen et al., 2014]
Diversity in RecSys 🤖 vs. humans 👤?
Method
44[Möller et al. 2018]
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
Method
• 🤖 Generate article recommendations for news articles using
different RecSys algorithms (CF & CB).
44[Möller et al. 2018]
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
Method
• 🤖 Generate article recommendations for news articles using
different RecSys algorithms (CF & CB).
• 👤 Compare to hand-picked article recommendations.
44[Möller et al. 2018]
David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019
Method
• 🤖 Generate article recommendations for news articles using
different RecSys algorithms: collaborative filtering (CF) and content-based (CB).
• 👤 Compare to hand-picked article recommendations.
• Measure & compare “diversity” of recommended articles:
• At content level
• At tag level
• At category level
• At sentiment/subjectivity level
44[Möller et al. 2018]
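As a rough illustration of what measuring diversity at category level could look like, the sketch below scores one recommendation list by the Shannon entropy of its category labels. This is an assumption made for illustration, not necessarily the metric used by Möller et al.; the function name `shannon_diversity` and the example labels are hypothetical.

```python
import math
from collections import Counter


def shannon_diversity(labels):
    """Shannon entropy (in bits) of the label distribution in one recommendation list.

    Illustrative metric only; the cited study may define diversity differently.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum p * log2(1/p) over all labels that occur in the list.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical category labels of the articles in two recommendation lists.
algorithmic = ["politics", "politics", "economy", "sports"]
hand_picked = ["politics", "politics", "politics", "politics"]

print(shannon_diversity(algorithmic))  # 1.5
print(shannon_diversity(hand_picked))  # 0.0
```

The same function could be applied to tags or to binned sentiment scores, so that the algorithmic and hand-picked lists are compared on the same scale at each level.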
Findings
“Conventional recommendation algorithms at least preserve the
topic/sentiment diversity of the article supply.”
45[Möller et al. 2018]
More diversity
46
Aim
Increase exposure to varied political opinions, with the goal of
improving civil discourse
47[Yom-Tov et al. 2014]
Method
• Classify searchers by political leaning (using geo data)
48[Yom-Tov et al. 2014]
Method
• Infer political leaning of news sources from user behavior.
• Identify polarized search queries (those with a strong political
leaning, in either direction).
49[Yom-Tov et al. 2014]
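One simple way to picture inferring a source's leaning from user behavior is to compare how often it is clicked by "red" versus "blue" users. The sketch below is a simplified assumption for illustration, not the procedure from Yom-Tov et al.; the function name `source_leaning`, the score range, and the click log are hypothetical.

```python
from collections import defaultdict


def source_leaning(clicks):
    """Score each source in [-1, 1]: -1 = clicked only by blue users, +1 = only by red users.

    `clicks` is an iterable of (user_leaning, source) pairs, where
    user_leaning is "red" or "blue". Hypothetical scoring, for illustration.
    """
    counts = defaultdict(lambda: {"red": 0, "blue": 0})
    for user_leaning, source in clicks:
        counts[source][user_leaning] += 1
    return {
        source: (c["red"] - c["blue"]) / (c["red"] + c["blue"])
        for source, c in counts.items()
    }


# Hypothetical click log: (leaning of the searcher, clicked news source).
clicks = [
    ("red", "source_a"),
    ("red", "source_a"),
    ("blue", "source_a"),
    ("blue", "source_b"),
    ("blue", "source_b"),
]

leanings = source_leaning(clicks)
print(leanings)
```

Under the same assumption, a query could be flagged as polarized when the sources clicked for it have scores far from zero.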
Method
• Treatment group: Insert red results for blue users, and blue
results for red users
• Control group: Do not adjust results
50[Yom-Tov et al. 2014]
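The treatment and control conditions can be sketched as a small re-ranking step. Everything here is an assumption for illustration: the function names `insert_opposing` and `serve`, the insertion positions, and the result format are hypothetical, and the slides do not state where opposing results were placed.

```python
def insert_opposing(results, user_leaning, opposing_pool, positions=(2, 4)):
    """Return a copy of `results` with opposing-leaning items inserted.

    `positions` are 0-based indices in the growing list (an arbitrary choice here).
    """
    opposite = "blue" if user_leaning == "red" else "red"
    candidates = [r for r in opposing_pool if r["leaning"] == opposite]
    mixed = list(results)
    for pos, item in zip(positions, candidates):
        mixed.insert(min(pos, len(mixed)), item)
    return mixed


def serve(results, user_leaning, opposing_pool, group):
    """Treatment group gets opposing results inserted; control group gets the list unchanged."""
    if group == "treatment":
        return insert_opposing(results, user_leaning, opposing_pool)
    return list(results)


# Hypothetical ranked results for a "red" user, plus a pool of other results.
results = [{"title": t, "leaning": "red"} for t in ("a", "b", "c", "d")]
pool = [
    {"title": "x", "leaning": "blue"},
    {"title": "y", "leaning": "blue"},
    {"title": "z", "leaning": "red"},
]

treated = serve(results, "red", pool, "treatment")
control = serve(results, "red", pool, "control")
print([r["title"] for r in treated])  # ['a', 'b', 'x', 'c', 'y', 'd']
print([r["title"] for r in control])  # ['a', 'b', 'c', 'd']
```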
Method
1. Short term: Compare clicks/behavior between control &
treatment.
2. Long term: Measure over two weeks, per user:
1. Polarization: Difference between the user’s leaning score and the
average leaning across all sources.
2. Engagement: Average number of queries and average number of
articles read.
51
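The two long-term measures can be sketched as follows. This assumes a user's leaning score is the mean leaning of the sources they clicked, and that engagement is averaged over users; both are assumptions for illustration, and the function names and numbers are hypothetical.

```python
from statistics import mean


def polarization(user_click_leanings, source_leanings):
    """Absolute gap between a user's mean leaning and the mean leaning over all sources."""
    return abs(mean(user_click_leanings) - mean(source_leanings))


def engagement(user_logs):
    """Average number of queries and average number of articles read, over users."""
    return (
        mean(log["queries"] for log in user_logs),
        mean(log["articles_read"] for log in user_logs),
    )


# Hypothetical leaning scores in [-1, 1] for all sources, and for one user's clicks.
all_sources = [-1.0, -0.5, 0.0, 0.5, 1.0]
user_clicks = [1.0, 0.5, 1.0]

# Hypothetical two-week activity logs for two users.
logs = [
    {"queries": 10, "articles_read": 4},
    {"queries": 6, "articles_read": 2},
]

user_polarization = polarization(user_clicks, all_sources)
avg_queries, avg_articles = engagement(logs)
print(user_polarization, avg_queries, avg_articles)
```

Computing these per group (control and treatment) at the start and end of the two weeks gives the before/after comparison the method describes.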
Findings I
• Fewer clicks on inserted opposing sources.
• But: “Results pages of the opposing viewpoint which had a similarity
higher than the average tended to be clicked 38% more than those
below the average.”
52[Yom-Tov et al. 2014]
Findings II
• Polarization:
• Treatment: Average leaning ‘moves’ ~25% to centre
• Control: Negligible difference (~1%)
• Engagement:
• Treatment: Number of queries: +9% / articles read: +4%
• Control: Small reduction in both (~2.5%)
53[Yom-Tov et al. 2014]
Refs
Algorithmic bias
1. Park & Tuzhilin, The Long Tail of Recommender Systems and How to Leverage It (RecSys ’08)
2. Meyer, Recommender systems in industrial contexts (2012)
3. Abdollahpouri et al., The Unfairness of Popularity Bias in Recommendation (RMSE@RecSys ’19)
4. Joachims et al., Accurately Interpreting Clickthrough Data as Implicit Feedback (SIGIR ’05)
Perceived bias / filter bubbles
5. Hannak et al., Measuring personalization of web search (WWW ’13)
6. Nguyen et al., Exploring the filter bubble: the effect of using recommender systems on content diversity (WWW ’14)
7. Möller et al., Do not blame it on the algorithm: An empirical assessment of multiple recommender systems and their impact on content diversity (Information, Communication & Society ’18)
8. Yom-Tov et al., Promoting Civil Discourse Through Search Engine Diversity (Social Science Computer Review, ’13)
54
Student profile product demonstration on grades, ability, well-being and mind...
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
 

Bias in Recommendations

  • 1. Bias in Recommendations @ SIKS Course "Advances in Information Retrieval" • David Graus ✉ david.graus@fdmediagroep.nl 🐦 @dvdgrs
  • 4. David Graus • SIKS Course: Advances in Information Retrieval • 08/10/2019 whoami • 🎓 Academia • BA Media Studies @ UvA (2008) • MSc Media Technology @ Universiteit Leiden (2012) • PhD Information Retrieval @ UvA (2017) • 🏢 Industry • Editor radio/online public broadcaster NTR (between BA & MSc) • Research Intern @ Microsoft Research, US • Data Scientist @ Company.info (FD Mediagroep) • Lead Data Scientist @ FD SMART Journalism / BNR SMART Radio
  • 8. In what is to follow… • An introduction to FD Mediagroep • Personalization & RecSys at FD Mediagroep • Two flavors of bias in RecSys • Model/Algorithmic bias • Perceived bias in personalization
  • 15. FD Mediagroup • The leading information provider in the financial economic domain in the Netherlands
  • 19. AI @ FD Mediagroup
  • 24. Team • Dung Bahadir Anca Philippe Maya David Feng Li’ao Klaus Oberon Manon Azamat
  • 28. AI @ FDMG: Academia/Industry
  • 30. SMART Radio • (Transcribe) • Segment • Tag • Serve
  • 31. Transcribe
  • 33. Segment • Based on metadata, text, and audio.
  • 34. Tag • Simple multilabel text classifier • Trained on transcripts of segments + associated tags from website
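To make the "simple multilabel text classifier" idea concrete, here is a minimal sketch. It assumes a binary-relevance setup (one naive Bayes model per tag) and uses invented toy transcripts and tags; the slides do not say which classifier or features are actually used, so every name below is hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an illustrative simplification)."""
    return text.lower().split()


class BinaryRelevanceNB:
    """One multinomial naive Bayes model per tag: tag present vs. absent."""

    def fit(self, texts, tag_sets):
        docs = [tokenize(t) for t in texts]
        self.vocab = {w for d in docs for w in d}
        self.models = {}
        for tag in sorted({t for ts in tag_sets for t in ts}):
            counts = {True: Counter(), False: Counter()}
            n_docs = {True: 0, False: 0}
            for doc, ts in zip(docs, tag_sets):
                label = tag in ts
                counts[label].update(doc)
                n_docs[label] += 1
            self.models[tag] = (counts, n_docs)
        return self

    def _log_score(self, tokens, counts, n_docs, label):
        # Laplace-smoothed class prior and word likelihoods.
        total_docs = n_docs[True] + n_docs[False]
        score = math.log((n_docs[label] + 1) / (total_docs + 2))
        denom = sum(counts[label].values()) + len(self.vocab)
        for w in tokens:
            if w in self.vocab:  # ignore words never seen in training
                score += math.log((counts[label][w] + 1) / denom)
        return score

    def predict(self, text):
        """Return every tag whose 'present' model beats its 'absent' model."""
        tokens = tokenize(text)
        return {
            tag
            for tag, (counts, n_docs) in self.models.items()
            if self._log_score(tokens, counts, n_docs, True)
            > self._log_score(tokens, counts, n_docs, False)
        }


# Hypothetical training data: segment transcripts plus website tags.
transcripts = [
    "bitcoin blockchain regulation",
    "chips fine eu commission",
    "bitcoin fine regulation eu",
    "chips factory production",
]
tags = [{"crypto"}, {"chips", "eu"}, {"crypto", "eu"}, {"chips"}]

clf = BinaryRelevanceNB().fit(transcripts, tags)
```

Because each tag is decided independently, a segment can receive zero, one, or several tags, which is what distinguishes multilabel from multiclass classification.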
  • 37. Serve • iOS/Android app
  • 38. Part 2: SMART Journalism
  • 48. SMART Journalism • Moonshot: personalized summarization • How to get there: • Content Understanding • Content-based Recommender System; <user, article> • Personalized snippet retrieval; <user, snippet-in-article> • Snippet-to-summary abstractor (?)
  • 49. Tech
  • 56. [Diagram: RecSys Matching assigns a score to each <User, Article> pair; example scores 0.352, 0.795, 0.125, 0.643]
  • 60. [Diagram: User with Reader Profile, Articles with Article Profile, both feeding RecSys Matching]
  • 66. Article Representation • Article: 'Meer regelgeving cryptogeld noodzakelijk' (“More regulation of cryptocurrency necessary”) • Article Profile: • Tags: Blockchain, Cryptocurrency, Regelgeving • Section: Economie & Politiek • Stylometry: CharLen=2424, WordLen=486 • Entities: -
  • 80. User Profile • Article in the user's history: 'Qualcomm krijgt bijna €1 mrd boete van Brussel' (“Qualcomm fined almost €1 bn by Brussels”) • Tags: Boete, Chips, EU, Mededinging • Section: Ondernemen • Stylometry: CharLen=3491, WordLen=635 • Entities: Qualcomm, Apple, NXP, Intel, Google • Resulting User Profile: • Tags: Boete, Chips, EU, Mededinging • Section: Ondernemen • Stylometry: CharLen=3491, WordLen=635 • Entities: Qualcomm, Apple, NXP, Intel, Google
  • 87. User Profile after a second article • Article 1: 'Qualcomm krijgt bijna €1 mrd boete van Brussel' • Article 2: 'Topman van softwaremaker Salesforce kraakt grote techbedrijven' (“CEO of software maker Salesforce slams big tech companies”) • Tags: Big Data, Blog, Davos, Google, Technologie • Section: Davos • Stylometry: CharLen=2856, WordLen=524 • Entities: Google, Apple, Microsoft, Salesforce • Aggregated User Profile: • Tags: Boete, Chips, EU, Mededinging, Big Data, Blog, Davos, Google, Technologie • Section: Ondernemen, Davos • Stylometry: CharLen=3491, WordLen=635, CharLen=2856, WordLen=524 • Entities: Qualcomm, Apple (2), NXP, Intel, Google (2), Microsoft, Salesforce
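The profile aggregation shown on this slide can be sketched as counting article attributes over the user's reading history. The dictionary layout and the function name `build_user_profile` are assumptions for illustration; only the article values come from the slides.

```python
from collections import Counter


def build_user_profile(read_articles):
    """Aggregate the profiles of all articles a user has read."""
    profile = {
        "tags": Counter(),
        "sections": Counter(),
        "entities": Counter(),
        "stylometry": [],  # one (char_len, word_len) pair per article
    }
    for article in read_articles:
        profile["tags"].update(article["tags"])
        profile["sections"][article["section"]] += 1
        profile["entities"].update(article["entities"])
        profile["stylometry"].append((article["char_len"], article["word_len"]))
    return profile


# The two example articles from the slides.
history = [
    {
        "title": "Qualcomm krijgt bijna €1 mrd boete van Brussel",
        "tags": ["Boete", "Chips", "EU", "Mededinging"],
        "section": "Ondernemen",
        "char_len": 3491,
        "word_len": 635,
        "entities": ["Qualcomm", "Apple", "NXP", "Intel", "Google"],
    },
    {
        "title": "Topman van softwaremaker Salesforce kraakt grote techbedrijven",
        "tags": ["Big Data", "Blog", "Davos", "Google", "Technologie"],
        "section": "Davos",
        "char_len": 2856,
        "word_len": 524,
        "entities": ["Google", "Apple", "Microsoft", "Salesforce"],
    },
]

profile = build_user_profile(history)
```

Keeping counts rather than plain sets preserves how often an attribute recurs, matching the "Apple (2)" and "Google (2)" entries in the aggregated profile.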
  • 88. Model • Content-based RecSys • Ranking with point-wise learning to rank (LTR) • Features: user, article, and user-article features (~14k) • Labels: implicit feedback • Clicks (i.e., click = 1, non-click = 0) • Trained nightly
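Point-wise LTR with click labels can be sketched as a binary classifier whose predicted click probability is used as the ranking score. The sketch below assumes logistic regression trained with plain SGD and two made-up user-article features; the slides specify neither the learner nor the real feature set (~14k features), so treat all names and values as hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, features):
    """Predicted click probability for one <user, article> pair."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train_pointwise(examples, n_features, epochs=500, lr=0.5):
    """Fit logistic regression on (features, clicked) pairs with SGD."""
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, clicked in examples:
            error = score(weights, bias, features) - clicked
            bias -= lr * error
            weights = [w - lr * error * x for w, x in zip(weights, features)]
    return weights, bias


def rank(weights, bias, candidates):
    """Sort candidate article ids by predicted click probability, best first."""
    return sorted(
        candidates,
        key=lambda name: score(weights, bias, candidates[name]),
        reverse=True,
    )


# Hypothetical features: [tag overlap with user profile, same section as last read].
training = [
    ([1.0, 1.0], 1),  # click
    ([0.8, 0.0], 1),  # click
    ([0.1, 1.0], 0),  # non-click
    ([0.0, 0.0], 0),  # non-click
]
weights, bias = train_pointwise(training, n_features=2)

candidates = {"relevant": [0.9, 0.0], "irrelevant": [0.05, 1.0]}
ranking = rank(weights, bias, candidates)
```

Each pair is scored independently of the other candidates, which is what makes the approach point-wise rather than pair-wise or list-wise.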
  • 89. Bias? • “Disproportionate weight in favor of or against an idea or thing, usually in a way that is closed-minded, prejudicial, or unfair.”
  • 90. Bias in RecSys: “Algorithmic” • I. In Collaborative Filtering methods • II. In implicit feedback/clicks
  • 92. Collaborative Filtering
  • 100. Bias in CF • It is more difficult to predict ratings of infrequently rated items in Collaborative Filtering • Bias: disproportionate weight in favor of popular items • “It is generally not useful to recommend very popular items as they are generally already known by the user” [2] • “A market that suffers from popularity bias will lack opportunities to discover more obscure products and will be, by definition, dominated by a few large brands […]” [3] • Solution: cluster long-tail items • References: [1] Park & Tuzhilin, The Long Tail of Recommender Systems and How to Leverage It (RecSys ’08) • [2] Meyer, F., Recommender systems in industrial contexts (2012) • [3] Abdollahpouri et al., The Unfairness of Popularity Bias in Recommendation (RMSE@RecSys ’19)
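The "cluster long-tail items" idea can be illustrated with a deliberately simplified sketch: items with few ratings are pooled per cluster so that predictions for them rest on more data. This is not the method of Park & Tuzhilin; the threshold, the cluster assignment, and the use of a plain mean as predictor are all assumptions made for illustration.

```python
from collections import defaultdict


def split_head_tail(ratings, min_ratings):
    """Split items into head (enough ratings) and tail (too few)."""
    by_item = defaultdict(list)
    for item, rating in ratings:
        by_item[item].append(rating)
    head = {i: rs for i, rs in by_item.items() if len(rs) >= min_ratings}
    tail = {i: rs for i, rs in by_item.items() if len(rs) < min_ratings}
    return head, tail


def pooled_tail_means(tail, cluster_of):
    """Pool the ratings of all tail items in a cluster and average them."""
    pooled = defaultdict(list)
    for item, item_ratings in tail.items():
        pooled[cluster_of[item]].extend(item_ratings)
    return {cluster: sum(rs) / len(rs) for cluster, rs in pooled.items()}


def predict(item, head, tail_means, cluster_of):
    """Head items use their own mean; tail items use their cluster's mean."""
    if item in head:
        return sum(head[item]) / len(head[item])
    return tail_means[cluster_of[item]]


# Hypothetical ratings as (item, rating) pairs and a hypothetical clustering.
ratings = [
    ("hit", 5), ("hit", 4), ("hit", 3),
    ("obscure_a", 5),
    ("obscure_b", 2),
]
cluster_of = {"obscure_a": "documentary", "obscure_b": "documentary"}

head, tail = split_head_tail(ratings, min_ratings=3)
tail_means = pooled_tail_means(tail, cluster_of)
```

With a single rating each, the two obscure items would otherwise be predicted from one data point; pooling trades item-level precision for a more stable estimate.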
  • 104. Bias in implicit feedback • Popular items are overrepresented in implicit feedback • Position/“trust” bias (see Joachims et al., 2005) • Eye-tracking study + comparison with explicit feedback shows: • Clicks reflect relevance judgments • Results ranked highly receive more clicks • Reference: Joachims et al., Accurately Interpreting Clickthrough Data as Implicit Feedback (SIGIR ’05)
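One way to use clicks despite position bias is to read them as relative rather than absolute judgments. The sketch below illustrates the "click > skip above" heuristic discussed by Joachims et al.: a clicked result is taken to be preferred over every unclicked result ranked above it. The function name and the example ranking are hypothetical.

```python
def click_skip_above(ranking, clicked):
    """Return (preferred, less_preferred) pairs derived from one result list.

    ranking: result ids in the order they were shown.
    clicked: set of result ids the user clicked.
    """
    preferences = []
    for position, result in enumerate(ranking):
        if result not in clicked:
            continue
        # Every unclicked result shown above a clicked one was skipped.
        for skipped in ranking[:position]:
            if skipped not in clicked:
                preferences.append((result, skipped))
    return preferences


shown = ["a", "b", "c", "d"]
pairs = click_skip_above(shown, clicked={"a", "c"})
```

A click on the top result yields no preference at all here, which reflects that such a click carries little information under position bias.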
  • 106. Perceived Bias from RecSys • Filter bubble: a state of intellectual isolation that allegedly can result from personalized searches when a website algorithm selectively guesses what information a user would like to see based on information about the user. • As a result, users become separated from information that disagrees with their viewpoints.
  • 109. Measuring personalization • On average, 11.7% of results show differences due to personalization on Google. • This varies widely by search query and by result ranking. • Measurable personalization was found only as a result of searching with a logged-in account and of the searching user's IP address.
  • 119. Method [Hannák et al., 2013] • 1. 👤 (1) Get 200 volunteers with Google accounts (2) Have them issue the same set of queries (3) Compare results • 2. 🤖 (1) Construct Google bot accounts, varying aspects such as location, demographics, click behavior, browsing + search history, etc. (2) Have them issue the same set of queries (3) Compare results
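The "compare results" step needs a way to quantify how much two result lists for the same query differ. As a hedged sketch, the code below uses two simple measures, Jaccard overlap and the fraction of rank positions that differ; these are illustrative choices and not necessarily the exact procedure behind the figures reported on the slides. The account names and result ids are invented.

```python
def jaccard(results_a, results_b):
    """Overlap of two result lists, ignoring order (1.0 = same set)."""
    set_a, set_b = set(results_a), set(results_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def positions_changed(results_a, results_b):
    """Fraction of rank positions holding different results.

    Assumes both lists have the same length.
    """
    if not results_a:
        return 0.0
    differing = sum(a != b for a, b in zip(results_a, results_b))
    return differing / len(results_a)


# Hypothetical results for one query issued from two accounts.
account_1 = ["r1", "r2", "r3", "r4"]
account_2 = ["r1", "r3", "r2", "r4"]

overlap = jaccard(account_1, account_2)
reordered = positions_changed(account_1, account_2)
```

Reporting both measures separates two kinds of personalization: different results being shown, and the same results being shown in a different order.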
  • 122. 👤 Findings [Hannák et al., 2013] • On average, 11.7% of results show differences due to personalization on Google. • Top ranks tend to be less personalized than bottom ranks.
  • 125. 👤 Findings [Hannák et al., 2013] • ✅ Personalization based on location (e.g., company names) • ❌ The least personalized results tend to be for factual and health-related queries.
  • 133. 🤖 Findings [Hannák et al., 2013] • ✅ Logged in vs. “cleared cookies” account • ✅ Geolocation • ❌ Gender • ❌ Age • ❌ Search history • ❌ Click history • ❌ Browsing history
  • 135. Diversity to pop the filter bubble
  • 142. Method [Nguyen et al., 2014]
      • Split MovieLens users into two groups:
          • “Followers”: users who rated movies they were recommended
          • “Ignorers”: users who rated movies they were not recommended
      • Compare between groups, over time:
          • Diversity of recommendations
          • Ratings of movies
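The slide does not say how "diversity of recommendations" is computed. A common choice is the average pairwise distance between the items in one recommendation list; the sketch below assumes that definition and, purely for illustration, represents each movie as a set of genres compared with Jaccard distance. All names and data are hypothetical.

```python
from itertools import combinations


def jaccard_distance(genres_a, genres_b):
    """1 minus the genre overlap of two movies; 0.0 = identical genre sets."""
    union = genres_a | genres_b
    if not union:
        return 0.0
    return 1.0 - len(genres_a & genres_b) / len(union)


def intra_list_diversity(items):
    """Average pairwise distance among the items of one recommendation list."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical recommendation lists for one user at two points in time.
early = [{"comedy"}, {"drama"}, {"sci-fi", "action"}]
late = [{"comedy"}, {"comedy", "romance"}, {"comedy"}]

print(intra_list_diversity(early))  # all pairs disjoint, so maximal diversity
print(intra_list_diversity(late))   # overlapping genres, so lower diversity
```

Tracking this value per user over successive time windows, separately for followers and ignorers, gives the kind of between-group comparison the slide describes.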
  • 149. Findings [Nguyen et al., 2014]
      1. Diversity
          • In both groups, diversity decreases over time.
          • The effect is lessened for users who consume recommended items (followers).
      2. Ratings
          • Slight decrease in average ratings for ignorers (3.74 to 3.55).
          • Stable average ratings for followers (~3.68).
  • 151. Diversity in RecSys 🤖 vs. humans 👤?
  • 155. Method [Möller et al. 2018]
      • 🤖 Generate recommendations for news articles using different RecSys algorithms: collaborative filtering (CF) and content-based (CB).
      • 👤 Compare to hand-picked article recommendations.
      • Measure & compare “diversity” of recommended articles:
          • At content level
          • At tag level
          • At category level
          • At sentiment/subjectivity level
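As one possible reading of category-level diversity, the sketch below computes the Shannon entropy of the category labels in a recommendation list. This is an assumption made for illustration: the slide does not state which diversity measure was used, and the function name and example categories are hypothetical.

```python
import math
from collections import Counter


def category_entropy(categories):
    """Shannon entropy (in bits) of the category labels in one list."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum p * log2(1 / p) over the observed categories.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical category labels of four recommended articles.
algorithmic = ["politics", "sports", "economy", "culture"]
hand_picked = ["politics", "politics", "politics", "politics"]

print(category_entropy(algorithmic))  # four equally frequent categories
print(category_entropy(hand_picked))  # a single category, so no diversity
```

The same function applies unchanged to tag labels; content-level and sentiment-level diversity would need a distance over text or sentiment scores instead.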
  • 162. Findings [Möller et al. 2018]
      “Conventional recommendation algorithms at least preserve the topic/sentiment diversity of the article supply.”
  • 164. More diversity
  • 165. Aim [Yom-Tov et al. 2014]
      Increase exposure to varied political opinions, with the goal of improving civil discourse.
  • 166. Method [Yom-Tov et al. 2014]
      • Classify searchers by political leaning (using geo data)
  • 171. Method [Yom-Tov et al. 2014]
      • Infer political leaning of news sources from user behavior.
      • Identify polarized search queries (those with strong political leanings, in both directions).
  • 174. Method [Yom-Tov et al. 2014]
      • Treatment group: Insert red results for blue users, and blue results for red users
      • Control group: Do not adjust results
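The treatment can be pictured as a small re-ranking step. The sketch below is a hypothetical illustration only: the slide does not say how many opposing results were inserted or at which rank, so the single insertion at a fixed `position`, the dictionary layout, and the function name are all assumptions.

```python
def insert_opposing(results, user_leaning, opposing_pool, position=2):
    """Return a copy of `results` with one opposite-leaning result inserted.

    `results` and `opposing_pool` are lists of dicts with a "leaning" key
    ("red" or "blue"). The input list is not modified.
    """
    opposite = "red" if user_leaning == "blue" else "blue"
    candidates = [r for r in opposing_pool if r["leaning"] == opposite]
    treated = list(results)
    if candidates:
        treated.insert(min(position, len(treated)), candidates[0])
    return treated


# Hypothetical ranking shown to a "blue" user, plus a pool of other results.
ranked = [
    {"url": "blue-1.example", "leaning": "blue"},
    {"url": "blue-2.example", "leaning": "blue"},
    {"url": "blue-3.example", "leaning": "blue"},
]
pool = [
    {"url": "blue-4.example", "leaning": "blue"},
    {"url": "red-1.example", "leaning": "red"},
]

treated = insert_opposing(ranked, "blue", pool)
print([r["url"] for r in treated])
```

Users in the control group would simply see `ranked` unchanged.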
  • 179. Method
      1. Short term: Compare clicks/behavior between control & treatment.
      2. Long term: Measure over two weeks, per user:
          1. Polarization: Difference between the user’s leaning score and the average leaning across all sources.
          2. Engagement: Average number of queries + average number of articles read.
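The polarization measure above can be written down directly. The sketch below assumes, for illustration, that each source has a leaning score between -1.0 and 1.0 and that a user's leaning score is the mean leaning of the sources they read; neither detail is given on the slide, and all names and numbers are hypothetical.

```python
from statistics import mean


def polarization(read_source_leanings, all_source_leanings):
    """Absolute gap between a user's leaning and the average source leaning."""
    user_leaning = mean(read_source_leanings)
    return abs(user_leaning - mean(all_source_leanings))


# Hypothetical leaning scores: -1.0 = one pole, 1.0 = the other.
all_sources = [-1.0, -0.5, 0.0, 0.5, 1.0]
read_before = [-1.0, -0.5]            # sources read before the experiment
read_after = [-1.0, -0.5, 0.0, 0.5]   # sources read during the experiment

before = polarization(read_before, all_sources)
after = polarization(read_after, all_sources)
print(before, after)
print((before - after) / before)  # relative move toward the centre
```

Engagement can be tracked alongside this by averaging, per user, the number of queries issued and the number of articles read over the same period.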
  • 182. Findings I [Yom-Tov et al. 2014]
      • Fewer clicks on inserted opposing sources.
      • But: “Results pages of the opposing viewpoint which had a similarity higher than the average tended to be clicked 38% more than those below the average.”
  • 189. Findings II [Yom-Tov et al. 2014]
      • Polarization:
          • Treatment: Average leaning ‘moves’ ~25% to centre
          • Control: Negligible difference (~1%)
      • Engagement:
          • Treatment: Number of queries: +9% / articles read: +4%
          • Control: Small reduction in both (~2.5%)
  • 190. Refs
      Algorithmic bias
      1. Park & Tuzhilin, The Long Tail of Recommender Systems and How to Leverage It (RecSys ’08)
      2. Meyer, Recommender systems in industrial contexts (2012)
      3. Abdollahpouri et al., The Unfairness of Popularity Bias in Recommendation (RMSE@RecSys ’19)
      4. Joachims et al., Accurately Interpreting Clickthrough Data as Implicit Feedback (SIGIR ’05)
      Perceived bias / filter bubbles
      5. Hannák et al., Measuring personalization of web search (WWW ’13)
      6. Nguyen et al., Exploring the filter bubble: the effect of using recommender systems on content diversity (WWW ’14)
      7. Möller et al., Do not blame it on the algorithm: An empirical assessment of multiple recommender systems and their impact on content diversity (Information, Communication & Society ’18)
      8. Yom-Tov et al., Promoting Civil Discourse Through Search Engine Diversity (Social Science Computer Review ’13)