Do not blame it on the algorithm - An empirical assessment of multiple recommender systems and their impact on content diversity
1. Do not blame it on the algorithm - An empirical assessment of multiple recommender systems and their impact on content diversity
Möller, Trilling, Helberger, van Es
The Amsterdam Personalised Communication Project
3. News recommendation algorithms
Optimize sets of recommended news articles by sorting
them according to
• Popularity
• Collaborative filtering
• Semantic match
5. Democratic participation: ratio of news/public affairs; different political/ideological viewpoints
Tolerance: ratio of content dedicated to ethnic, linguistic, national groups; different languages
Autonomy: choice between different formats, topics, genres, sources; little variance, in the sense of distance from personal preferences
Deliberation: reconciliatory tone, share of articles presenting various perspectives; diversity of emotions, range of different authors, sources
Contestation: minority voices; share of content that is purposefully biased
7. Research question
Which algorithmic settings in news recommender systems
produce the largest and the smallest amount of diversity on
each of the different dimensions of diversity?
8. Methods
Data: 1,000 simulated recommendation sets in different
algorithmic settings, based on data from Volkskrant.nl published
between 19.9.2016 and 26.9.2016, N = 21,973 articles
Benchmark: Recommendation by the human editor
10. Methods
Operationalization (preliminary): Diversity for democratic
participation: overlap in topic; diversity for autonomy:
overlap in section of the newspaper; diversity for deliberation:
overlap in tone and emotions
11. How to measure overlap
Naïve approach:
(1) map: same feature (topic) 1; different feature (topic) 0
(2) sum
In a recommendation set of three articles, the possible scores are
{0; 1; 2; 3}
Example: an article about sport leads to the recommendation set
{politics; sport; sport}, which gives a score of 2
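The map-and-sum steps can be sketched in a few lines of Python. This is only an illustration of the naïve score as described on the slide; the function name `naive_overlap` is hypothetical and not taken from the original study.

```python
def naive_overlap(source_topic, recommended_topics):
    """Naive overlap score: map each recommendation to 1 if it shares
    the source article's topic and to 0 otherwise, then sum."""
    return sum(1 if topic == source_topic else 0 for topic in recommended_topics)


# Example from the slide: a sport article with three recommendations.
score = naive_overlap("sport", ["politics", "sport", "sport"])
print(score)
```

With three recommendations the score can only take the values 0, 1, 2 or 3, which is why this measure is coarse.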
13. How to measure overlap
But this naïve approach oversimplifies.
Better: instead of a binary score (0 or 1), use a value in the interval [0;1]
Use the distance between topics as the feature, not the presence of a topic
16. (1) Use MDS to represent each topic t by coordinates (x,y) in
two-dimensional space
(2) Each document D is represented by a vector
of topic weights
(3) Calculate the topic dissimilarity matrix M (50x50)
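The slide does not say how M is derived. One plausible reading, shown here purely as an assumption, is that M holds the pairwise Euclidean distances between the two-dimensional MDS coordinates of the topics. The coordinates below are made up for illustration, and the function name `dissimilarity_matrix` is hypothetical.

```python
import math


def dissimilarity_matrix(coords):
    """Pairwise Euclidean distances between topic coordinates (x, y).

    Assumption: M is built from the 2-D MDS coordinates; the slides
    do not state the exact construction.
    """
    return [[math.dist(a, b) for b in coords] for a in coords]


# Hypothetical coordinates for three topics (the slides use 50 topics).
topic_coords = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
M = dissimilarity_matrix(topic_coords)

for row in M:
    print(row)
```

The resulting matrix is symmetric with zeros on the diagonal, so a topic has no distance to itself.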
Preparing the matrices
17. (1) We multiply M by the topic-weight vectors of document 1
and document 2, resulting in two matrices.
These represent the topic dissimilarity matrices weighted by
the topic occurrence in the document in question.
(2) We can then calculate the Euclidean distance between
the two matrices using the Frobenius norm of their difference.
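A minimal sketch of these two steps follows. The exact form of the multiplication is not recoverable from the slide, so the code assumes that column j of M is scaled by the document's weight for topic j. The function names and the toy 2x2 matrix are hypothetical.

```python
import math


def weight_matrix(M, weights):
    """Scale column j of M by the document's weight for topic j.

    Assumption: this column-wise scaling is one possible reading of
    "multiply M with the topic weights"; the slides do not specify it.
    """
    return [[m_ij * w_j for m_ij, w_j in zip(row, weights)] for row in M]


def frobenius_distance(A, B):
    """Frobenius norm of A - B: the square root of the sum of squared
    differences between corresponding entries."""
    return math.sqrt(
        sum((a - b) ** 2 for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))
    )


# Toy example with two topics and two documents.
M = [[0.0, 1.0], [1.0, 0.0]]
w1 = [1.0, 0.0]  # document 1 is entirely about topic 0
w2 = [0.0, 1.0]  # document 2 is entirely about topic 1

d = frobenius_distance(weight_matrix(M, w1), weight_matrix(M, w2))
print(d)
```

Two documents with identical topic weights get a distance of 0 under this scheme, whatever M looks like.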
Comparing two documents
18. Apply to recommendation sets
Given that each document in the dataset generated three
recommendations, we propose to calculate two measures:
• the mean of the distances between each article in the
recommendation set and the original article
• the mean of the distances within the recommendation set
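Both measures can be sketched as below for any document distance function. The function names are hypothetical, and the toy distance (absolute difference between numbers) merely stands in for the matrix-based document distance.

```python
from itertools import combinations
from statistics import mean


def mean_distance_to_source(source, recommendations, distance):
    """Mean distance between the original article and each recommendation."""
    return mean(distance(source, rec) for rec in recommendations)


def mean_distance_within_set(recommendations, distance):
    """Mean distance over all unordered pairs inside the recommendation set."""
    return mean(distance(a, b) for a, b in combinations(recommendations, 2))


def toy_distance(a, b):
    """Stand-in distance for illustration only: documents are plain numbers."""
    return abs(a - b)


source = 0.0
recs = [1.0, 2.0, 4.0]

to_source = mean_distance_to_source(source, recs, toy_distance)
within = mean_distance_within_set(recs, toy_distance)
print(to_source, within)
```

With three recommendations, the first measure averages three distances and the second averages the three pairwise distances inside the set.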
23. Thanks
Want to know more?
@judith_moeller, @damian0604
Personalised-communication.net
Editor's Notes
Functional approach: diversity is not a goal in itself but a means to an end, namely the realisation of particular values and objectives. Which values and objectives those are depends, among other things, on the conceptualisation of pluralism. Arguably, there is not one conceptualisation of pluralism but several possible ones (liberal pluralism, deliberative pluralism, cultural pluralism, radical pluralism).
Seeing that media diversity is not a value in itself but a means to achieve a particular goal, it is probably too restrictive to simply measure the diversity of an information offer per se. Instead, we should take into account the way people find, collect, process, engage with and act upon diverse information. In other words, a complete vision of diversity also looks into the presentation (whether the ways of presentation, styles of writing, etc. reflect a heterogeneity of styles), the visibility and ease of access of diverse content (including geographical access), and the engagement with the content (in the form of sharing, liking, hyperlinking, commenting, etc.).
Pluralism and diversity can be measured in two ways: absolute (e.g. the number of newspapers from different sources) and proportionate/relational (the ratio of newspapers from one source to the number of newspapers from another source). Typically, it is not a single piece of content that is diverse; diversity lies in how this content relates to other content.
The red circle is what was recommended to one specific user.
How to optimize the recommender: move users towards the norm (0) and make the circle larger.