Exploring Author Gender
in Book Rating and Recommendation
M. D. Ekstrand et al.
RecSys ’18, October 2–7, 2018, Vancouver, BC, Canada
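\[ y_u \sim \mathrm{Binomial}(n_u, \theta_u) \]
\[ n_u \sim \mathrm{NegBinomial}(\nu, \gamma) \]
\[ \mathrm{logit}(\theta_u) \sim \mathrm{Normal}(\mu, \sigma) \]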
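As a rough illustration of the user-profile model on this slide, the following sketch draws synthetic profiles: a per-user tendency on the log-odds scale, then a count of female-authored books out of the profile size. The function names, the values of `mu` and `sigma`, and the fixed profile sizes (used in place of negative binomial draws) are assumptions made for the example, not values from the paper.

```python
import math
import random


def expit(x):
    """Inverse of the logit function: maps log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_profiles(mu, sigma, profile_sizes, seed=0):
    """Draw (n_u, y_u, theta_u) for each user under the assumed model."""
    rng = random.Random(seed)
    profiles = []
    for n_u in profile_sizes:
        # logit(theta_u) ~ Normal(mu, sigma); sigma is treated as a standard deviation
        theta_u = expit(rng.gauss(mu, sigma))
        # y_u ~ Binomial(n_u, theta_u), drawn as a sum of Bernoulli trials
        y_u = sum(rng.random() < theta_u for _ in range(n_u))
        profiles.append((n_u, y_u, theta_u))
    return profiles


# Hypothetical parameter values; profile sizes are fixed rather than sampled.
for n_u, y_u, theta_u in simulate_profiles(mu=-0.4, sigma=1.0, profile_sizes=[5, 20, 100]):
    print(n_u, y_u, round(theta_u, 3))
```

Small profiles give coarse observed proportions `y_u / n_u` even when the underlying tendency varies smoothly, which is one reason to estimate the tendency rather than read it off the counts.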
¯u
logit( ) Normal( + logit( ), 2)<latexit sha1_base64="WiSy2qnMJnJn/Jh+eBgx0ac955E=">AAADb3ichVLPaxNBFP6a1Vrrj8Z6UBBkMEQSlDIJQqWnohe9SJs0baBbl9l1mizdX+xMQuuy/4AXjx68qOBB/DO8ePLmoX+CeJIW9GDFt5sNakvrLDvz5pv3fe/Nm2dHnqs057sTJePU6ckzU2enz52/cHGmfGl2VYWD2JEdJ/TCuGsLJT03kB3tak92o1gK3/bkmr11PztfG8pYuWGwonciueGLXuBuuo7QBFnlvukL3Y/9xAt7rk5rpi3ixNR9qUVqJQOR1pmpXJ+N3R6FsS+8tGZbgt1iiubDAjnXGtRvs4SYPV9YIn3crFvlCp/j+WBHjUZhVFCMpbD8AyaeIISDAXxIBNBkexBQ9K2jAY6IsA0khMVkufm5RIpp4g7IS5KHIHSL5h7t1gs0oH2mqXK2Q1E8+mNiMlT5Z/6O7/GP/D3/wn8eq5XkGlkuO7TaI66MrJlnV9vf/8vyadXo/2GdmLPGJu7mubqUe5Qj2S2cEX/49MVee6FVTW7yN/wr5f+a7/IPdINguO+8XZatl6ReBXt+0HrVmjwhUkBVyJS3i7qqvKrbaI7i0D/Om1FGYf4mC2S3sYKH6P6FHn//scK4blntVfZm1CKNww1x1FhtzjXIXr5TWbxXNMsUruEGatQR81jEAyyhQxE/YR8H+FX6Zlwxrhts5FqaKDiX8c8w6r8BoCnTFQ==</latexit><latexit sha1_base64="WiSy2qnMJnJn/Jh+eBgx0ac955E=">AAADb3ichVLPaxNBFP6a1Vrrj8Z6UBBkMEQSlDIJQqWnohe9SJs0baBbl9l1mizdX+xMQuuy/4AXjx68qOBB/DO8ePLmoX+CeJIW9GDFt5sNakvrLDvz5pv3fe/Nm2dHnqs057sTJePU6ckzU2enz52/cHGmfGl2VYWD2JEdJ/TCuGsLJT03kB3tak92o1gK3/bkmr11PztfG8pYuWGwonciueGLXuBuuo7QBFnlvukL3Y/9xAt7rk5rpi3ixNR9qUVqJQOR1pmpXJ+N3R6FsS+8tGZbgt1iiubDAjnXGtRvs4SYPV9YIn3crFvlCp/j+WBHjUZhVFCMpbD8AyaeIISDAXxIBNBkexBQ9K2jAY6IsA0khMVkufm5RIpp4g7IS5KHIHSL5h7t1gs0oH2mqXK2Q1E8+mNiMlT5Z/6O7/GP/D3/wn8eq5XkGlkuO7TaI66MrJlnV9vf/8vyadXo/2GdmLPGJu7mubqUe5Qj2S2cEX/49MVee6FVTW7yN/wr5f+a7/IPdINguO+8XZatl6ReBXt+0HrVmjwhUkBVyJS3i7qqvKrbaI7i0D/Om1FGYf4mC2S3sYKH6P6FHn//scK4blntVfZm1CKNww1x1FhtzjXIXr5TWbxXNMsUruEGatQR81jEAyyhQxE/YR8H+FX6Zlwxrhts5FqaKDiX8c8w6r8BoCnTFQ==</latexit><latexit 
sha1_base64="WiSy2qnMJnJn/Jh+eBgx0ac955E=">AAADb3ichVLPaxNBFP6a1Vrrj8Z6UBBkMEQSlDIJQqWnohe9SJs0baBbl9l1mizdX+xMQuuy/4AXjx68qOBB/DO8ePLmoX+CeJIW9GDFt5sNakvrLDvz5pv3fe/Nm2dHnqs057sTJePU6ckzU2enz52/cHGmfGl2VYWD2JEdJ/TCuGsLJT03kB3tak92o1gK3/bkmr11PztfG8pYuWGwonciueGLXuBuuo7QBFnlvukL3Y/9xAt7rk5rpi3ixNR9qUVqJQOR1pmpXJ+N3R6FsS+8tGZbgt1iiubDAjnXGtRvs4SYPV9YIn3crFvlCp/j+WBHjUZhVFCMpbD8AyaeIISDAXxIBNBkexBQ9K2jAY6IsA0khMVkufm5RIpp4g7IS5KHIHSL5h7t1gs0oH2mqXK2Q1E8+mNiMlT5Z/6O7/GP/D3/wn8eq5XkGlkuO7TaI66MrJlnV9vf/8vyadXo/2GdmLPGJu7mubqUe5Qj2S2cEX/49MVee6FVTW7yN/wr5f+a7/IPdINguO+8XZatl6ReBXt+0HrVmjwhUkBVyJS3i7qqvKrbaI7i0D/Om1FGYf4mC2S3sYKH6P6FHn//scK4blntVfZm1CKNww1x1FhtzjXIXr5TWbxXNMsUruEGatQR81jEAyyhQxE/YR8H+FX6Zlwxrhts5FqaKDiX8c8w6r8BoCnTFQ==</latexit><latexit sha1_base64="WiSy2qnMJnJn/Jh+eBgx0ac955E=">AAADb3ichVLPaxNBFP6a1Vrrj8Z6UBBkMEQSlDIJQqWnohe9SJs0baBbl9l1mizdX+xMQuuy/4AXjx68qOBB/DO8ePLmoX+CeJIW9GDFt5sNakvrLDvz5pv3fe/Nm2dHnqs057sTJePU6ckzU2enz52/cHGmfGl2VYWD2JEdJ/TCuGsLJT03kB3tak92o1gK3/bkmr11PztfG8pYuWGwonciueGLXuBuuo7QBFnlvukL3Y/9xAt7rk5rpi3ixNR9qUVqJQOR1pmpXJ+N3R6FsS+8tGZbgt1iiubDAjnXGtRvs4SYPV9YIn3crFvlCp/j+WBHjUZhVFCMpbD8AyaeIISDAXxIBNBkexBQ9K2jAY6IsA0khMVkufm5RIpp4g7IS5KHIHSL5h7t1gs0oH2mqXK2Q1E8+mNiMlT5Z/6O7/GP/D3/wn8eq5XkGlkuO7TaI66MrJlnV9vf/8vyadXo/2GdmLPGJu7mubqUe5Qj2S2cEX/49MVee6FVTW7yN/wr5f+a7/IPdINguO+8XZatl6ReBXt+0HrVmjwhUkBVyJS3i7qqvKrbaI7i0D/Om1FGYf4mC2S3sYKH6P6FHn//scK4blntVfZm1CKNww1x1FhtzjXIXr5TWbxXNMsUruEGatQR81jEAyyhQxE/YR8H+FX6Zlwxrhts5FqaKDiX8c8w6r8BoCnTFQ==</latexit>
7
obtain author information from
(VIAF)3, a directory of author
ity records from the Library of
around the world. Author gender
s for many records.
employed by the VIAF is flexible
gender identities, supporting an
es for the validity of an identity.
se flexibility: all its assertions
This is a significant limitation
on 5.1.
book data with rating data by
ve data linking coverage, and
works instead of individual edi-
m a bipartite graph of ISBNs and
“edition” records, and OpenLi-
e) and consider each connected
less than 1% of ratings) this caus-
or a book; we resolve multiple
their ratings.
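The excerpt above is cropped, but it appears to describe grouping editions into works by building a bipartite graph of ISBNs and bibliographic records and working with its connected components. A minimal sketch of that idea with a union-find structure follows; the function name, the `(isbn, record_id)` edge format, and the sample identifiers are hypothetical.

```python
from collections import defaultdict


def cluster_isbns(edges):
    """Group ISBNs that fall in the same connected component of an ISBN-record graph.

    `edges` is an iterable of (isbn, record_id) pairs (assumed input format).
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for isbn, record_id in edges:
        # Tag the two node kinds so an ISBN never collides with a record id.
        union(("isbn", isbn), ("rec", record_id))

    clusters = defaultdict(set)
    for node in list(parent):
        kind, value = node
        if kind == "isbn":
            clusters[find(node)].add(value)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical identifiers: ISBNs 111 and 222 share a record, so they form one cluster.
edges = [("111", "e1"), ("222", "e1"), ("222", "e2"), ("333", "e3")]
print(cluster_isbns(edges))
```

With this grouping, ratings of different editions of one work would be attributed to a single cluster.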
VIAF do not share linking iden-
authority records by author name.
contain multiple name entries,
izations of the author’s name.
carry multiple known forms of
ng names to improve matching
ng both “Last, First” and “First
e all VIAF records containing a
d names for the first author of
n a book’s cluster. If all records
author’s gender agree, we take that
contradicting gender statements,
as “ambiguous”.
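The resolution rule sketched in the excerpt above (agreeing gender statements are accepted, contradicting ones are labeled "ambiguous") can be illustrated as follows. The function name and input format are hypothetical, and treating an author with no gender statement as "unknown" is an assumption borrowed from the category labels in Figure 1.

```python
def resolve_gender(statements):
    """Combine gender statements from all authority records matched to one author.

    `statements` holds one entry per matched record: "female", "male",
    or None when the record makes no gender assertion (assumed encoding).
    """
    asserted = {s for s in statements if s in ("female", "male")}
    if not asserted:
        return "unknown"          # no record states a gender (assumed label)
    if len(asserted) == 1:
        return next(iter(asserted))  # all statements agree
    return "ambiguous"            # records contradict each other


print(resolve_gender(["female", None, "female"]))
print(resolve_gender(["female", "male"]))
print(resolve_gender([None]))
```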
ure good coverage while main-
Table 2: Summary of rating data

                       BookCrossing       Amazon
Ratings                   1,149,780   22,507,155
Users                       105,283    8,026,324
Rated ISBNs/ASINs           340,554    2,330,066
Rated ‘Books’               295,935    2,286,656
Matched Books               240,255    1,083,066
Known-Gender Books          166,928      616,317
Female-Author Books          66,524      181,850
Male-Author Books           100,404      434,467
% Female Books                39.9%        29.5%
% Female Ratings              45.3%        36.2%
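The "% Female Books" row can be recomputed from the counts in Table 2, taking known-gender books as the denominator. The short check below does that; the dictionary layout and function name are only for illustration.

```python
# Counts copied from Table 2.
table2 = {
    "BookCrossing": {"known": 166_928, "female": 66_524, "male": 100_404},
    "Amazon": {"known": 616_317, "female": 181_850, "male": 434_467},
}


def female_book_share(counts):
    """Percentage of known-gender books whose first author is female."""
    return round(100.0 * counts["female"] / counts["known"], 1)


for name, counts in table2.items():
    # Female and male counts should add up to the known-gender total.
    assert counts["female"] + counts["male"] == counts["known"]
    print(name, female_book_share(counts))
# BookCrossing 39.9
# Amazon 29.5
```

The "% Female Ratings" row cannot be reproduced this way, since it depends on per-book rating counts that the table does not list.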
Figure 1: Results of data linking and gender resolution. LOC is the set of books with Library of Congress records; other panes are the results of linking rating data.
[Figure omitted. Panels: BXA, BXE, LOC, AZ. x-axis: Linking Result (female, male, ambiguous, unknown, unlinked). y-axis: Coverage Percent. Scope: Books, Ratings.]
dependent
TAN 2.17.3
each per-
We report
arameters
h existing
acterizing
nalyze the
Tables 1–
sample of
nders are
in our cat-
has a more
ookCross-
wn-gender
oportions
(est. sd log odds) 1.03 1.11 1.77
Posterior Mean 0.42 0.40 0.37
Std. Dev. 0.23 0.23 0.28
Figure 4: Distribution of user author-gender tendencies. Histogram shows observed proportions; lines show kernel densities of estimated tendencies (θ) along with observed and predicted proportions.
[Figure omitted. Panels: AZ, BXA, BXE. x-axis: Proportion of Female Authors. y-axis: Density. Legend (Method): Estimated θ, Observed y/n, Predicted y/n.]
and Figure 4 shows the distribution of observed author gender
           Users  Dist. Items  % Dist. |  Users  Dist. Items  % Dist. |  Users  Dist. Items  % Dist. |  Users  Dist. Items  % Dist.
Profile    1,000       35,187     66.5 |  1,000       24,913     73.6 |  1,000       27,525     88.2 |  1,000       27,525     88.2
UserUser   1,000        6,007     12.0 |    988        6,235     12.7 |  1,000       15,343     30.7 |    939       25,853     55.1
ItemItem   1,000       21,282     42.6 |    997       10,174     20.4 |    999       33,363     67.7 |    999       22,360     45.6
MF         1,000          140      0.3 |  1,000          264      0.5 |  1,000          164      0.3 |  1,000          651      1.3
PF         1,000        1,506      3.0 |  1,000        4,105      8.2 |  1,000        2,746      5.4 |  1,000        3,538      7.0
Figure 5: Posterior densities of recommender biases from integrated regression model.
[Figure omitted. Panel columns: AZ (Explicit), AZ (Implicit), BXA, BXE. Panel rows: UserUser, ItemItem, MF, PF. x-axis: Proportion of Books by Female Authors. y-axis: Density. Legends: Mean (Algorithm, Popular, Profile); Method (Observed, Predicted).]
proportions. The ripples in predicted and observed proportions are
due to the commonality of 5-item user profiles, for which there are
only 6 possible proportions; estimated tendency (θ) smooths them
out. This smoothing, along with avoiding extreme estimated biases
based on limited data, is why we find it useful to estimate tendency
instead of directly computing statistics on observed proportions.
To support direct comparison of the densities of observations and
predictions, we resampled observed proportions with replacement
to yield 10,000 observations.
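Two points in this paragraph can be checked with a few lines of code: a 5-item profile admits only six distinct proportions, and resampling with replacement produces a fixed-size sample from the observed values. The function names and the sample data below are illustrative, not taken from the paper.

```python
import random
from fractions import Fraction


def possible_proportions(profile_size):
    """All proportions k / n that a profile of n rated books can take."""
    return [Fraction(k, profile_size) for k in range(profile_size + 1)]


def resample_with_replacement(observed, size=10_000, seed=0):
    """Draw `size` values from `observed`, allowing repeats."""
    rng = random.Random(seed)
    return rng.choices(observed, k=size)


print(len(possible_proportions(5)))                 # 6
print([float(p) for p in possible_proportions(5)])  # [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]

# Hypothetical observed proportions for a handful of users.
observed = [0.0, 0.2, 0.2, 0.4, 0.6, 1.0]
sample = resample_with_replacement(observed)
print(len(sample))                                  # 10000
```

Because many users share the same few attainable proportions, a density of the raw values shows ripples at those points.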
We observe a population tendency to rate male authors more
frequently than female authors in all data sets (μ < 0), but to rate
female authors more frequently than they would be rated were
users drawing books uniformly at random from the available set.
The average user author-gender tendency is slightly closer to an
even balance than the set of rated books. We also found a large
diversity amongst users in their estimated tendencies (s.d. of
Table 6: Mean / SD of rec. list female author proportions.

            BXA              BXE              AZ (Implicit)    AZ (Explicit)
Popular     0.458            0.500            0.364            0.364
Rating      -                0.383            -                0.222
UserUser    0.399 / 0.180    0.435 / 0.190    0.315 / 0.186    0.367 / 0.278
ItemItem    0.465 / 0.200    0.348 / 0.124    0.351 / 0.245    0.389 / 0.336
MF          0.134 / 0.027    0.334 / 0.039    0.468 / 0.079    0.418 / 0.124
PF          0.372 / 0.208    0.429 / 0.177    0.374 / 0.144    0.394 / 0.177
basic coverage statistics of these algorithms along with corresponding
user profile statistics. Users for whom an algorithm could not
produce recommendations are rare. We also computed the extent
to which algorithms recommend different items to different users;
“% Dist.” is the percentage of all recommendations that were distinct
items. Algorithms that repeatedly recommend the same items will
BXE
-0.139 0.162 0.906 -0.573 0.129 0.531 -0.652 0.002 0.161 -0.166 0.298 0.772
(-0.20,-0.08) (0.10,0.22) (0.87,0.95) (-0.61,-0.54) (0.09,0.16) (0.51,0.56) (-0.66,-0.64) (-0.01,0.01) (0.15,0.17) (-0.22,-0.11) (0.25,0.35) (0.74,0.81)
AZ (Implicit)
-0.127 0.688 0.715 0.094 0.863 0.895 -0.244 0.011 0.364 -0.224 0.287 0.537
(-0.19,-0.06) (0.65,0.73) (0.68,0.76) (0.02,0.17) (0.81,0.92) (0.84,0.95) (-0.27,-0.22) (-0.00,0.02) (0.35,0.38) (-0.26,-0.18) (0.26,0.31) (0.51,0.56)
AZ (Explicit)
-0.580 0.322 0.681 -0.380 0.438 0.852 -0.117 0.006 0.273 -0.403 0.141 0.525
(-0.63,-0.53) (0.29,0.35) (0.65,0.71) (-0.44,-0.32) (0.40,0.48) (0.81,0.89) (-0.14,-0.10) (-0.00,0.02) (0.26,0.29) (-0.44,-0.37) (0.12,0.16) (0.50,0.55)
Figure 6: Scatter plots and regression curves for recommender response to individual users.
[Figure omitted. Panel columns: AZ (Explicit), AZ (Implicit), BXA, BXE. Panel rows: UserUser, ItemItem, MF, PF. x-axis: Profile Proportion of Female Authors. y-axis: Recommender Proportion of Female Authors.]
more concentrated. In the BookCrossing data, it tends to favor male
authors more than the underlying data would support; in implicit
feedback mode, it is highly biased towards male authors with
respect even to the baseline distributions.
4.4 From Profiles to Recommendations
Our extended Bayesian model (Section 3.4.2) allows us to address
RQ4: the extent to which our algorithms propagate individual users’
tendencies into their recommendations.
Figure 5 shows the posterior predictive and observed densities
of recommender author-gender tendencies, and Figure 6 shows
scatter plots of observed recommendation proportions against user
profile proportions with regression curves (regression lines in log-
place. Visual inspection of the scatter plot suggests that there is a
strong component with consistent tendencies, but the regression
may accurately model the remaining users. Future work will use a
model that can better account for some global consistency.
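To see why a straight regression line on the log-odds scale appears as a curve in the scatter plots, consider a sketch that assumes the regression has the form logit(recommendation tendency) = b + s · logit(profile tendency). That form is a reading of the model on the earlier slide, and the function names and the values of `b` and `s` below are hypothetical.

```python
import math


def logit(p):
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def expected_rec_proportion(profile_prop, b, s):
    """Map a profile proportion to a recommendation-list proportion.

    b shifts every user toward one gender (intercept); s controls how strongly
    the user's own tendency carries through (slope). Both are assumed parameters.
    """
    return expit(b + s * logit(profile_prop))


# Hypothetical intercept and slope, evaluated over a range of profile proportions.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(expected_rec_proportion(p, b=-0.2, s=0.5), 3))
```

With `b = 0` and `s = 1` the mapping is the identity, so the recommender mirrors the profile; with `s = 0` every user gets the same expected proportion regardless of their profile.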
4.5 Summary
RQ1: Baseline Gender Distribution. Known books are significantly
more likely to be written by men than by women;
representation among rated books is more balanced.
RQ2: User Input Gender Distributions. Users are diffuse in
their rating tendencies, with an overall trend favoring male
authors but less strongly than the baseline distribution.
RQ3: Recommender Output Distributions. Different CF

RecSys 2018 Paper Reading Group Materials

  • 1. Exploring Author Gender in Book Rating and Recommendation M. D. Ekstrand et al. 1
  • 3. 3
  • 4. 4 RecSys ’18, October 2–7, 2018, Vancouver, BC, Canada u unu µ ¯ua ua ¯ua ¯nua ba sa a u 2 U a 2 A u u a
  • 5. 5 RecSys ’18, October 2–7, 2018, Vancouver, BC, Can u unu µ ¯ua ¯ua ba sa u 2 U Binomial(nu, θu)NegBinomial(ν, γ) logit(θu) Normal(μ, σ)
  • 6. [Plate diagram of the recommendation-list model, $a \in A$; legible labels include $\mu$, $\bar{n}_{ua}$, $b_a$, $s_a$. A table of variables is cut off in the slide.] $\operatorname{logit}(\bar{\theta}_{ua}) \sim \mathrm{Normal}(b_a + s_a \operatorname{logit}(\theta_u), \sigma_a^2)$
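As a rough illustration of the regression on this slide, assume it has the form logit(θ̄_ua) ~ Normal(b_a + s_a · logit(θ_u), σ_a²), where θ_u is a user's profile tendency and θ̄_ua is the tendency of the list algorithm a produces for that user. The sketch below computes only the central curve (the noise term is dropped); the function name and the example parameter values are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def rec_center(theta_u, b_a, s_a):
    """Central recommender tendency for a user with profile tendency theta_u.

    Applies the linear map b_a + s_a * logit(theta_u) in log-odds space and
    converts back to a proportion. Because the noise is symmetric in log-odds
    space, this is the median of the implied distribution, not its mean.
    """
    return inv_logit(b_a + s_a * logit(theta_u))


# Hypothetical values: a slope of 1 with zero intercept reproduces the
# profile tendency; a slope of 0 ignores the user entirely.
print(rec_center(0.3, b_a=0.0, s_a=1.0))
print(rec_center(0.3, b_a=-0.5, s_a=0.0))
```

Read this way, the intercept shifts every user's list toward one gender, and the slope measures how much of the user's own tendency carries into the recommendations.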
  • 7. btain author information from (VIAF)3, a directory of authority records from the Library of und the world. Author gender s for many records. mployed by the VIAF is flexible ender identities, supporting an es for the validity of an identity. se flexibility: all its assertions This is a significant limitation on 5.1. book data with rating data by ve data linking coverage, and works instead of individual edi- m a bipartite graph of ISBNs and “edition” records, and OpenLi- e) and consider each connected ess than 1% of ratings) this caus- or a book; we resolve multiple ir ratings. VIAF do not share linking iden- hority records by author name. ontain multiple name entries, izations of the author’s name. arry multiple known forms of ng names to improve matching ng both “Last, First” and “First e all VIAF records containing a d names for the first author of n a book’s cluster. If all records hor’s gender agree, we take that ontradicting gender statements, as “ambiguous”. ure good coverage while main-
    Table 2: Summary of rating data
                           BookCrossing      Amazon
    Ratings                   1,149,780  22,507,155
    Users                       105,283   8,026,324
    Rated ISBNs/ASINs           340,554   2,330,066
    Rated ‘Books’               295,935   2,286,656
    Matched Books               240,255   1,083,066
    Known-Gender Books          166,928     616,317
    Female-Author Books          66,524     181,850
    Male-Author Books           100,404     434,467
    % Female Books                39.9%       29.5%
    % Female Ratings              45.3%       36.2%
    Figure 1: Results of data linking and gender resolution. LOC is the set of books with Library of Congress records; other panes are the results of linking rating data. [Panels: BXA, BXE, LOC, AZ. x-axis: Linking Result (female, male, ambiguous, unknown, unlinked); y-axis: Coverage Percent; Scope: Books, Ratings.]
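The gender-resolution rule described on this slide (agreeing records give a gender, contradicting records give "ambiguous") can be sketched as follows. The handling of the "unknown" and "unlinked" cases is an assumption inferred from the category names in Figure 1, and the function name and input shape are hypothetical.

```python
def resolve_gender(record_genders):
    """Resolve an author's gender from matched authority records.

    record_genders holds one entry per matched record: "female", "male",
    or None when the record makes no gender statement.
    Assumption: no matched records means "unlinked", and matched records
    without any gender statement mean "unknown".
    """
    if not record_genders:
        return "unlinked"
    asserted = {g for g in record_genders if g is not None}
    if not asserted:
        return "unknown"
    if len(asserted) == 1:
        # Every record that states a gender agrees.
        return next(iter(asserted))
    # Records contradict each other.
    return "ambiguous"


print(resolve_gender(["female", None, "female"]))  # female
print(resolve_gender(["female", "male"]))          # ambiguous
```

Treating a contradiction as its own category, instead of picking a majority, keeps uncertain authors out of both the female and male counts.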
  • 8. dependent TAN 2.17.3 each per- We report arameters h existing acterizing nalyze the Tables 1– sample of nders are in our cat- has a more ookCross- wn-gender oportions
    (est. sd log odds)   1.03   1.11   1.77
    Posterior Mean       0.42   0.40   0.37
    Std. Dev.            0.23   0.23   0.28
    Figure 4: Distribution of user author-gender tendencies. Histogram shows observed proportions; lines show kernel densities of estimated tendencies (θ) along with observed and predicted proportions. [Panels: AZ, BXA, BXE. x-axis: Proportion of Female Authors; y-axis: Density; Method: Estimated θ, Observed y/n, Predicted y/n.]
    and Figure 4 shows the distribution of observed author gender
  • 9.
                Users  Dist. Items  % Dist. | Users  Dist. Items  % Dist. | Users  Dist. Items  % Dist. | Users  Dist. Items  % Dist.
    Profile     1,000       35,187     66.5 | 1,000       24,913     73.6 | 1,000       27,525     88.2 | 1,000       27,525     88.2
    UserUser    1,000        6,007     12.0 |   988        6,235     12.7 | 1,000       15,343     30.7 |   939       25,853     55.1
    ItemItem    1,000       21,282     42.6 |   997       10,174     20.4 |   999       33,363     67.7 |   999       22,360     45.6
    MF          1,000          140      0.3 | 1,000          264      0.5 | 1,000          164      0.3 | 1,000          651      1.3
    PF          1,000        1,506      3.0 | 1,000        4,105      8.2 | 1,000        2,746      5.4 | 1,000        3,538      7.0
    Figure 5: Posterior densities of recommender biases from integrated regression model. [Panels: AZ (Explicit), AZ (Implicit), BXA, BXE by UserUser, ItemItem, MF, PF. x-axis: Proportion of Books by Female Authors; y-axis: Density; Mean: Algorithm, Popular, Profile; Method: Observed, Predicted.]
    proportions. The ripples in predicted and observed proportions are due to the commonality of 5-item user profiles, for which there are only 6 possible proportions; estimated tendency (θ) smooths them out. This smoothing, along with avoiding extreme estimated biases based on limited data, is why we find it useful to estimate tendency instead of directly computing statistics on observed proportions. To support direct comparison of the densities of observations and predictions, we resampled observed proportions with replacement to yield 10,000 observations.
    We observe a population tendency to rate male authors more frequently than female authors in all data sets (µ < 0), but to rate female authors more frequently than they would be rated were users drawing books uniformly at random from the available set. The average user author-gender tendency is slightly closer to an even balance than the set of rated books. We also found a large diversity amongst users about their estimated tendencies (s.d. of
    Table 6: Mean / SD of rec. list female author proportions.
                BXA            BXE            AZ (Implicit)  AZ (Explicit)
    Popular     0.458          0.500          0.364          0.364
    Rating      -              0.383          -              0.222
    UserUser    0.399 / 0.180  0.435 / 0.190  0.315 / 0.186  0.367 / 0.278
    ItemItem    0.465 / 0.200  0.348 / 0.124  0.351 / 0.245  0.389 / 0.336
    MF          0.134 / 0.027  0.334 / 0.039  0.468 / 0.079  0.418 / 0.124
    PF          0.372 / 0.208  0.429 / 0.177  0.374 / 0.144  0.394 / 0.177
    basic coverage statistics of these algorithms along with corresponding user profile statistics. Users for which an algorithm could not produce recommendations are rare. We also computed the extent to which algorithms recommend different items to different users; “% Dist.” is the percentage of all recommendations that were distinct items. Algorithms that repeatedly recommend the same items will
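The “% Dist.” statistic defined above (the percentage of all recommendations that were distinct items) can be computed as in this minimal sketch. The function name and the input shape, one list of item identifiers per user, are assumptions made for illustration.

```python
def percent_distinct(rec_lists):
    """Percentage of all recommendations that are distinct items.

    rec_lists is an iterable of per-user recommendation lists.
    Returns 0.0 when there are no recommendations at all.
    """
    rec_lists = list(rec_lists)
    total = sum(len(recs) for recs in rec_lists)
    if total == 0:
        return 0.0
    distinct = len({item for recs in rec_lists for item in recs})
    return 100.0 * distinct / total


# Two users, four recommendations, three distinct items.
print(percent_distinct([["a", "b"], ["b", "c"]]))  # 75.0
```

A value near 100 means almost every user sees different items; a value near 0 means the algorithm recommends the same few items to everyone.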
  • 10.
    BXE            -0.139 (-0.20,-0.08)  0.162 (0.10,0.22)  0.906 (0.87,0.95) | -0.573 (-0.61,-0.54)  0.129 (0.09,0.16)  0.531 (0.51,0.56) | -0.652 (-0.66,-0.64)  0.002 (-0.01,0.01)  0.161 (0.15,0.17) | -0.166 (-0.22,-0.11)  0.298 (0.25,0.35)  0.772 (0.74,0.81)
    AZ (Implicit)  -0.127 (-0.19,-0.06)  0.688 (0.65,0.73)  0.715 (0.68,0.76) |  0.094 (0.02,0.17)    0.863 (0.81,0.92)  0.895 (0.84,0.95) | -0.244 (-0.27,-0.22)  0.011 (-0.00,0.02)  0.364 (0.35,0.38) | -0.224 (-0.26,-0.18)  0.287 (0.26,0.31)  0.537 (0.51,0.56)
    AZ (Explicit)  -0.580 (-0.63,-0.53)  0.322 (0.29,0.35)  0.681 (0.65,0.71) | -0.380 (-0.44,-0.32)  0.438 (0.40,0.48)  0.852 (0.81,0.89) | -0.117 (-0.14,-0.10)  0.006 (-0.00,0.02)  0.273 (0.26,0.29) | -0.403 (-0.44,-0.37)  0.141 (0.12,0.16)  0.525 (0.50,0.55)
    Figure 6: Scatter plots and regression curves for recommender response to individual users. [Panels: AZ (Explicit), AZ (Implicit), BXA, BXE by UserUser, ItemItem, MF, PF. x-axis: Profile Proportion of Female Authors; y-axis: Recommender Proportion of Female Authors.]
    more concentrated. In the BookCrossing data, it tends to favor male authors more than the underlying data would support; in implicit feedback mode, it is highly biased towards male authors with respect even to the baseline distributions.
    4.4 From Profiles to Recommendations
    Our extended Bayesian model (Section 3.4.2) allows us to address RQ4: the extent to which our algorithms propagate individual users’ tendencies into their recommendations. Figure 5 shows the posterior predictive and observed densities of recommender author-gender tendencies, and Figure 6 shows scatter plots of observed recommendation proportions against user profile proportions with regression curves (regression lines in log- place.
    Visual inspection of the scatter plot suggests that there is a strong component with consistent tendencies, but the regression may accurately model the remaining users. Future work will use a model that can better account for some global consistency.
    4.5 Summary
    RQ1: Baseline Gender Distribution. Known books are significantly more likely to be written by men than by women; representation among rated books is more balanced.
    RQ2: User Input Gender Distributions. Users are diffuse in their rating tendencies, with an overall trend favoring male authors but less strongly than the baseline distribution.
    RQ3: Recommender Output Distributions. Different CF