MSRtalk

Approximating Matrices for Recommendation
Raghav Somani (1), Sreangsu Acharyya (2)
(1) IITG, (2) MSRI
August 3, 2016

The Stories we Tell
Once upon a time a user liked a movie|book|item recommendation ... and kept buying from us ever after ...
Because: low-rank factorization

Matrix Factorization for Recommendation
We factorize matrices. Rajnikant factorizes primes

Matrix Factorization for Recommendation
$M$: rating matrix of size $m$ (users) $\times$ $n$ (movies)
$M_{i,j} \in \{1, 2, \dots, 5, ?\}$
Fill the '?' with predictions
Find an approximation $\hat{M} \approx M$
$M$ could be a query × url matrix
(Can we get analogs of "queen" - "king" = "woman" for urls?)

Why Low Rank ?
Factor $v_j \in \mathbb{R}^k$ for movie $j$
Is Mr. R in the movie?
Factor $u_i \in \mathbb{R}^k$ for user $i$
How much does $i$ like Mr. R?
$M_{i,j} = \langle u_i, v_j \rangle$
$M \approx UV^\dagger$
$UV^\dagger$ has rank at most $k$
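
The inner-product model above can be sketched in a few lines of NumPy. This is only an illustration: the sizes and the random factors are assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 6, 4, 2            # users, movies, latent dimension (illustrative sizes)

U = rng.normal(size=(m, k))  # row i is the user factor u_i
V = rng.normal(size=(n, k))  # row j is the movie factor v_j

M_hat = U @ V.T              # M_hat[i, j] = <u_i, v_j>

i, j = 3, 1
single_entry = float(U[i] @ V[j])          # the same prediction, one entry at a time
rank = int(np.linalg.matrix_rank(M_hat))   # cannot exceed k
```

Storing the two factors takes (m + n) k numbers instead of m n, which is what makes the low-rank model attractive when k is small.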

The Usual Cost Function
$$\frac{1}{2} \| M - UV^\dagger \|^2 + \lambda_1 \| U \|^2 + \lambda_2 \| V \|^2$$
u step: Fix V minimize over U
v step: Fix U minimize over V
works well: Despite non-convexity
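
A minimal sketch of the alternating u and v steps follows. It assumes every entry of M is observed and that the norms are Frobenius norms; the slide's matrix has missing '?' entries, which a real implementation would have to mask out. The function names `objective` and `als` are made up for this illustration.

```python
import numpy as np

def objective(M, U, V, lam1, lam2):
    """0.5 * ||M - U V^T||_F^2 + lam1 * ||U||_F^2 + lam2 * ||V||_F^2."""
    residual = M - U @ V.T
    return 0.5 * np.sum(residual**2) + lam1 * np.sum(U**2) + lam2 * np.sum(V**2)

def als(M, k, lam1=0.1, lam2=0.1, iters=20, seed=0):
    """Alternating least squares, assuming a fully observed M (an assumption)."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))
    eye = np.eye(k)
    history = []
    for _ in range(iters):
        # u step: fix V; setting the gradient to zero gives U (V^T V + 2 lam1 I) = M V
        U = np.linalg.solve(V.T @ V + 2 * lam1 * eye, V.T @ M.T).T
        # v step: fix U; setting the gradient to zero gives V (U^T U + 2 lam2 I) = M^T U
        V = np.linalg.solve(U.T @ U + 2 * lam2 * eye, U.T @ M).T
        history.append(objective(M, U, V, lam1, lam2))
    return U, V, history

M = np.random.default_rng(1).integers(1, 6, size=(8, 5)).astype(float)
U, V, history = als(M, k=2)
```

Each step solves a convex ridge-regression problem exactly, so the recorded objective never increases from one iteration to the next, even though the joint problem is non-convex.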

Hold on a Second !
True rating scores : [2.7, 3.0, 3.4]
Prediction1: [0.10, 0.12, 42]
Prediction2: [3.0, 3.2, 3.1]
Least squares would choose Prediction2. Wrong choice: only Prediction1 ranks the items in the true order.
Why fit M closely in the least-squares sense?
If it helps, replace a row by monotonically transformed values
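
The numbers on this slide can be checked directly. The helper names below are made up for the illustration.

```python
import numpy as np

truth = np.array([2.7, 3.0, 3.4])
pred1 = np.array([0.10, 0.12, 42.0])
pred2 = np.array([3.0, 3.2, 3.1])

def squared_error(pred, target):
    return float(np.sum((pred - target) ** 2))

def same_ranking(pred, target):
    # True when sorting by pred gives the same item order as sorting by target
    return bool(np.array_equal(np.argsort(pred), np.argsort(target)))

err1, err2 = squared_error(pred1, truth), squared_error(pred2, truth)
order1, order2 = same_ranking(pred1, truth), same_ranking(pred2, truth)
```

Prediction2 has by far the smaller squared error, yet it swaps the last two items; Prediction1 is numerically far off but orders all three items correctly.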

Personalized Rating Transformations
Goal: Transform the scores so that it is easy to approximate
1. Learn a rating transformation $\sigma_i(\cdot) : \mathbb{R} \to \mathbb{R}$ for every user $i$
2. Learn a common transformation function for all users
3. Learn several transformation functions: more than 1, fewer than $m$
(one for the extremists, another for the moderates, ...)
$\Sigma(\cdot) : \mathbb{R}^{mn} \to \mathbb{R}^{mn}$
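
As a toy illustration of why a per-user transformation can make the matrix easier to approximate, the sketch below applies one monotone function per row. The two-user matrix and both transformations are hypothetical and chosen by hand; the talk is about learning them.

```python
import numpy as np

def apply_sigma(M, sigmas):
    """Apply sigmas[i] to row i of M (one transformation per user)."""
    return np.vstack([sigma(row) for sigma, row in zip(sigmas, M)])

M = np.array([[1.0, 3.0, 5.0],     # user 0 uses the whole rating scale
              [4.0, 4.5, 5.0]])    # user 1 only rates between 4 and 5

sigmas = [
    lambda r: r,                       # user 0: left unchanged
    lambda r: 4.0 * (r - 4.0) + 1.0,   # user 1: stretch [4, 5] onto [1, 5]
]
transformed = apply_sigma(M, sigmas)

rank_before = int(np.linalg.matrix_rank(M))
rank_after = int(np.linalg.matrix_rank(transformed))
```

Both users rank the three movies identically, and after the increasing transformations the two rows coincide, so the transformed matrix has rank 1 while the original has rank 2.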

Updated Optimization Problem
$$\min_{U, V, \Sigma \in ?} \; \frac{1}{2} \| \Sigma(M) - UV^\dagger \|^2 + \lambda_1 \| U \|^2 + \lambda_2 \| V \|^2$$
u step: Fix V, Σ minimize over U
v step: Fix U, Σ minimize over V
Σ step: Fix U, V minimize over Σ. But How ?

Progress
We began with M. We knew exactly which matrix to approximate.
We changed it into a problem where we don't even know the specific
matrix to approximate.
I call it progress.
More charitably: we are not restricted to approximating a specific matrix. We
are good as long as we can approximate any monotonically
transformed matrix.

Properties of Strictly Monotonic Functions
1. Closed under addition
2. Closed under scaling by a positive constant
3. Closed under composition
4. Closed under taking an inverse
5. Not a vector space (can't subtract)
We exploit the first two. We don't know yet how to exploit the group
property.
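
A quick numerical check of these properties, assuming "strictly monotonic" is read as "strictly increasing". The two example functions are arbitrary choices for the illustration, and a check on a grid is evidence rather than proof.

```python
import numpy as np

x = np.linspace(-2.0, 2.0, 41)      # increasing grid of inputs

def is_strictly_increasing(values):
    return bool(np.all(np.diff(values) > 0))

f = lambda t: t**3 + t              # strictly increasing
g = lambda t: np.exp(t)             # strictly increasing

checks = {
    "sum": is_strictly_increasing(f(x) + g(x)),
    "positive scaling": is_strictly_increasing(2.5 * f(x)),
    "composition": is_strictly_increasing(g(f(x))),
    "difference": is_strictly_increasing(f(x) - g(x)),
}
```

The first three checks pass, while the difference f - g dips just to the right of 0, which is why these functions form a cone rather than a vector space.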

Optimization Variants
1. Learn a rating transformation for every user i
2. Learn a common transformation function for all users
3. Learn several transformation functions: more than 1, fewer than m
$$\min_{U, V, \Sigma \in ?} \; \frac{1}{2} \| \Sigma(M) - UV^\dagger \|^2 + \lambda_1 \| U \|^2 + \lambda_2 \| V \|^2$$
Jointly convex formulation: straightforward via semidefinite
programming (1, 2)
u step: Fix V, Σ minimize over U
v step: Fix U, Σ minimize over V
Σ step: Fix U, V minimize over Σ.
Assign step: Assign users to the best transformation for variant (3)
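
One possible reading of the assign step for variant (3) is sketched below: with U, V and a set of candidate transformations fixed, each user goes to the candidate with the smallest squared residual on that user's row. The function `assign_users`, the fully observed M, and the two candidate transformations are assumptions made for the illustration; the slides do not spell out the procedure.

```python
import numpy as np

def assign_users(M, U, V, transforms):
    """For each user, return the index of the candidate transformation whose
    transformed ratings are closest (in squared error) to that row of U V^T."""
    pred = U @ V.T                                   # shape (m, n)
    # costs[c, i] = || transforms[c](M[i]) - pred[i] ||^2
    costs = np.stack([np.sum((t(M) - pred) ** 2, axis=1) for t in transforms])
    return np.argmin(costs, axis=0)

rng = np.random.default_rng(0)
M = rng.integers(1, 6, size=(4, 3)).astype(float)

transforms = [lambda r: r, lambda r: 2.0 * r]        # hypothetical candidates

# Build factors whose product matches transform 0 on users 0-1 and transform 1 on users 2-3
target = M.copy()
target[2:] *= 2.0
labels = assign_users(M, target, np.eye(3), transforms)
```

In a full alternating scheme this step would be interleaved with the u, v and Σ steps, much like the assignment step of k-means.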

Geometric Illustration
[Figure: the given score vector, the set R↓, the point r̃, and the range space of the regressor. Annotation: minimizes squared loss but violates order.]

No Theorems (on Global Minimum) Yet
Sorry

Experimental Results [Test-loss Movie-Lens Small]

[Test-loss Movie-Lens Medium]

[Kendall-Tau Movie-Lens Medium]

Thank You

Bregman Divergence
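[Figure: plot of φ(z) against z, with the points x and y and the divergence d_φ(x, y) marked.] (1)
Given a strictly convex, sufficiently nice φ(·), the corresponding
Bregman divergence is
$$D_\phi(x \,\|\, y) = \phi(x) - \phi(y) - \langle x - y, \nabla\phi(y) \rangle.$$
(1) Thanks to Ayan Acharya for his tikz magic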
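
The definition translates almost literally into code. The sketch below evaluates it for two standard choices of φ and compares against the familiar closed forms; the function name `bregman` and the test vectors are illustrative.

```python
import numpy as np

def bregman(phi, grad_phi, x, y):
    """D_phi(x || y) = phi(x) - phi(y) - <x - y, grad_phi(y)>."""
    return float(phi(x) - phi(y) - np.dot(x - y, grad_phi(y)))

half_sq_norm = lambda v: 0.5 * np.dot(v, v)      # phi(x) = 0.5 * ||x||^2
neg_entropy = lambda v: np.sum(v * np.log(v))    # phi(x) = sum_i x_i log x_i

x = np.array([0.2, 0.3, 0.5])                    # both vectors lie on the simplex
y = np.array([0.4, 0.4, 0.2])

d_euclid = bregman(half_sq_norm, lambda v: v, x, y)
d_kl = bregman(neg_entropy, lambda v: np.log(v) + 1.0, x, y)
```

Here `d_euclid` equals half the squared Euclidean distance between x and y, and `d_kl` equals the KL divergence of x from y because both vectors sum to 1.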

Examples of Bregman Divergences
φ(x)                                           (∇φ)^{-1}(x)               Bregman divergence
$\frac{1}{2}\|x\|^2$                           $x$                        Squared Euclidean
$\frac{1}{2}\|x\|_W^2$                         $Wx$                       Mahalanobis
$\sum_i x_i \log x_i$, $x \in \Delta$          $\exp(x) / \sum_i \exp x_i$   KL divergence
$\sum_i (x_i \log x_i - x_i)$, $x \in \mathbb{R}_+^d$   $\exp(x)$         I-divergence
Table: Bregman divergences and link functions
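
As a sanity check on the last two rows of the table, the sketch below confirms numerically that the listed link function undoes the gradient of φ. The test point is arbitrary, and the KL row only works for points on the simplex.

```python
import numpy as np

x = np.array([0.2, 0.3, 0.5])          # a point on the simplex (sums to 1)

# I-divergence row: phi(x) = sum(x log x - x), so grad phi(x) = log x
recovered_exp = np.exp(np.log(x))

# KL row: phi(x) = sum(x log x), so grad phi(x) = log x + 1
def softmax(t):
    e = np.exp(t - np.max(t))          # shift for numerical stability
    return e / np.sum(e)

recovered_softmax = softmax(np.log(x) + 1.0)
```

Both `recovered_exp` and `recovered_softmax` return the original x, which is the sense in which exp and the normalized exponential act as inverse gradients here.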
