1. Approximating Matrices for Recommendation
Raghav Somani (1), Sreangsu Acharyya (2)
(1) IITG, (2) MSRI
August 3, 2016
Approximating Matrices for Recommendation August 3, 2016 1 / 20
2. The Stories we Tell
Once upon a time a user liked a movie | book | item recommendation ... and kept buying from us ever after ...
Because, Low-Rank factorization
3. Matrix Factorization for Recommendation
We factorize matrices. Rajnikant factorizes primes
4. Matrix Factorization for Recommendation
M: rating matrix of size m × n (m users, n movies)
$M_{i,j} \in \{1, 2, \dots, 5, ?\}$
Fill in the '?' entries with predictions
Find an approximation $\hat{M} \approx M$
M could also be a query × URL matrix
(Can we get analogs of "queen" - "king" = "woman" - "man" for URLs?)
5. Why Low Rank ?
Factor $v_j \in \mathbb{R}^k$ for movie j
Is Mr. R in the movie?
Factor $u_i \in \mathbb{R}^k$ for user i
How much does user i like Mr. R?
$M_{i,j} = \langle u_i, v_j \rangle$
$M \approx UV^\dagger$
$UV^\dagger$ has rank at most k
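A minimal NumPy sketch of the factor model above. The sizes and random factors are purely illustrative, and for real-valued factors the dagger is taken to be a plain transpose (an assumption).

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 6, 5, 2                 # users, movies, latent factors (illustrative sizes)

U = rng.normal(size=(m, k))       # row i is the user factor u_i
V = rng.normal(size=(n, k))       # row j is the movie factor v_j

M_hat = U @ V.T                   # M_hat[i, j] = <u_i, v_j>

# The product of an (m x k) and a (k x n) matrix has rank at most k.
rank = np.linalg.matrix_rank(M_hat)
```

Each predicted rating is a single inner product, so storing U and V costs (m + n)k numbers instead of mn.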
6. The Usual Cost Function
$\frac{1}{2} \|M - UV^\dagger\|^2 + \lambda_1 \|U\|^2 + \lambda_2 \|V\|^2$
u step: Fix V, minimize over U
v step: Fix U, minimize over V
Works well, despite non-convexity
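The u step and v step can be sketched as alternating ridge-regression solves. This is a sketch under assumptions not stated on the slide: M is fully observed (no '?' entries), the norms are Frobenius norms, and the names `als` and `objective` are hypothetical.

```python
import numpy as np

def objective(M, U, V, lam1, lam2):
    """0.5*||M - U V^T||^2 + lam1*||U||^2 + lam2*||V||^2 (Frobenius norms assumed)."""
    resid = M - U @ V.T
    return 0.5 * np.sum(resid ** 2) + lam1 * np.sum(U ** 2) + lam2 * np.sum(V ** 2)

def als(M, k, lam1=0.1, lam2=0.1, n_iters=20, seed=0):
    """Alternate exact minimization over U and V for a fully observed M."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))
    I = np.eye(k)
    for _ in range(n_iters):
        # u step: V fixed. Setting the gradient to zero gives
        # U (V^T V + 2*lam1*I) = M V, solved here in transposed form.
        U = np.linalg.solve(V.T @ V + 2 * lam1 * I, V.T @ M.T).T
        # v step: U fixed. Likewise V (U^T U + 2*lam2*I) = M^T U.
        V = np.linalg.solve(U.T @ U + 2 * lam2 * I, U.T @ M).T
    return U, V

# Toy data: an exactly rank-2 matrix, fully observed.
rng = np.random.default_rng(1)
M = rng.normal(size=(8, 2)) @ rng.normal(size=(2, 6))
U, V = als(M, k=2)
```

Each step is a convex problem with a closed-form solution, so the objective never increases from one step to the next even though the joint problem is non-convex.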
7. Hold on a Second !
True rating scores : [2.7, 3.0, 3.4]
Prediction1: [0.10, 0.12, 42]
Prediction2: [3.0, 3.2, 3.1]
Least squares would choose Prediction2. Wrong choice: only Prediction1 ranks the three items in the true order.
Why fit M closely in the least-squares sense?
If it helps, replace a row by monotonically transformed values
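The example above can be checked numerically. The helper names are hypothetical, and counting correctly ordered pairs is just one simple way to measure ranking quality (an assumption, not a metric named on the slide).

```python
import numpy as np
from itertools import combinations

true = np.array([2.7, 3.0, 3.4])
pred1 = np.array([0.10, 0.12, 42.0])
pred2 = np.array([3.0, 3.2, 3.1])

def squared_error(y, p):
    """Sum of squared differences between true scores and predictions."""
    return float(np.sum((y - p) ** 2))

def concordant_pairs(y, p):
    """Count item pairs that the prediction orders the same way as the true scores."""
    return sum(
        1
        for a, b in combinations(range(len(y)), 2)
        if (y[a] - y[b]) * (p[a] - p[b]) > 0
    )

err1, err2 = squared_error(true, pred1), squared_error(true, pred2)
ok1, ok2 = concordant_pairs(true, pred1), concordant_pairs(true, pred2)
```

Prediction2 has by far the smaller squared error, yet it swaps the last two items, while Prediction1 orders all three pairs correctly.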
8. Personalized Rating Transformations
Goal: transform the scores so that they are easy to approximate
(1) Learn a rating transformation $\sigma_i(\cdot) : \mathbb{R} \to \mathbb{R}$ for every user i
(2) Learn a common transformation function for all users
(3) Learn an intermediate number of transformation functions, more than 1 but fewer than m
(one for the extremists, another for the moderates, ...)
$\Sigma(\cdot) : \mathbb{R}^{mn} \to \mathbb{R}^{mn}$
9. Updated Optimization Problem
$\min_{U,\, V,\, \Sigma \in\, ?} \; \frac{1}{2} \|\Sigma(M) - UV^\dagger\|^2 + \lambda_1 \|U\|^2 + \lambda_2 \|V\|^2$
u step: Fix V, Σ; minimize over U
v step: Fix U, Σ; minimize over V
Σ step: Fix U, V; minimize over Σ. But how?
10. Progress
We began with M. We knew exactly which matrix to approximate.
We changed it into a problem where we don't even know the specific matrix to approximate.
I call it progress.
More charitably: we are no longer restricted to approximating one specific matrix. We are good as long as we can approximate some monotonic transformation of M.
11. Properties of Strictly Monotonic Functions
(1) Closed under addition
(2) Closed under scaling by a positive constant
(3) Closed under composition
(4) Closed under taking an inverse
(5) Not a vector space (can't subtract)
We exploit the first two. Don't know yet how to exploit the group property.
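One way to read "we exploit the first two": a nonnegative combination of strictly increasing functions is again strictly increasing, so a transformation can be parameterized by nonnegative weights over a fixed basis. The basis below and the name `make_transform` are hypothetical; the slides do not specify a parameterization.

```python
import numpy as np

# Hypothetical basis of strictly increasing functions (not from the slides).
BASIS = (
    lambda x: x,
    np.tanh,
    lambda x: x ** 3,
)

def make_transform(weights):
    """Build sigma(x) = sum_k w_k * f_k(x) with w_k >= 0 and at least one w_k > 0."""
    w = np.asarray(weights, dtype=float)
    if w.shape != (len(BASIS),):
        raise ValueError("need exactly one weight per basis function")
    if np.any(w < 0) or not np.any(w > 0):
        raise ValueError("weights must be nonnegative and not all zero")

    def sigma(x):
        x = np.asarray(x, dtype=float)
        return sum(wk * f(x) for wk, f in zip(w, BASIS))

    return sigma

sigma = make_transform([0.5, 2.0, 0.1])
grid = np.linspace(-3.0, 3.0, 201)
# Closure under addition and positive scaling keeps sigma strictly increasing.
is_increasing = bool(np.all(np.diff(sigma(grid)) > 0))
```

A negative weight is rejected because subtraction can break monotonicity, which is the "not a vector space" point above.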
12. Optimization Variants
(1) Learn a rating transformation for every user i
(2) Learn a common transformation function for all users
(3) Learn an intermediate number of transformation functions, more than 1 but fewer than m
$\min_{U,\, V,\, \Sigma \in\, ?} \; \frac{1}{2} \|\Sigma(M) - UV^\dagger\|^2 + \lambda_1 \|U\|^2 + \lambda_2 \|V\|^2$
Jointly convex formulation: straightforward via semidefinite programming for variants (1) and (2)
u step: Fix V, Σ; minimize over U
v step: Fix U, Σ; minimize over V
Σ step: Fix U, V; minimize over Σ
Assign step: Assign users to the best transformation for variant (3)
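A possible sketch of the assign step for variant (3). It assumes a fully observed M and takes "best" to mean the smallest squared residual against the current fit, neither of which is stated on the slide. The function name and toy data are hypothetical.

```python
import numpy as np

def assign_users(M, transforms, U, V):
    """For each user (row of M), pick the transform with the smallest squared residual."""
    recon = U @ V.T                       # current low-rank fit, shape (m, n)
    # residuals[t, i] = ||sigma_t(M[i]) - recon[i]||^2
    residuals = np.stack([
        np.sum((sigma(M) - recon) ** 2, axis=1) for sigma in transforms
    ])
    return np.argmin(residuals, axis=0)

# Toy check: two candidate transforms, identity and doubling.
transforms = [lambda x: x, lambda x: 2.0 * x]
M = np.array([[1.0, 2.0, 3.0],
              [2.0, 3.0, 4.0],
              [1.0, 1.0, 2.0],
              [3.0, 2.0, 1.0]])
# Build a fit that matches the identity for users 0 and 1, and doubling for users 2 and 3.
U = np.vstack([M[:2], 2.0 * M[2:]])
V = np.eye(3)
labels = assign_users(M, transforms, U, V)
```

The structure mirrors the assignment step of k-means: the factors and transformations stay fixed while each user is moved to the closest candidate.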
19. Bregman Divergence
[Figure: plot of φ(z) against z, with the points x and y and the divergence d_φ(x, y) marked]
Given a strictly convex, sufficiently nice φ(·), the corresponding Bregman divergence is
$D_\phi(x \,\|\, y) = \phi(x) - \phi(y) - \langle x - y, \nabla\phi(y) \rangle$.
(Footnote: thanks to Ayan Acharya for his tikz magic.)
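The definition translates directly into code. The sketch below assumes φ is differentiable with a known gradient, and checks two standard cases: φ(x) = ½‖x‖² gives half the squared Euclidean distance, and φ(x) = Σ(x_i log x_i − x_i) gives the I-divergence. The function name `bregman` and the test vectors are illustrative.

```python
import numpy as np

def bregman(phi, grad_phi, x, y):
    """D_phi(x || y) = phi(x) - phi(y) - <x - y, grad phi(y)>."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(phi(x) - phi(y) - np.dot(x - y, grad_phi(y)))

# phi(x) = 0.5 * ||x||^2, gradient x
sq = lambda x: 0.5 * np.dot(x, x)
sq_grad = lambda x: x

# phi(x) = sum_i (x_i log x_i - x_i) for x > 0, gradient log x
ent = lambda x: np.sum(x * np.log(x) - x)
ent_grad = lambda x: np.log(x)

x = np.array([0.2, 0.3, 0.5])
y = np.array([0.4, 0.4, 0.2])

d_sq = bregman(sq, sq_grad, x, y)    # matches 0.5 * ||x - y||^2
d_i = bregman(ent, ent_grad, x, y)   # matches sum(x * log(x / y) - x + y)
```

Because x and y here both sum to 1, the I-divergence coincides with the KL divergence for this pair.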
20. Examples of Bregman Divergences
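$\phi(x)$ | $(\nabla\phi)^{-1}(x)$ | Bregman divergence
--- | --- | ---
$\frac{1}{2} \lVert x \rVert^2$ | $x$ | Squared Euclidean
$\frac{1}{2} \lVert x \rVert_W^2$ | $Wx$ | Mahalanobis
$\sum_i x_i \log x_i$, $x \in \Delta$ | $\exp(x) / \sum_i \exp x_i$ | KL divergence
$\sum_i (x_i \log x_i - x_i)$, $x \in \mathbb{R}_+^d$ | $\exp(x)$ | I-divergence
Table: Bregman divergences and link functions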