Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.
Próximos SlideShares
Carregando em…5
×

de

51

Compartilhar

# Introduction to Matrix Factorization Methods Collaborative Filtering

http://www.intelligentmining.com/category/knowledge-base/

Introduction to Matrix Factorization Methods Collaborative Filtering

Ver tudo

Ver tudo

### Introduction to Matrix Factorization Methods Collaborative Filtering

1. 1. INTRODUCTION TO MATRIX FACTORIZATION proprietary material METHODS COLLABORATIVE FILTERING USER RATINGS PREDICTION1 Alex Lin Senior Architect Intelligent Mining
2. 2. Outline  Factoranalysis  Matrix decomposition proprietary material  Matrix Factorization Model  Minimizing Cost Function  Common Implementation 2
3. 3. Factor Analysis  Aprocedure can help identify the factors that might be used to explain the interrelationships among the variables proprietary material  Model based approach 3
4. 4. Refresher: Matrix Decomposition r q p 5 x 6 matrix 5 x 3 matrix 3 x 6 matrixX11 X12 X13 X14 X15 X16 x proprietary materialX21 X22 X12 X24 X25 X26 yX31 X32 X33 X34 X35 X36 a b c zX41 X42 X43 X44 X45 X46X51 X52 X53 X54 X55 X56X32 = (a, b, c) . (x, y, z) = a * x + b * y + c * z Rating Prediction rui = qT pu ˆ i User Preference Factor Vector 4 Movie Preference Factor Vector
5. 5. Making Prediction as Filling MissingValue users 1 2 3 4 5 6 7 8 9 10 … n proprietary material 1 5 4 4 3 2 3 ? items 3 3 ? 5 ? 2 4 3 3 4 3 4 4 … m rui = qT pu ˆ i User Preference Factor Vector 5 Rating Prediction Movie Preference Factor Vector €
6. 6. Learn Factor Vectors users 1 2 3 4 5 6 7 8 9 10 … n 1 5 4 4 3 proprietary material 2 3 ? items 3 3 ? 5 ? 2 4 3 3 4 3 4 4 … 4 = U3-1 * I1-1 + U3-2 * I1-2 + U3-3 * I1-3 + U3-4 * I1-4 m 3 = U7-1 * I2-1 + U7-2 * I2-2 + U7-3 * I2-3 + U7-4 * I2-4 ….. 3 = U86-1 * I12-1 + U86-2 * I12-2 + U86-3 * I12-3 + U86-4 * I12-4 Note: only train on known entries 2X + 3Y = 5 2X + 3Y = 5 6 4X - 2Y = 2 4X - 2Y = 2 3X - 2Y = 2
7. 7. Why not use standard SVD?  Standard SVD assumes all missing entries are zero. This leads to bad prediction accuracy, especially when dataset is extremely sparse. (98% proprietary material - 99.9%)  See Appendix for SVD  In some published literatures, they call Matrix Factorization as SVD, but note it’s NOT the same kind of classical low-rank SVD produced by svdlibc. 7
8. 8. How to Learn Factor Vectors  How do we learn preference factor vectors (a, b, c) and (x, y, z)? proprietary material  Minimize errors on the known ratings To learn the factor min q*. p* ∑ (rui − x ui ) 2 vectors (pu and qi) (u,i)∈k Minimizing Cost Function (Least Squares Problem) rui : actual rating for user u on item I € xui : predicted rating for user u on item I 8
9. 9. Data Normalization  Remove Global mean users proprietary material 1 2 3 4 5 6 7 8 9 10 … n 1 1.5 -.9 -.2 .49 2 .79 ? items 3 0.6 ? .46 ? -.4 4 .39 .82 .76 .69 … .52 .8 m 9
10. 10. Factorization Model   Only Preference factors min ∑ (rui − µ − qT pu ) 2 proprietary material i q*. p* (u,i)∈k To learn the factor vectors (pu and qi) Rating = 4€ Global Preference Mean Factor rui : actual rating of user u on item I u : training rating average bu : user u user bias 10 bi : item i item bias qi : latent factor array of item i pu : later factor array of user u
11. 11. Adding Item Bias and User Bias   Add Item bias and User bias as parameters min ∑ (rui − µ − bi − bu − qT pu ) 2 proprietary material i q*. p* (u,i)∈k To learn Item bias and User bias Rating = 4€ Global Preference Mean Factor rui : actual rating of user u on item I u : training rating average Item Bias User Bias bu : user u user bias 11 bi : item i item bias qi : latent factor array of item i pu : later factor array of user u
12. 12. Regularization   To prevent model overfitting 2 2 ∑ proprietary materialmin (rui − µ − bi − bu − q pu ) + λ ( qi + pu + bi2 + bu ) T i 2 2q*. p* (u,i)∈k Regularization to prevent overfitting Rating = 4 Global Preference Mean Factor rui : actual rating of user u on item I u : training rating average bu : user u user bias Item Bias User Bias bi : item i item bias 12 qi : latent factor array of item i pu : later factor array of user u λ : regularization Parameters €
13. 13. Optimize Factor Vectors  Find optimal factor vectors - minimizing cost function proprietary material  Algorithms:   Stochastic gradient descent   Others: Alternating least squares etc..  Most frequently use:   Stochastic gradient descent 13
14. 14. Matrix Factorization Tuning  Number of Factors in the Preference vectors  Learning Rate of Gradient Descent proprietary material   Best result usually coming from different learning rate for different parameter. Especially user/item bias terms.  Parameters in Factorization Model   Time dependent parameters   Seasonality dependent parameters  Many other considerations ! 14
15. 15. High-Level Implementation Steps  Construct User-Item Matrix (sparse data structure!)  Define factorization model - Cost function proprietary material  Take out global mean  Decide what parameters in the model. (bias, preference factor, anything else? SVD++)  Minimizing cost function - model fitting   Stochastic gradient descent   Alternating least squares  Assemble the predictions  Evaluate predictions (RMSE, MAE etc..)  Continue to tune the model 15
16. 16. Thank you  Any question or comment? proprietary material 16
17. 17. Appendix  Stochastic Gradient Descent  Batch Gradient Descent proprietary material  Singular Value Decomposition (SVD) 17
18. 18. Stochastic Gradient Descent Repeat Until Convergence { for i=1 to m in random order { proprietary material θ j := θ j + α (y (i) − hθ (x ( i) ))x (ji) (for every j) } partial derivative term }€ Your code Here: 18
19. 19. Batch Gradient Descent Repeat Until Convergence { m θ j := θ j + α ∑ (y (i) − hθ (x ( i) ))x (ji) (for every j) proprietary material i=1 partial derivative term }€ Your code Here: 19
20. 20. Singular Value Decomposition (SVD) A = U × S ×VT A U S VT m x r matrix r x r matrix r x n matrix proprietary material m x n matrix€ items items rank = k k<r users users Ak = U k × Sk × VkT 20€

Mar. 7, 2021
• #### koaphai20915

Jan. 24, 2020

Oct. 8, 2019

Oct. 8, 2019
• #### ssuser268d7f

Mar. 15, 2019

Mar. 1, 2018
• #### realgold

Oct. 18, 2017

Apr. 23, 2017

Feb. 2, 2017
• #### ssuser7f4c91

Dec. 13, 2016
• #### Gonnector

Dec. 11, 2016

Nov. 7, 2016
• #### RezaAliyari

Jun. 18, 2016

Jun. 14, 2016
• #### SurendraNathArigela

Apr. 18, 2016
• #### thoi_gian

Mar. 30, 2016
• #### ssuserb8f278

Mar. 17, 2016
• #### engeissa39

Feb. 22, 2016
• #### kunwoopark5

Nov. 13, 2015
• #### siddhaganju

Nov. 10, 2015

#### Vistos

Vistos totais

22.233

No Slideshare

0

De incorporações

0

Número de incorporações

5.305

0