Introduction to Matrix Factorization Methods Collaborative Filtering

To download slides:
http://www.intelligentmining.com/category/knowledge-base/


  1. Introduction to Matrix Factorization Methods, Collaborative Filtering: User Ratings Prediction. Alex Lin, Senior Architect, Intelligent Mining
  2. Outline
     - Factor analysis
     - Matrix decomposition
     - Matrix Factorization Model
     - Minimizing Cost Function
     - Common Implementation
  3. Factor Analysis
     - A procedure that can help identify the factors that might be used to explain the interrelationships among a set of observed variables.
     - A model-based approach.
  4. Refresher: Matrix Decomposition
     - A 5 x 6 matrix X can be decomposed into a 5 x 3 matrix and a 3 x 6 matrix; each entry of X is the inner product of a row factor vector and a column factor vector, e.g. X32 = (a, b, c) . (x, y, z) = a*x + b*y + c*z.
     - Rating prediction: r̂_ui = q_i^T p_u, where p_u is the user preference factor vector and q_i is the movie preference factor vector.
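The inner-product prediction on this slide can be sketched in a couple of lines; the factor values below are made-up illustrations, not the slide's data:

```python
import numpy as np

# Hypothetical 3-factor vectors for one user and one item.
p_u = np.array([0.8, 1.2, -0.3])   # user preference factor vector
q_i = np.array([1.1, 0.5, 0.9])    # movie preference factor vector

# Predicted rating: r_hat = q_i^T p_u
r_hat = q_i @ p_u
```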
  5. Making Prediction as Filling Missing Values
     - The m x n user-item rating matrix is mostly empty: only some entries (e.g. 5, 4, 3, ...) are known.
     - Predicting a rating means filling a missing entry with r̂_ui = q_i^T p_u (user preference factor vector p_u, movie preference factor vector q_i).
  6. Learn Factor Vectors
     - Every known rating yields one equation in the unknown factors, e.g.
       4 = U3-1 * I1-1 + U3-2 * I1-2 + U3-3 * I1-3 + U3-4 * I1-4
       3 = U7-1 * I2-1 + U7-2 * I2-2 + U7-3 * I2-3 + U7-4 * I2-4
       ...
       3 = U86-1 * I12-1 + U86-2 * I12-2 + U86-3 * I12-3 + U86-4 * I12-4
     - Note: only train on the known entries.
     - With more known ratings than unknowns the system becomes overdetermined, like going from { 2X + 3Y = 5, 4X - 2Y = 2 } to { 2X + 3Y = 5, 4X - 2Y = 2, 3X - 2Y = 2 }.
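The slide's small overdetermined system has no exact solution, so it is solved in the least-squares sense; matrix factorization does the same thing at a much larger scale. A minimal sketch using the slide's three equations:

```python
import numpy as np

# The slide's example: three equations, two unknowns (X, Y).
A = np.array([[2.0,  3.0],
              [4.0, -2.0],
              [3.0, -2.0]])
b = np.array([5.0, 2.0, 2.0])

# Least-squares solution: minimizes ||A @ sol - b||^2 over sol.
sol, residuals, rank, _ = np.linalg.lstsq(A, b, rcond=None)
```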
  7. Why not use standard SVD?
     - Standard SVD assumes all missing entries are zero. This leads to bad prediction accuracy, especially when the dataset is extremely sparse (98% - 99.9% missing).
     - See the Appendix for SVD.
     - Some published papers call Matrix Factorization "SVD", but note it is NOT the same kind of classical low-rank SVD produced by tools such as svdlibc.
  8. How to Learn Factor Vectors
     - How do we learn the preference factor vectors (a, b, c) and (x, y, z)?
     - Minimize the error on the known ratings: min_{q*,p*} Σ_{(u,i)∈K} (r_ui − x̂_ui)^2, a least-squares cost function.
     - r_ui: actual rating of user u on item i; x̂_ui: predicted rating of user u on item i.
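The cost function above, summed over known entries only, can be sketched like this; the rating matrix and factor shapes are made-up illustrations:

```python
import numpy as np

# Toy 3-user x 4-item rating matrix; np.nan marks unknown entries.
R = np.array([[5.0, 3.0, np.nan, 1.0],
              [4.0, np.nan, np.nan, 1.0],
              [1.0, 1.0, np.nan, 5.0]])

def squared_error(R, P, Q):
    """Sum of squared errors over the KNOWN ratings only."""
    known = ~np.isnan(R)
    pred = P @ Q.T                 # pred[u, i] = p_u . q_i
    return np.sum((R[known] - pred[known]) ** 2)

# Random factor matrices: P is users x k, Q is items x k (k = 2 factors here).
rng = np.random.default_rng(0)
P = rng.normal(size=(3, 2))
Q = rng.normal(size=(4, 2))
cost = squared_error(R, P, Q)
```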
  9. Data Normalization
     - Remove the global mean from the ratings; the slide shows the user-item matrix after centering, with residual entries such as 1.5, -.9, -.2, .49, ...
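Removing the global mean is a one-liner with a masked mean over the known entries; the matrix below is a made-up illustration:

```python
import numpy as np

R = np.array([[5.0, 3.0, np.nan],
              [4.0, np.nan, 1.0],
              [np.nan, 2.0, 5.0]])

# Global mean over the KNOWN ratings only (nan entries are ignored).
mu = np.nanmean(R)

# Factorize the centered residuals; add mu back at prediction time.
R_centered = R - mu
```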
  10. Factorization Model
      - Preference factors only: min_{q*,p*} Σ_{(u,i)∈K} (r_ui − µ − q_i^T p_u)^2
      - An observed rating (e.g. 4) decomposes into the global mean plus the preference-factor term.
      - r_ui: actual rating of user u on item i; µ: training rating average; q_i: latent factor vector of item i; p_u: latent factor vector of user u.
  11. Adding Item Bias and User Bias
      - Add item bias and user bias as parameters: min_{q*,p*,b*} Σ_{(u,i)∈K} (r_ui − µ − b_i − b_u − q_i^T p_u)^2
      - An observed rating (e.g. 4) now decomposes into global mean + item bias + user bias + preference-factor term.
      - r_ui: actual rating of user u on item i; µ: training rating average; b_u: bias of user u; b_i: bias of item i; q_i: latent factor vector of item i; p_u: latent factor vector of user u.
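The biased prediction rule is just a sum of four terms; every value below is a made-up illustration:

```python
import numpy as np

mu  = 3.6    # training rating average (global mean)
b_i = 0.4    # item i tends to be rated above average
b_u = -0.2   # user u tends to rate below average
q_i = np.array([0.9, -0.4])   # item latent factors
p_u = np.array([1.1, 0.3])    # user latent factors

# Biased prediction: r_hat = mu + b_i + b_u + q_i^T p_u
r_hat = mu + b_i + b_u + q_i @ p_u
```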
  12. Regularization
      - To prevent model overfitting: min_{q*,p*,b*} Σ_{(u,i)∈K} (r_ui − µ − b_i − b_u − q_i^T p_u)^2 + λ (||q_i||^2 + ||p_u||^2 + b_i^2 + b_u^2)
      - r_ui: actual rating of user u on item i; µ: training rating average; b_u: bias of user u; b_i: bias of item i; q_i: latent factor vector of item i; p_u: latent factor vector of user u; λ: regularization parameter.
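The regularized cost adds an L2 penalty on every parameter to the squared error over known entries. A minimal sketch, with a tiny made-up matrix and all parameters initialized to zero:

```python
import numpy as np

def regularized_cost(R, P, Q, b_u, b_i, mu, lam):
    """Squared error on known ratings plus an L2 penalty on all parameters."""
    known = ~np.isnan(R)
    pred = mu + b_u[:, None] + b_i[None, :] + P @ Q.T
    err = np.sum((R[known] - pred[known]) ** 2)
    penalty = lam * (np.sum(P ** 2) + np.sum(Q ** 2)
                     + np.sum(b_u ** 2) + np.sum(b_i ** 2))
    return err + penalty

R = np.array([[5.0, np.nan],
              [np.nan, 2.0]])
P, Q = np.zeros((2, 2)), np.zeros((2, 2))
b_u, b_i = np.zeros(2), np.zeros(2)
cost = regularized_cost(R, P, Q, b_u, b_i, mu=3.5, lam=0.1)
```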
  13. Optimize Factor Vectors
      - Find the optimal factor vectors by minimizing the cost function.
      - Algorithms: stochastic gradient descent; others such as alternating least squares, etc.
      - Most frequently used: stochastic gradient descent.
  14. Matrix Factorization Tuning
      - Number of factors in the preference vectors.
      - Learning rate of gradient descent: the best results usually come from using a different learning rate for each parameter, especially the user/item bias terms.
      - Parameters in the factorization model: time-dependent parameters, seasonality-dependent parameters.
      - Many other considerations!
  15. High-Level Implementation Steps
      - Construct the user-item matrix (use a sparse data structure!).
      - Define the factorization model and its cost function.
      - Take out the global mean.
      - Decide which parameters the model includes (biases, preference factors, anything else? SVD++).
      - Minimize the cost function to fit the model: stochastic gradient descent or alternating least squares.
      - Assemble the predictions.
      - Evaluate the predictions (RMSE, MAE, etc.).
      - Continue to tune the model.
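The steps above can be sketched end to end in a small SGD trainer. This is a toy illustration, not the deck's implementation: the dataset, hyperparameters, and function name are all made up, and a real system would use sparse structures and per-parameter learning rates as the tuning slide suggests.

```python
import numpy as np

def train_mf(ratings, n_users, n_items, k=2, lr=0.01, lam=0.02, epochs=200, seed=0):
    """Center by the global mean, then fit biases and factor vectors by
    stochastic gradient descent over the known ratings only."""
    rng = np.random.default_rng(seed)
    mu = np.mean([r for _, _, r in ratings])          # global mean
    b_u, b_i = np.zeros(n_users), np.zeros(n_items)   # user / item biases
    P = 0.1 * rng.normal(size=(n_users, k))           # user factors
    Q = 0.1 * rng.normal(size=(n_items, k))           # item factors
    for _ in range(epochs):
        for u, i, r in ratings:                       # known entries only
            e = r - (mu + b_u[u] + b_i[i] + P[u] @ Q[i])
            b_u[u] += lr * (e - lam * b_u[u])
            b_i[i] += lr * (e - lam * b_i[i])
            P[u], Q[i] = (P[u] + lr * (e * Q[i] - lam * P[u]),
                          Q[i] + lr * (e * P[u] - lam * Q[i]))
    return lambda u, i: mu + b_u[u] + b_i[i] + P[u] @ Q[i]

# (user, item, rating) triples; a tiny made-up dataset.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
predict = train_mf(ratings, n_users=3, n_items=3)
```

Evaluation would then compare `predict(u, i)` against held-out ratings with RMSE or MAE, per the last steps above.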
  16. Thank you. Any questions or comments?
  17. Appendix
      - Stochastic Gradient Descent
      - Batch Gradient Descent
      - Singular Value Decomposition (SVD)
  18. Stochastic Gradient Descent
      Repeat until convergence {
        for i = 1 to m in random order {
          θ_j := θ_j + α (y^(i) − h_θ(x^(i))) x_j^(i)   (for every j)
        }
      }
      The term (y^(i) − h_θ(x^(i))) x_j^(i) is the partial derivative of the per-example cost.
      Your code here:
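One way to fill in the slide's "Your code here", a minimal sketch for a linear hypothesis h_θ(x) = θ . x on a made-up noiseless dataset:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.c_[np.ones(50), rng.uniform(-1, 1, size=50)]   # bias column + one feature
true_theta = np.array([2.0, -3.0])
y = X @ true_theta                                    # noiseless targets

theta = np.zeros(2)
alpha = 0.1
for _ in range(100):                        # repeat until (approximate) convergence
    for i in rng.permutation(len(X)):       # i = 1..m in random order
        theta += alpha * (y[i] - theta @ X[i]) * X[i]   # update every theta_j
```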
  19. Batch Gradient Descent
      Repeat until convergence {
        θ_j := θ_j + α Σ_{i=1}^{m} (y^(i) − h_θ(x^(i))) x_j^(i)   (for every j)
      }
      The summed term is the partial derivative of the cost over all m examples.
      Your code here:
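A matching sketch for the batch variant, where each update sums the error term over all m examples before stepping. The 1/m scaling is a common stabilizing convention, not part of the slide's formula:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.c_[np.ones(50), rng.uniform(-1, 1, size=50)]   # bias column + one feature
true_theta = np.array([1.5, 0.5])
y = X @ true_theta                                    # noiseless targets

theta = np.zeros(2)
alpha = 0.05
for _ in range(500):                      # repeat until (approximate) convergence
    grad = (y - X @ theta) @ X            # sum of the error term over i = 1..m
    theta += alpha * grad / len(X)        # 1/m scaling keeps the step stable
```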
  20. Singular Value Decomposition (SVD)
      - A = U × S × V^T, where A is an m x n matrix, U is m x r, S is r x r, and V^T is r x n.
      - A rank-k approximation (k < r) keeps only the top k singular values: A_k = U_k × S_k × V_k^T.
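The classical decomposition on this slide is available directly in numpy; a quick sketch on a random matrix of the slide's 5 x 6 shape:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 6))                        # m x n matrix

# A = U S V^T; full_matrices=False gives U: 5x5, s: (5,), Vt: 5x6.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k approximation: keep the top k singular values/vectors.
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```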