Just Count the Love-Hate Squares: a Rating Network Based Method for Recommender Systems

KDD Cup 2011
August 21, 2011

Joseph Kong, Kyle Teague, Justin Kessler

Approved for public release by Northrop Grumman Information Systems, ISHQ-2011-0042

Link Prediction in Bipartite Rating Network

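    [Figure: bipartite rating network with items 1-4 (top) and users A and B (bottom).
     Upper panel: observed edges labeled with scores 20, 100, 90, 50 and 80, plus one unobserved edge marked "?".
     Lower panel: the same network with scores replaced by signs -, +, +, - and +; the unobserved edge is still "?".]
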
    •  Solid edges represent the observed rating pattern

    •  Score >= 80 (I-love-it, “+”); score < 80 (I-hate-it, “-”)

    •  Goal: predict whether the unobserved link is highly rated
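
The thresholding rule above can be sketched in a few lines of Python. This is only an illustration: the function name `sign_edges` is hypothetical, and the assignment of scores to particular user-item edges is assumed, since only the five score values are recoverable from the slide.

```python
def sign_edges(ratings, threshold=80):
    """Map (user, item, score) triples to '+' (love) or '-' (hate) edge labels."""
    return {
        (user, item): "+" if score >= threshold else "-"
        for user, item, score in ratings
    }


# Hypothetical edge assignment; only the score values come from the slide.
observed = [("A", 1, 20), ("A", 2, 100), ("B", 2, 90), ("B", 3, 50), ("B", 4, 80)]
signed = sign_edges(observed)
print(signed)
```

A score of exactly 80 counts as "+", matching the ">= 80" rule on the slide.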
Motivation: Happy Hour with Brock and Donald


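     [Figure: two squares. Left: Me and Brock are joined through shared songs by "-" edges,
      Brock has a "+" edge to Song 1, and my edge to Song 1 is "?".
      Right: Me and Donald are joined through shared songs by "+" edges,
      Donald has a "+" edge to Song 2, and my edge to Song 2 is "?".]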

     •  Happy hour chat: with Brock, there are 3 songs that we
        both hate; with Donald, we find 3 songs we both love.

     •  Now, Brock loves Song 1 and Donald loves Song 2

     •  Am I more likely to love Song 1 or Song 2?

     •  Main idea: the presence of certain types of squares may be
        highly indicative of love/hate; so, just count them!
The Square Counting Method: How to Count

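     Configuration No.   Top edge   Right edge   Bottom edge
     0                   -          -            -
     1                   +          -            -
     2                   -          +            -
     3                   +          +            -
     4                   -          -            +
     5                   +          -            +
     6                   -          +            +
     7                   +          +            +

     Each square has the unobserved edge ("?") on its left side; in the
     figure the configuration No. is denoted in the middle of the square.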

•  Given a user-item pair (utg-itg): count the number of squares of each
   configuration and form a feature vector

•  For example, in the right Fig., the path (utg-i1-u1-itg), which has a
   sign sequence of {-,+,-}, corresponds to configuration No. 2
   (see left Fig.); thus, the count for configuration No. 2 is 1.
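
A minimal Python sketch of the counting step follows. It is not the authors' implementation (a later slide says theirs was written in C++/OpenMP); the function names are hypothetical. The configuration index is computed as first + 2*middle + 4*last over the path utg-i1-u1-itg, with "+" as 1 and "-" as 0. The weight of 2 on the middle edge reproduces the slide's example ({-,+,-} gives No. 2), but which outer edge gets weight 1 and which gets weight 4 is an assumption.

```python
from collections import defaultdict


def build_network(ratings, threshold=80):
    """Index signed edges by user and by item; sign 1 = love, 0 = hate."""
    by_user = defaultdict(dict)
    by_item = defaultdict(dict)
    for user, item, score in ratings:
        sign = 1 if score >= threshold else 0
        by_user[user][item] = sign
        by_item[item][user] = sign
    return by_user, by_item


def count_squares(by_user, by_item, u_tg, i_tg):
    """Count squares u_tg - i1 - u1 - i_tg for each of the 8 sign configurations."""
    counts = [0] * 8
    for i1, s_first in by_user.get(u_tg, {}).items():
        if i1 == i_tg:
            continue
        for u1, s_mid in by_item[i1].items():
            if u1 == u_tg:
                continue
            s_last = by_user[u1].get(i_tg)
            if s_last is None:
                continue  # u1 never rated the target item, so no square closes
            # Assumed bit weights: first edge = 1, middle edge = 2, last edge = 4.
            counts[s_first + 2 * s_mid + 4 * s_last] += 1
    return counts


# One square with sign sequence {-, +, -}, as in the slide's example.
ratings = [("u_tg", "i1", 30), ("u1", "i1", 95), ("u1", "i_tg", 10)]
by_user, by_item = build_network(ratings)
features = count_squares(by_user, by_item, "u_tg", "i_tg")
print(features)
```

The resulting list of eight counts is the feature vector for the pair (utg, itg).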
The Square Counting Method: Machine Learning

    •  Counts for different square configurations form the features.

    •  Construct the validation set with user-item pairs with known ratings.

    •  Machine learning framework:

    1.    Perform square counting on rating network for each user-item pair in the
          validation set and generate the validation instance-feature matrix.

    2.    Train a machine learned classifier on validation instance-feature matrix.

    3.    Repeat square counting on the rating network for the test set and generate the
          test instance-feature matrix.

    4.    Apply the machine learned classifier to each instance in the test
          instance-feature matrix.

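
The four steps can be sketched end to end as below. This is an illustration under assumptions: a plain gradient-descent logistic regression stands in for whatever classifier and tooling the authors used (a later slide reports logistic regression coefficients), the function names are hypothetical, and the tiny feature matrices are toy values, not counts from the dataset.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, row):
    """Probability that the user loves the item, given its square counts."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b)


def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and intercept by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for row, label in zip(X, y):
            err = predict_proba(w, b, row) - label
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Steps 1-2: toy validation instance-feature matrix (8 square counts per pair)
# with known labels (1 = highly rated), then train.
X_val = [[3, 0, 0, 0, 0, 0, 0, 0], [2, 0, 0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0, 0, 3], [0, 0, 0, 0, 0, 0, 0, 2]]
y_val = [0, 0, 1, 1]
w, b = train_logistic(X_val, y_val)

# Steps 3-4: toy test instance-feature matrix, then apply the classifier.
X_test = [[4, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 4]]
test_scores = [predict_proba(w, b, row) for row in X_test]
print(test_scores)
```

In practice the feature rows would come from square counting on the rating network rather than being typed in by hand.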
KDD Cup Track 2-Yahoo! Music Dataset


•  Goal is to develop algorithms that separate the items a user rated
   highly (score >= 80) from those the user did not.

•  For each user in the test set, 6 songs were given; out of the 6
   songs, 3 songs were highly rated by the user and 3 songs were
   not (task is to distinguish them)

•  Winners are determined by the error rate on a hold-out test set

                               Statistic          Count
                               Users              249,012
                               Items              296,111
                               Ratings            62,551,438
                               Training Ratings   61,944,406
                               Test Ratings       607,032
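
A short Python sketch of how the 3-of-6 task and the error rate fit together. The slides do not say how classifier scores were turned into labels, so marking the higher-scoring half of each user's candidates as highly rated is an assumption, and the function names and toy scores are hypothetical.

```python
def label_top_half(scores):
    """Mark the higher-scoring half of one user's candidate items as highly rated."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    loved = set(ranked[: len(ranked) // 2])
    return {item: item in loved for item in scores}


def error_rate(predictions, truth):
    """Fraction of items whose predicted label differs from the true label."""
    wrong = sum(predictions[item] != truth[item] for item in truth)
    return wrong / len(truth)


# Toy classifier scores for one user's six candidate songs.
scores = {"s1": 0.9, "s2": 0.8, "s3": 0.7, "s4": 0.3, "s5": 0.2, "s6": 0.1}
truth = {"s1": True, "s2": True, "s3": False, "s4": True, "s5": False, "s6": False}

predictions = label_top_half(scores)
rate = error_rate(predictions, truth)
print(rate)
```

Because exactly half of each user's candidates are positives, every false positive is paired with a false negative, so errors for a user always come in twos under this labeling rule.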
Summary of Results-KDD Cup Track 2




    •  Enhancements
       –  Normalizing square counts against random network model
       –  Separate counts based on item hierarchy
       –  Further edge categorization
       –  Removing very popular items
       –  Using bias-removed scores

    •  Square counting
       –  Generate feature-instance matrix
       –  Implemented in C++/OpenMP
       –  ~ 5 hr on 8-core workstation (2 GB RAM)

    •  Machine learning: ~1 hr

Hate is a Powerful Signal in Predicting Love




    •  Logistic regression coefficients (in units of 10^-3) for each love-hate
       square configuration in predicting a user's highly rated items

    •  Interesting observation: the most powerful configs for predicting
       a user’s love for an item come from hate edges: config. No.
       1 & 4 (2nd top row; 1st bottom row).

    •  Config. No. 1 (2nd top row) means: Item X is recommended
       to you because you hate items Y and Z!
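
For reference, the standard logistic regression model behind such coefficients can be written as follows. The symbols are assumptions for illustration: c_k is the count of configuration No. k for a user-item pair, w_k its coefficient, and b the intercept; the model is shown for the basic eight counts only, without the enhancements listed earlier.

```latex
\[
P(\text{love} \mid c_0, \dots, c_7)
  = \frac{1}{1 + \exp\!\bigl(-(b + \sum_{k=0}^{7} w_k c_k)\bigr)},
\qquad
\log \frac{P}{1 - P} = b + \sum_{k=0}^{7} w_k c_k .
\]
```

Under this model each additional square of configuration k shifts the log-odds of love by w_k, so a large positive coefficient on a configuration containing hate edges is what makes hate a useful signal.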
