Just Count the Love-Hate Squares

This document proposes a method for recommender systems that counts different configurations ("squares") in the user-item bipartite rating network to predict whether a user will rate an item highly. It involves counting the number of each configuration for every user-item pair to generate features, then training a machine learning classifier on these features. The method was applied to the KDD Cup 2011 Yahoo! Music Dataset competition and achieved competitive results, with enhancements like normalizing against random networks and separating counts based on item hierarchy. Interestingly, configurations involving "hate" edges were most predictive of a user's potential love for an item.

Just Count the Love-Hate Squares: a Rating Network Based Method for Recommender Systems
KDD Cup 2011
August 21, 2011

Joseph Kong, Kyle Teague, Justin Kessler

Approved for public release by Northrop Grumman Information Systems, ISHQ-2011-0042

Link Prediction in Bipartite Rating Network

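[Figure: a bipartite rating network linking Users A and B to Items 1-4, shown twice. In the first panel the observed edges carry the scores 80, 20, 100, 90 and 50; in the second panel the same edges carry the signs +, -, +, +, -. One edge, marked "?", is unobserved.]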
•  Solid edges represent the observed rating pattern

•  Score >= 80: I-love-it ("+"); score < 80: I-hate-it ("-")

•  Goal: predict whether an unobserved link would be highly rated
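
The thresholding rule above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the assignment of the five slide scores to particular user-item pairs is an assumption, since the figure's edge layout is not recoverable here.

```python
# Threshold taken from the slide: score >= 80 is "+", anything lower is "-".
LOVE_THRESHOLD = 80

def rating_sign(score):
    """Map a numeric rating to a love ("+") or hate ("-") edge sign."""
    return "+" if score >= LOVE_THRESHOLD else "-"

def sign_edges(ratings):
    """ratings: dict (user, item) -> score. Returns dict (user, item) -> sign."""
    return {pair: rating_sign(score) for pair, score in ratings.items()}

# Hypothetical toy network reusing the five scores shown on the slide;
# which user rated which item is an assumption.
ratings = {("A", 1): 80, ("A", 2): 20, ("B", 2): 100, ("B", 3): 90, ("B", 4): 50}
signs = sign_edges(ratings)
```

A score of exactly 80 counts as "+", matching the ">= 80" rule on the slide.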

Motivation: Happy Hour with Brock and Donald

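[Figure: two panels. In one, Brock and I are joined through three songs by "-" edges, and Brock has a "+" edge to Song 1. In the other, Donald and I are joined through three songs by "+" edges, and Donald has a "+" edge to Song 2. My own edge to each song is marked "?".]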

•  Happy hour chat: with Brock, there are 3 songs that we
both hate; with Donald, we find 3 songs we both love.

•  Now, Brock loves Song 1 and Donald loves Song 2

•  Am I more likely to love Song 1 or Song 2?

•  Main idea: the presence of a certain type of square may be
highly indicative of love/hate; so, just count them!

The Square Counting Method: How to Count

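[Figure: the eight square configurations, numbered 0 to 7, one for each combination of "+" and "-" signs on the three observed edges that close the unobserved edge ("?") into a square. The configuration number is shown in the middle of each square.]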

•  Given a target user-item pair (utg, itg): count the number of squares of each
configuration to form a feature vector

•  For example, in the right figure, the path (utg-i1-u1-itg), which has a
sign sequence of {-,+,-}, corresponds to configuration No. 2
(see the left figure); thus, the count for configuration No. 2 is 1.
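
A minimal sketch of the counting step, assuming signed edges are stored as a dict keyed by (user, item). The helper names are hypothetical. The numbering convention is also an assumption: the sign sequence along utg-i1-u1-itg is read as a 3-bit binary number with "+" as 1, which reproduces the {-,+,-} -> No. 2 example but cannot be confirmed from that example alone, because the sequence reads the same in both directions.

```python
from collections import defaultdict

def build_index(signed_edges):
    """signed_edges: dict (user, item) -> "+" or "-". Returns two adjacency maps."""
    by_user = defaultdict(dict)  # user -> {item: sign}
    by_item = defaultdict(dict)  # item -> {user: sign}
    for (user, item), sign in signed_edges.items():
        by_user[user][item] = sign
        by_item[item][user] = sign
    return by_user, by_item

def config_number(s1, s2, s3):
    # Assumed numbering: read (s1, s2, s3) as a 3-bit number, "+" = 1, "-" = 0.
    bits = [1 if s == "+" else 0 for s in (s1, s2, s3)]
    return 4 * bits[0] + 2 * bits[1] + bits[2]

def count_squares(u_tg, i_tg, by_user, by_item):
    """Count paths u_tg - i1 - u1 - i_tg by sign configuration (8 counters)."""
    counts = [0] * 8
    for i1, s1 in by_user.get(u_tg, {}).items():
        if i1 == i_tg:
            continue  # the target edge itself is what we are predicting
        for u1, s2 in by_item[i1].items():
            if u1 == u_tg:
                continue
            s3 = by_user[u1].get(i_tg)
            if s3 is not None:
                counts[config_number(s1, s2, s3)] += 1
    return counts

# The single path from the slide: sign sequence {-, +, -}.
edges = {("utg", "i1"): "-", ("u1", "i1"): "+", ("u1", "itg"): "-"}
by_user, by_item = build_index(edges)
counts = count_squares("utg", "itg", by_user, by_item)

# The Brock scenario: three songs we both hate, and Brock loves Song 1.
brock = {("me", s): "-" for s in "abc"}
brock.update({("brock", s): "-" for s in "abc"})
brock[("brock", "song1")] = "+"
brock_counts = count_squares("me", "song1", *build_index(brock))
```

Under the assumed numbering, the slide's example path lands in configuration No. 2, and the three Brock squares (sign sequence {-,-,+}) land in configuration No. 1.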
The Square Counting Method: Machine Learning

•  Counts for different square configurations form the features.

•  Construct the validation set with user-item pairs with known ratings.

•  Machine learning framework:

1.  Perform square counting on the rating network for each user-item pair in the
validation set and generate the validation instance-feature matrix.

2.  Train a machine-learned classifier on the validation instance-feature matrix.

3.  Repeat square counting on the rating network for the test set and generate the
test instance-feature matrix.

4.  Apply the machine-learned classifier to each instance in the test
instance-feature matrix.
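
The four steps can be sketched end to end with a small logistic regression, the classifier type named later in the deck. Everything below is an illustrative assumption: the feature rows are made-up square counts rather than real data, and the plain gradient-descent trainer stands in for whatever library and settings the authors used.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=500):
    """Batch gradient descent on log loss. X: list of feature rows, y: 0/1 labels."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(w, b, x):
    """Estimated probability that the user-item pair is highly rated."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Steps 1-2: hypothetical validation instance-feature matrix
# (8 square counts per row) with known labels (1 = highly rated).
X_val = [[0, 3, 0, 0, 0, 0, 0, 0],
         [0, 2, 0, 0, 0, 0, 0, 1],
         [3, 0, 0, 0, 0, 0, 0, 0],
         [2, 0, 0, 0, 0, 0, 1, 0]]
y_val = [1, 1, 0, 0]
w, b = train_logistic(X_val, y_val)

# Steps 3-4: hypothetical test instance-feature matrix, scored by the classifier.
X_test = [[0, 4, 0, 0, 0, 0, 0, 0],
          [4, 0, 0, 0, 0, 0, 0, 0]]
scores = [predict_proba(w, b, x) for x in X_test]
```

The learned weights play the role of the per-configuration coefficients: a positive weight means more squares of that configuration push the prediction toward "love".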

KDD Cup Track 2-Yahoo! Music Dataset

•  The goal is to develop algorithms that separate the items a user rated
highly (score >= 80) from the items the user did not.

•  For each user in the test set, 6 songs were given; of these, 3 were
highly rated by the user and 3 were not (the task is to tell them apart)

•  Winners are determined by the error rate on a hold-out test set
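
Because each test user has exactly 3 loved songs among 6, one natural way to turn classifier scores into labels is to mark the top-scoring half as loved. The slides do not say whether the authors did this; the sketch below is an assumption, with hypothetical function names and made-up scores, and it also shows how an error rate would be computed.

```python
def pick_top_half(scores_by_item):
    """Label the top-scoring half of a user's candidate items 1 (loved), the rest 0."""
    ranked = sorted(scores_by_item, key=scores_by_item.get, reverse=True)
    loved = set(ranked[: len(ranked) // 2])
    return {item: int(item in loved) for item in scores_by_item}

def error_rate(predicted, actual):
    """Fraction of items whose predicted label differs from the true label."""
    wrong = sum(predicted[item] != actual[item] for item in actual)
    return wrong / len(actual)

# Hypothetical classifier scores and true labels for one test user's 6 songs.
scores = {"s1": 0.9, "s2": 0.8, "s3": 0.4, "s4": 0.7, "s5": 0.2, "s6": 0.1}
truth = {"s1": 1, "s2": 1, "s3": 1, "s4": 0, "s5": 0, "s6": 0}

predicted = pick_top_half(scores)
err = error_rate(predicted, truth)
```

In this toy case s4 outranks s3, so both are mislabeled and the error rate is 2 out of 6. With a fixed 3/3 split, mistakes always come in such pairs.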

Statistic           Count
Users               249,012
Items               296,111
Ratings             62,551,438
Training Ratings    61,944,406
Test Ratings        607,032

Summary of Results-KDD Cup Track 2

•  Enhancements
–  Normalizing square counts against random network model
–  Separate counts based on item hierarchy
–  Further edge categorization
–  Removing very popular items
–  Using bias-removed scores

•  Square counting
–  Generate feature-instance matrix
–  Implemented in C++/OpenMP
–  ~5 hr on 8-core workstation (2 GB RAM)

•  Machine learning: ~1 hr
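
The slides do not specify which random network model was used for normalizing square counts. As one possible reading, the sketch below assumes a null model in which every edge is "+" independently with probability p_love, and divides each observed count by its expected count under that model. The function names, the choice of null model, and the eps guard are all assumptions made for illustration.

```python
def expected_counts(counts, p_love):
    """Expected count of each configuration if the observed squares had random signs."""
    total = sum(counts)
    expected = []
    for config in range(8):
        n_plus = bin(config).count("1")  # number of "+" edges in this configuration
        prob = (p_love ** n_plus) * ((1 - p_love) ** (3 - n_plus))
        expected.append(total * prob)
    return expected

def normalize_counts(counts, p_love, eps=1e-9):
    """Ratio of observed to expected counts; eps avoids division by zero."""
    expected = expected_counts(counts, p_love)
    return [obs / (exp + eps) for obs, exp in zip(counts, expected)]

# Hypothetical counts for one user-item pair, with half of all edges being "+".
counts = [0, 3, 0, 0, 0, 0, 0, 1]
expected = expected_counts(counts, 0.5)
normalized = normalize_counts(counts, 0.5)
```

A ratio above 1 means a configuration appears more often around this pair than random signs would produce. The number of "+" edges in a configuration does not depend on how the configurations are numbered, so this sketch works with any bit order.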

Hate is a Powerful Signal in Predicting Love

•  Logistic regression coefficients (in units of 10^-3) for each love-hate
square configuration in predicting a user's highly rated items

•  Interesting observation: the most powerful configurations for predicting
a user's love for an item come from hate edges: configurations No.
1 and No. 4 (2nd in the top row; 1st in the bottom row).

•  Config. No. 1 (2nd in the top row) means: Item X is recommended
to you because you hate items Y and Z!