Towards Seed-Free Music Playlist Generation:
Enhancing Collaborative Filtering with Playlist Title Information
Jaehun Kim, Minz Won, Cynthia C. S. Liem, Alan Hanjalic
Preliminary Approach
First attempt : WRMF
● Good Old MF
○ Weighted Regularized Matrix Factorization [1]
■ Developed for implicit feedback
■ ALS* optimization : fast and reliable
■ Only 2-3 hyperparameters
*Alternating Least Squares (or Coordinate Descent)
R ≈ U × V
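The factorization R ≈ U × V can be sketched with a small dense ALS loop. This is a toy illustration of the weighted objective from Hu et al. [1], not the code used for the challenge; the function name `wrmf_als`, the confidence form `1 + alpha * R`, and the default values are assumptions chosen for the example.

```python
import numpy as np

def wrmf_als(R, n_factors=8, alpha=40.0, reg=0.1, n_iters=10, seed=0):
    """Toy dense ALS for weighted regularized MF on a (playlists x tracks) matrix R."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = R.shape
    P = (R > 0).astype(float)   # binary preference: was the track in the playlist?
    C = 1.0 + alpha * R         # confidence weights (assumed linear form)
    U = 0.01 * rng.standard_normal((n_rows, n_factors))
    V = 0.01 * rng.standard_normal((n_cols, n_factors))
    reg_eye = reg * np.eye(n_factors)

    def solve_side(X, Y, conf, pref):
        # Closed-form update per row i: (Y^T C_i Y + reg I) x_i = Y^T C_i p_i
        for i in range(X.shape[0]):
            Yc = Y * conf[i][:, None]   # rows of Y scaled by confidence
            X[i] = np.linalg.solve(Yc.T @ Y + reg_eye, Yc.T @ pref[i])

    for _ in range(n_iters):
        solve_side(U, V, C, P)       # fix V, update playlist factors
        solve_side(V, U, C.T, P.T)   # fix U, update track factors
    return U, V

# Two groups of playlists with disjoint tracks.
R = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
U, V = wrmf_als(R, n_factors=2)
scores = U @ V.T   # predicted preference for every playlist/track pair
```

The three knobs here (`n_factors`, `alpha`, `reg`) correspond to the "2-3 hyperparameters" on the slide. A playlist with no seed tracks has an all-zero row in R, so its factors carry no information, which is the cold-start case discussed next.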
[Figures: WRMF results; surviving labels: "NO SEED", "Average", "Winner"]
First attempt : WRMF
● Good Old MF
○ Already reasonable performance
○ Except No-Seed case => Cold Start Problem
● Any metadata or content for playlist?
○ Playlist titles!
Playlist Titles
● Text information (implicitly) represents the playlist
● Some key statistics
Statistic                            Value
# of titles (MPD + Challenge Set)    1,010,000
# of unique titles                   93,250
# of unique titles (stemmed)         49,808
Single word                          ~60%
Two words or fewer                   ~92%
1. Playlist titles ~= Words
● Noisiness
Categories             Examples
Special characters     //Pretty Little Liars//, ** some tunes, ?!?
Repeated characters    Yaaaas, summerrrr, partayyy
Shortened words        Chillin, Temp, favss
Abbreviated words      loml, jb, IDFK, jjjj
Symbolic expressions
Multiple languages     otoño, 電台收藏, アニメ
2. Standard word-level approaches (may be) NOT WORKING
Playlist Titles
● Playlist titles ~= words
● Standard word-level approaches (may be) not working
● Character level approach : Character N-GRAM
Character N-gram
Input text             "Character N-gram" (␣ marks a space)
Unigram (1-gram)       C, h, a, r, a, c, t, e, r, ␣, N, -, g, r, a, m
Bigram (2-gram)        Ch, ha, ar, ra, ac, ct, te, er, r␣, ␣N, N-, -g, ...
Trigram (3-gram)       Cha, har, ara, rac, act, cte, ter, er␣, r␣N, ␣N-, ...
Four-gram (4-gram)     Char, hara, arac, ract, acte, cter, ter␣, er␣N, ...
...                    ...
[Figure: bag-of-character-n-grams. Each title is encoded as a 0/1 vector over the n-gram vocabulary (Cha, har, ..., ter, ...).]
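Character n-gram extraction and the resulting bag can be sketched in a few lines. The function names, the lower-casing step, and the choice of n = 1, 2, 3 are assumptions for illustration; the slides do not state which preprocessing or n values were used.

```python
from collections import Counter

def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def bag_of_char_ngrams(title, n_values=(1, 2, 3)):
    """Count the character n-grams of a lower-cased title for several n."""
    title = title.lower()   # assumed normalization, not taken from the slides
    bag = Counter()
    for n in n_values:
        bag.update(char_ngrams(title, n))
    return bag

print(char_ngrams("Character N-gram", 3)[:6])
# ['Cha', 'har', 'ara', 'rac', 'act', 'cte']
```

Because the grams overlap, noisy variants such as "summerrrr" and "summer" still share most of their n-grams, which a word-level vocabulary would miss.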
Title-based RecSys
NGRAM:Similarity Based
● Build bag-of-n-grams for each playlist (train + test set)
● For each testing playlist
○ Find the M closest playlists in the train set using cosine distance
○ Collect tracks from the retrieved playlists
○ Recommend the L most popular tracks
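The retrieval steps above can be sketched as follows. This is a minimal illustration with made-up playlists and track ids; it ranks by cosine similarity (equivalent to ranking by smallest cosine distance), uses trigrams only, and breaks popularity ties by track id, all of which are assumptions rather than details from the slides.

```python
import math
from collections import Counter

def bag(title, n=3):
    """Character n-gram counts of a lower-cased title (n=3 is an assumed choice)."""
    t = title.lower()
    return Counter(t[i:i + n] for i in range(len(t) - n + 1))

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(test_title, train, M=2, L=3):
    """train is a list of (title, track_ids); returns up to L track ids."""
    query = bag(test_title)
    # M closest training playlists by title similarity.
    closest = sorted(train, key=lambda pl: cosine_similarity(query, bag(pl[0])),
                     reverse=True)[:M]
    # Popularity = number of retrieved playlists containing the track.
    counts = Counter()
    for _, tracks in closest:
        counts.update(set(tracks))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [track for track, _ in ranked[:L]]

train = [
    ("summer vibes", ["t1", "t2", "t3"]),
    ("summerrrr", ["t2", "t3", "t4"]),
    ("workout", ["t5", "t6"]),
]
print(recommend("Summer", train))
# ['t2', 't3', 't1']
```

No seed tracks are needed for the query playlist, only its title, which is what makes this usable in the no-seed case.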
[Figure: the bag-of-character-n-grams of each testing playlist is compared with those of the training playlists; the L most popular tracks among the M closest training playlists are recommended.]
Title-based RecSys
:: Model Based
[Figure sequence; surviving labels: "1. (pre-trained)", "Transfer 'Title info'", "2. Substitute", "X. Wang et al. (2014) [2]", "A. Van den Oord et al. (2013) [3]", "1. + 2. ?!"]
Proposed Model
Multi-Objective Collab. Filtering
[Equation slides; surviving labels: "MF for Pre-trained Factors", "Main objective function for", terms "1." and "2."]
RNCF : Recurrent Neural CF
Results
Results:Overall
● It works! (10th on the final leaderboard)
● Superior to simple baselines
● WRMF > SVD
HRNCF : Hybrid RNCF
R ≈ U × V
Filling missing factors
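One possible reading of "filling missing factors" is sketched below: playlists that have seed tracks keep their CF factors, and playlists without seeds take factors predicted from their titles. The function names, the `has_seed` mask, and the toy numbers are hypothetical; the slides do not spell out how HRNCF combines the two sources.

```python
import numpy as np

def fill_missing_factors(U_cf, U_title, has_seed):
    """Keep CF playlist factors where seeds exist, else use title-based factors."""
    has_seed = np.asarray(has_seed, dtype=bool)
    return np.where(has_seed[:, None], U_cf, U_title)

def top_k_tracks(U, V, k):
    """Indices of the k highest-scoring tracks per playlist under R ≈ U V^T."""
    scores = U @ V.T
    return np.argsort(-scores, axis=1)[:, :k]

# Toy data: playlist 0 has seeds, playlist 1 is a no-seed (cold-start) playlist.
U_cf = np.array([[1.0, 0.0], [0.0, 0.0]])      # CF factors; row 1 is uninformative
U_title = np.array([[9.0, 9.0], [0.0, 1.0]])   # factors predicted from titles
V = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])

U = fill_missing_factors(U_cf, U_title, has_seed=[True, False])
print(top_k_tracks(U, V, k=2))
# [[0 2]
#  [1 2]]
```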
Results:Overall
● HRNCF: best of both worlds
● Simple NGRAM distance may have given us very similar performance
Hyperparameters: WRMF
Hyperparameters: RNCF
Discussion & Take away
Take away
● MF is still powerful
● Setting up the right (internal) evaluation is more important than the model
● Software engineering DOES MATTER
○ Since the scalability DOES MATTER
○ Since hyper-parameter tuning DOES MATTER
● Deep learning is not a magic wand
○ No Free Lunch
○ It costs a LOT
● Content-based algorithms still give a small (but significant) gain on top of CF
Thank you!
Code: https://github.com/eldrin/recsys18-spotify-spotif-ai
References
[1] Hu, Yifan, Yehuda Koren, and Chris Volinsky. "Collaborative filtering for implicit feedback datasets." Eighth IEEE International Conference on Data Mining (ICDM'08). IEEE, 2008.
[2] Wang, Xinxi, and Ye Wang. "Improving content-based and hybrid music recommendation using deep learning." Proceedings of the 22nd ACM International Conference on Multimedia. ACM, 2014.
[3] Van den Oord, Aäron, Sander Dieleman, and Benjamin Schrauwen. "Deep content-based music recommendation." Advances in Neural Information Processing Systems. 2013.
