Playlist generation for music exploration by defining sets of source songs and target songs and deriving a playlist through metric learning and boundary constraints.
https://github.com/hank5925/mlmdstp
Metric Learning for Music Discovery with Source and Target Playlists
1. Metric Learning for Music Discovery with Source and Target Playlists
Ying-Shu Kuo
August 12 2015
2. Proposed Idea
No   Name     Artist
1    Song_A   Artist_A
2    Song_B   Artist_B
3    Song_C   Artist_A
4    Song_D   Artist_C
5    Song_E   Artist_B
6    Song_F   Artist_D
7    Song_G   Artist_E
8    Song_H   Artist_E
9    Song_I   Artist_F
[Interface mockup: tabs "Playlist", "Your Set", "Target Set"; a "Search" box; the selected song (Song_A, Artist_A); a "Parameter" control]
[Song map legend: Song, Your Set, Target Set, Others, Chosen, Playlist, Similarity]
※ x-y axis has no meaning
3. Use Case
• Explore an unknown music genre
(e.g. from Jazz to Metal)
• Get to know your friend’s jam
(e.g. from your favs to her favs)
4. What I need for this
1. Songs to play with => Million Song Dataset / Spotify API
2. Music similarity => EchoNest Audio Features
3. Cluster song sets => Metric Learning
4. 2-D Visualization => Dimension Reduction
5. Playlist Generation
5. Million Song Dataset
• Criteria for a good dataset
• Why use MSD?
Bertin-Mahieux, Thierry, et al. "The million song dataset." ISMIR 2011: Proceedings of the 12th International Society for Music Information Retrieval Conference, October 24-28, 2011, Miami, Florida. University of Miami, 2011.
http://audiocontentanalysis.org/data-sets
Dataset        RWC   CAL500   GTZAN        MusiCLEF   MSD
size           465   502      1,000        200,000    1,000,000
has audio      Y     Y        Y            Y          N*
has metadata   Y     Y        Y (update)   ?          Y
* Part of the MSD has 7digital audio previews. All of its songs have content-based features.
6. EchoNest Feature
• Metadata: artist name / song title / album name / year / duration
• Low-level: segment time / loudness / pitch / timbre
• Time: tempo / time signature / section time / bar time …
http://developer.echonest.com/docs/v4/_static/AnalyzeDocumentation.pdf
8. Metric Learning
• Metric: defines how the distance between data points is measured
http://en.wikipedia.org/wiki/File:Manhattan_distance.svg
Bellet, Aurélien, Amaury Habrard, and Marc Sebban. "A survey on metric learning for feature vectors and structured data." arXiv preprint arXiv:1306.6709 (2013).
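To make the idea of a metric concrete, here is a minimal sketch comparing two standard choices, the Manhattan distance (pictured in the linked figure) and the Euclidean distance. The function names and the sample points are illustrative only and do not come from the slides.

```python
import math


def manhattan(x, y):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x, y):
    """Straight-line distance (L2 distance)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Two hypothetical 2-D feature vectors.
a = (0.0, 0.0)
b = (3.0, 4.0)

print(manhattan(a, b))  # 7.0
print(euclidean(a, b))  # 5.0
```

The same pair of points is 7 apart under one metric and 5 apart under the other, which is why the choice of metric matters for any similarity-based method.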
10. Metric Learning
• Metric learning: learning the distance function from data
Bellet, Aurélien, Amaury Habrard, and Marc Sebban. "A survey on metric learning for feature vectors and structured data." arXiv preprint arXiv:1306.6709 (2013).
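A common way to parameterize a learned distance, covered in the survey cited above, is a Mahalanobis-style distance: apply a linear map L to the feature vectors and measure the Euclidean distance afterwards. The sketch below only evaluates such a distance for a given L; how L is learned is not shown, and the matrices and vectors are made-up examples rather than values from the slides.

```python
import math


def transform(L, x):
    """Apply the linear map L (a list of rows) to the vector x."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]


def learned_distance(L, x, y):
    """Euclidean distance after mapping the difference x - y through L."""
    diff = [a - b for a, b in zip(x, y)]
    z = transform(L, diff)
    return math.sqrt(sum(v * v for v in z))


# The identity map reproduces the plain Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]
# A hypothetical "learned" map that doubles the weight of the first feature.
stretch_first = [[2.0, 0.0], [0.0, 1.0]]

x, y = [0.0, 0.0], [3.0, 4.0]
print(learned_distance(identity, x, y))
print(learned_distance(stretch_first, x, y))
```

Changing L changes which feature differences count as "far", which is the knob a metric-learning method tunes.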
12. Metric Learning – LMNN
Large Margin Nearest Neighbor
Weinberger, Kilian Q., John Blitzer, and Lawrence K. Saul. "Distance metric learning for large margin nearest neighbor classification." Advances in neural information processing systems. 2005.
NOT the unlabeled one!!!
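For reference, the LMNN objective from the paper cited above can be written as follows. The notation follows that paper, not these slides: L is the linear map being learned, η_ij ∈ {0, 1} marks x_j as a target neighbor of x_i, y_il ∈ {0, 1} is 1 when x_i and x_l share a label, c is a positive constant, and [z]_+ = max(z, 0).

```latex
\varepsilon(L) = \sum_{ij} \eta_{ij} \,\lVert L(x_i - x_j) \rVert^2
  + c \sum_{ijl} \eta_{ij} \,(1 - y_{il})
    \left[ 1 + \lVert L(x_i - x_j) \rVert^2 - \lVert L(x_i - x_l) \rVert^2 \right]_+
```

The first term pulls same-label target neighbors together; the second pushes differently labeled points at least one unit of margin farther away than those neighbors.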
13. Metric Learning – GB-LMNN
Gradient-Boosted Large Margin Nearest Neighbor
Kedem, Dor, Zhixiang Eddie Xu, and Kilian Q. Weinberger. "Gradient Boosted Large Margin Nearest Neighbors."
• Kernel trick, non-linear
• Gradient Boosted Regression Tree
14. Metric Learning – Evaluation
• Do the starting / ending songs cluster?
• Davies–Bouldin Index
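The Davies–Bouldin index can be computed as in the sketch below, using its standard definition: for each cluster, take the worst-case ratio of summed within-cluster scatter to between-centroid distance, then average over clusters. Lower values mean tighter, better-separated clusters. The two toy clusters stand in for a source set and a target set and are assumptions for illustration, not data from the slides.

```python
import math


def dist(x, y):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def centroid(points):
    """Coordinate-wise mean of a list of vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of vectors)."""
    centers = [centroid(c) for c in clusters]
    # Scatter: average distance of each cluster's points to its centroid.
    scatter = [sum(dist(p, ctr) for p in c) / len(c)
               for c, ctr in zip(clusters, centers)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j]) / dist(centers[i], centers[j])
                     for j in range(k) if j != i)
    return total / k


# Hypothetical source and target sets in a 2-D feature space.
source = [[0.0, 0.0], [0.0, 2.0]]
far_target = [[10.0, 0.0], [10.0, 2.0]]
near_target = [[3.0, 0.0], [3.0, 2.0]]

print(davies_bouldin([source, far_target]))
print(davies_bouldin([source, near_target]))
```

The well-separated pair yields the smaller index, so a metric that separates the two song sets better should lower this score.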
15. Metric Learning – Evaluation
10 vs 10   ø      LMNN    GB-LMNN   OASIS
average    9.46   10.85   5.62      12.49
max        –      16.43   15.66     13.25
min        –      8.89    0.61      11.99
16. Dimension Reduction
• High dimension to low dimension based on constraints
• Keep the distance between data the same
• 2-D visualization
Van der Maaten, Laurens, and Geoffrey Hinton. "Visualizing data using t-SNE." Journal of Machine Learning Research 9 (2008): 2579-2605.
17. Dimension Reduction – t-SNE
http://commons.wikimedia.org/wiki/File:T_distribution_1df_enhanced.svg
Van der Maaten, Laurens, and Geoffrey Hinton. "Visualizing data using t-SNE." Journal of Machine Learning Research 9 (2008): 2579-2605.
• Pairwise distance
• Effective neighbors = local
• Gaussian vs t-distribution
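The "Gaussian vs t-distribution" point can be illustrated with the two unnormalized similarity kernels t-SNE uses: a Gaussian on distances in the original space and a Student-t kernel with one degree of freedom in the 2-D map. The sketch below only compares the kernels at a few distances; it is not a t-SNE implementation, and the bandwidth `sigma=1.0` is an arbitrary assumption.

```python
import math


def gaussian_kernel(d, sigma=1.0):
    """Unnormalized Gaussian similarity for a pairwise distance d."""
    return math.exp(-d * d / (2.0 * sigma * sigma))


def student_t_kernel(d):
    """Unnormalized Student-t similarity (one degree of freedom)."""
    return 1.0 / (1.0 + d * d)


# Compare how quickly each similarity decays as the distance grows.
for d in (0.0, 1.0, 3.0, 5.0):
    print(d, gaussian_kernel(d), student_t_kernel(d))
```

At large distances the Student-t kernel stays far above the Gaussian. This heavy tail lets moderately dissimilar points sit far apart in the 2-D map without a large penalty.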
19. Playlist Generation – Related Work
                          Zheleva et al. [1]                   McFee et al. [2]         Chen et al. [3]    mine
assumption / constraint   matching user taste and song taste   natural language         natural language   2 clusters, smooth
input (dataset)           triplet (user, song, t)              tag 0/1; content-based   playlists          content-based
approach                  topic model (LDA)                    Markov chain ensemble    Markov chain       nearest neighbors
evaluation                entropy-based                        log likelihood           log likelihood     ?
[1] Zheleva, Elena, et al. "Statistical models of music-listening sessions in social media." Proceedings of the 19th international conference on World wide web. ACM, 2010.
[2] McFee, Brian, and Gert RG Lanckriet. "The Natural Language of Playlists." ISMIR. 2011.
[3] Chen, Shuo, et al. "Playlist prediction via metric embedding." Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2012.
20. Playlist Generation – Related Work
                          Flexer [4]                 Van Gulik [5]                    Lamere [6]               mine
assumption / constraint   specifying start and end   high-level control of playlist   boil the frog            2 clusters, smooth
input (dataset)           content-based              songs with metadata              songs with artist info   content-based
approach                  divergence ratio           visualization path drawing       artist similarity        nearest neighbors
evaluation                same genre                 –                                –                        ?
[4] Flexer, Arthur, et al. "Playlist Generation using Start and End Songs." ISMIR. 2008.
[5] Van Gulik, Rob, and Fabio Vignoli. "Visual Playlist Generation on the Artist Map." ISMIR. Vol. 5. 2005.
[6] http://static.echonest.com/frog/
21. Playlist Generation – Method
• number of songs
• threshold
http://www.pstcc.edu/departments/natural_behavioral_sciences/Web%20Physics/E2020D0103.gif
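The slides name only two parameters (number of songs, threshold) and a nearest-neighbor approach, so the following is a loose sketch of one way such a generator could work, not the author's method. It assumes songs are already embedded in a learned feature space, walks evenly spaced waypoints from the source-set centroid to the target-set centroid, and at each waypoint picks the nearest unused song outside both sets if it lies within the threshold. `generate_playlist`, its arguments, and the toy data are all hypothetical.

```python
import math


def dist(x, y):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def centroid(points):
    """Coordinate-wise mean of a list of vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def generate_playlist(songs, source_ids, target_ids, n_songs, threshold):
    """Pick up to n_songs songs along the line from source to target.

    songs maps a song id to its feature vector (assumed already transformed
    by the learned metric). A waypoint is skipped when no remaining song lies
    within `threshold` of it.
    """
    start = centroid([songs[s] for s in source_ids])
    end = centroid([songs[t] for t in target_ids])
    excluded = set(source_ids) | set(target_ids)
    playlist = []
    for step in range(1, n_songs + 1):
        t = step / (n_songs + 1)
        waypoint = [a + t * (b - a) for a, b in zip(start, end)]
        candidates = [sid for sid in songs
                      if sid not in excluded and sid not in playlist]
        if not candidates:
            break
        best = min(candidates, key=lambda sid: dist(songs[sid], waypoint))
        if dist(songs[best], waypoint) <= threshold:
            playlist.append(best)
    return playlist


# Toy 2-D embedding; ids and coordinates are made up for illustration.
songs = {
    "A": [0.0, 0.0], "B": [1.0, 0.0], "C": [2.0, 0.1],
    "D": [3.0, 0.0], "E": [4.0, 0.0], "F": [2.0, 5.0],
}
print(generate_playlist(songs, ["A"], ["E"], n_songs=3, threshold=0.5))
```

With a looser threshold the playlist fills every waypoint; tightening it drops songs that sit too far from the straight path, trading length for smoothness.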