This document summarizes a research paper that aimed to enable more robust emotion-based music recommendation by predicting how musical mood distributions vary over time using Kalman filtering. It discusses extracting audio features from music, collecting ground truth emotion data from users, preprocessing the data, running experiments to predict time-varying emotion distributions, and analyzing the results. The researchers were able to form accurate estimates of how the emotional distribution evolves over time using a Kalman filtering approach.
1. Prediction of Time-Varying Musical Mood Distributions Using Kalman Filtering
Researched by
Erik M. Schmidt and Youngmoo E. Kim
Music and Entertainment Technology Laboratory (MET-lab)
Presented by
Sanjoy Dutta
Roll: 0907008
Department of Computer Science & Engineering
Khulna University of Engineering & Technology
7. Ground Truth Data Collection
Participants use the graphical interface of the MoodSwings game to indicate a dynamic position within the arousal-valence (A-V) space, annotating five 30-second music clips. Each subject provides a check against the other, reducing the probability of nonsense labels.
8. MoodSwings Lite Corpus
The researchers developed a reduced dataset of 15-second music clips from 240 songs, selected using the original label set to approximate an even distribution across the four primary quadrants of the A-V space. These clips were given intense focus within the game to form a corpus, referred to here as MoodSwings Lite, with significantly more labels per song clip. This corpus is used in this analysis.
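A minimal sketch of how the quadrant balance described above could be checked. It assumes A-V coordinates centred on the origin and uses hypothetical per-clip mean labels; the function names and the quadrant numbering are illustrative and are not taken from the paper.

```python
from collections import Counter

def av_quadrant(valence, arousal):
    """Return the quadrant (1-4) of a point in an origin-centred A-V plane.

    Assumption: zero is the neutral point on both axes.
    """
    if valence >= 0 and arousal >= 0:
        return 1  # positive valence, high arousal
    if valence < 0 and arousal >= 0:
        return 2  # negative valence, high arousal
    if valence < 0 and arousal < 0:
        return 3  # negative valence, low arousal
    return 4      # positive valence, low arousal

def quadrant_counts(points):
    """Count how many (valence, arousal) points fall in each quadrant."""
    return Counter(av_quadrant(v, a) for v, a in points)

# Hypothetical mean labels (valence, arousal) for four clips.
clips = [(0.3, 0.2), (-0.1, 0.4), (-0.2, -0.3), (0.25, -0.1)]
counts = quadrant_counts(clips)
```

A selection procedure could then add clips to whichever quadrant currently has the smallest count until the corpus is roughly balanced.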
9. Data Preprocessing
Time-varying emotion distribution regression results for three example 15-second music clips (markers become
darker as time advances):
Second-by-second labels per song (gray bullet),
Standard deviation of the collected labels over 1-second intervals (red ellipse), and
Standard deviation of the distribution projected from acoustic features in 1-second intervals (blue ellipse).
The covariance ellipses contain a heavy amount of noise, so the data are preprocessed with a Kalman/Rauch-Tung-Striebel (RTS) smoother.
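The Kalman/RTS smoothing step can be illustrated with a small sketch: a forward Kalman filter pass followed by a backward RTS pass over a noisy sequence. The slides do not give the state model or noise settings, so the random-walk dynamics (identity matrices), the noise covariances, and the synthetic A-V track below are all assumptions chosen for illustration.

```python
import numpy as np

def kalman_rts_smooth(z, A, H, Q, R, x0, P0):
    """Kalman filter (forward) followed by an RTS smoother (backward).

    z is a (T, m) array of observations. x0 and P0 describe the state
    before the first observation. Returns smoothed means of shape (T, n)
    and smoothed covariances of shape (T, n, n).
    """
    T, n = z.shape[0], x0.shape[0]
    x_pred = np.zeros((T, n)); P_pred = np.zeros((T, n, n))
    x_filt = np.zeros((T, n)); P_filt = np.zeros((T, n, n))

    x_prev, P_prev = x0, P0
    for t in range(T):
        # Predict the state from the previous filtered estimate.
        x_pred[t] = A @ x_prev
        P_pred[t] = A @ P_prev @ A.T + Q
        # Update the prediction with the observation at time t.
        S = H @ P_pred[t] @ H.T + R
        K = P_pred[t] @ H.T @ np.linalg.inv(S)
        x_filt[t] = x_pred[t] + K @ (z[t] - H @ x_pred[t])
        P_filt[t] = (np.eye(n) - K @ H) @ P_pred[t]
        x_prev, P_prev = x_filt[t], P_filt[t]

    # Backward pass: refine each estimate using the later ones.
    x_s = x_filt.copy(); P_s = P_filt.copy()
    for t in range(T - 2, -1, -1):
        C = P_filt[t] @ A.T @ np.linalg.inv(P_pred[t + 1])
        x_s[t] = x_filt[t] + C @ (x_s[t + 1] - x_pred[t + 1])
        P_s[t] = P_filt[t] + C @ (P_s[t + 1] - P_pred[t + 1]) @ C.T
    return x_s, P_s

# Synthetic example: a 15-step A-V track with added noise (not real data).
rng = np.random.default_rng(0)
truth = np.linspace([-0.2, 0.1], [0.3, 0.4], 15)
noisy = truth + rng.normal(scale=0.1, size=truth.shape)

I2 = np.eye(2)
smoothed, covs = kalman_rts_smooth(
    noisy, A=I2, H=I2, Q=0.002 * I2, R=0.01 * I2, x0=noisy[0], P0=0.1 * I2
)
```

With a random-walk model the ratio of Q to R controls how strongly the sequence is smoothed: a smaller Q relative to R gives a smoother trajectory.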
11. Experiment Analysis
The Kalman mixture provides the best result of any system: using only four clusters, it achieves an average Kullback-Leibler (KL) divergence of 2.881, a significant improvement over both the MLR system at 4.576 and the MLR mixture at 3.179. In terms of mean error, however, the Kalman and MLR mixtures produce nearly identical results, with normalized distances of about 0.109.
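For reference, the KL figures compare a predicted distribution with a ground-truth one. Assuming both are modelled as multivariate Gaussians in the A-V space (suggested by the covariance ellipses, but not stated on this slide), the divergence has a closed form. The sketch below implements that standard formula; the function name, argument order, and example values are illustrative, and which distribution serves as the reference is an assumption.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL divergence KL(N0 || N1) between two multivariate Gaussians.

    Uses the closed form
    0.5 * (tr(S1^-1 S0) + (mu1 - mu0)^T S1^-1 (mu1 - mu0) - k + ln(det S1 / det S0)).
    """
    k = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    # slogdet returns (sign, log|det|); the log-determinant avoids underflow.
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (
        np.trace(cov1_inv @ cov0) + diff @ cov1_inv @ diff - k + logdet1 - logdet0
    )

# Hypothetical ground-truth and predicted A-V distributions for one second.
mu_true, cov_true = np.array([0.1, 0.2]), np.array([[0.02, 0.0], [0.0, 0.03]])
mu_pred, cov_pred = np.array([0.15, 0.1]), np.array([[0.03, 0.0], [0.0, 0.03]])
kl_value = gaussian_kl(mu_true, cov_true, mu_pred, cov_pred)
```

The divergence is zero when the two Gaussians are identical and is not symmetric, so swapping the arguments generally changes the value.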
12. Conclusion
The work analyzes the state of musical emotion through a mathematical representation based on the A-V space. The researchers propose that the most accurate representation of their ground truth is a distribution. Using a Kalman filtering approach, they were able to form robust estimates of this distribution and of how it evolves over time.