Musical genre classification using Nonnegative Matrix Factorization based features
2008 (English). In: IEEE Transactions on Audio, Speech and Language Processing, Vol. 16, no 2, 424-434 p. Article in journal (Refereed), Published.
Nonnegative matrix factorization (NMF) is used to derive a novel description of the timbre of musical sounds. Using NMF, a spectrogram is factorized to provide a characteristic spectral basis. Given a set of spectrograms for a musical genre, the space spanned by the vectors of the obtained spectral bases is modeled statistically using mixtures of Gaussians, resulting in a description of the spectral basis for that genre. This description is shown to improve classification results by up to 23.3% compared to MFCC-based models, while the compression performed by the factorization significantly decreases training time. Using a distance-based stability measure, this compression is shown to reduce the noise present in the data set, resulting in more stable classification models. In addition, we compare the mean squared errors of spectrogram approximations obtained with independent component analysis and with nonnegative matrix factorization, showing the superiority of the latter approach.
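The factorization step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which NMF algorithm the authors used, so this example assumes the standard multiplicative update rules for the squared-error objective. The function name `nmf`, its parameters, and the synthetic stand-in for a magnitude spectrogram are all hypothetical and serve only as illustration. The mixture-of-Gaussians modeling of the resulting basis vectors is not shown.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (freq x frames) as W @ H.

    Assumption: multiplicative updates for the squared-error objective;
    the abstract does not specify the algorithm used in the paper.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random nonnegative initialization of basis (W) and activations (H).
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    errors = []
    for _ in range(n_iter):
        # Elementwise multiplicative updates keep W and H nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Mean squared error of the approximation after this iteration.
        errors.append(float(np.mean((V - W @ H) ** 2)))
    return W, H, errors


# Synthetic stand-in for a magnitude spectrogram: 64 bins, 100 frames,
# built from 3 hidden nonnegative spectral templates.
rng = np.random.default_rng(1)
V = rng.random((64, 3)) @ rng.random((3, 100))

W, H, errors = nmf(V, rank=3)
print("basis shape:", W.shape, "activations shape:", H.shape)
print("MSE first/last iteration:", errors[0], errors[-1])
```

Here the columns of `W` play the role of the spectral basis vectors, and `errors` tracks the mean squared error of the approximation, the quantity the abstract uses to compare NMF with independent component analysis.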
Place, publisher, year, edition, pages
IEEE Press, 2008. Vol. 16, no 2, 424-434 p.
Keywords
Audio classification; Audio feature extraction; Music information retrieval; Nonnegative matrix factorization
Research subject
Computer Science; Media Technology; Speech and Music Communication
Identifiers
URN: urn:nbn:se:kth:diva-193764
DOI: 10.1109/TASL.2007.909434
ISI: 000252612100016
ScopusID: 2-s2.0-39649092019
OAI: oai:DiVA.org:kth-193764
DiVA: diva2:1040351