Using the Signature Quadratic Form Distance for Music Information Retrieval
This thesis is an investigation into how the signature quadratic form distance can be used to search in music.
Using the method used for images by Beecks, Uysal and Seidl as a starting point,
I create feature signatures from sound clips by clustering features from their frequency representations.
I compare three different feature types, based on Fourier coefficients, mel frequency cepstrum coefficients (MFCCs), and the chromatic scale.
Two search applications are considered.
First, an audio fingerprinting system, in which a music file is located from a short recorded clip of the song.
I run experiments to see how the system's parameters affect the search quality, and show that it achieves some robustness to noise in the queries, though less than comparable state-of-the-art methods.
Second, a query-by-humming system where humming or singing by one user is used to search in humming/singing by other users.
Here, none of the tested feature types achieves satisfactory search performance. I identify and discuss some possible limitations of the selected feature types for this task.
I believe that this thesis demonstrates the versatility of the feature clustering approach and may serve as a starting point for further research.
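As a rough illustration of the distance named in the title, the following is a minimal sketch of the signature quadratic form distance between two feature signatures, each given as cluster centroids with weights. It is not the implementation used in the thesis: the function name `sqfd`, the Gaussian similarity function, and the parameter `alpha` are assumptions chosen for the example, and the clustering step that produces the signatures is left out.

```python
import numpy as np

def sqfd(centroids_q, weights_q, centroids_o, weights_o, alpha=1.0):
    """Signature quadratic form distance between two feature signatures.

    Each signature is a set of cluster centroids (one row per cluster)
    with one weight per centroid. The Gaussian similarity and `alpha`
    are assumptions made for this sketch.
    """
    # Concatenate the centroids; negate the second signature's weights.
    centroids = np.vstack([centroids_q, centroids_o])
    weights = np.concatenate([np.asarray(weights_q, dtype=float),
                              -np.asarray(weights_o, dtype=float)])

    # Pairwise squared Euclidean distances between all centroids.
    diff = centroids[:, None, :] - centroids[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)

    # Similarity matrix from a Gaussian similarity function (assumed).
    similarity = np.exp(-alpha * sq_dist)

    # Quadratic form w A w^T; clip tiny negatives from rounding error.
    value = weights @ similarity @ weights
    return float(np.sqrt(max(value, 0.0)))
```

Because the two signatures are concatenated before the quadratic form is evaluated, they may have different numbers of clusters, which is what makes the distance applicable to signatures produced by clustering clips of varying content.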
Place, publisher, year, edition, pages: Institutt for datateknikk og informasjonsvitenskap, 2011. 41 p.
ntnudaim:6270, MTDT datateknikk, Komplekse datasystemer
Identifiers: URN: urn:nbn:no:ntnu:diva-14472; Local ID: ntnudaim:6270; OAI: oai:DiVA.org:ntnu-14472; DiVA: diva2:454083
Hetland, Magnus Lie, Associate Professor