Fuzzy Content-Based Audio Retrieval Using Visualization Tools
KTH, School of Electrical Engineering and Computer Science (EECS).
2019 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Innehållsbaserad hämtning av ljud via visualiseringsverktyg (Swedish)
Abstract [en]

Music composition and sound design in the digital domain often involve sifting through large collections of audio files to find the right sample. Traditionally, this means searching through metadata such as filenames and descriptors, either via text search or by manually browsing folders. This paper presents a fast, scalable method for implementing a search engine in which the contents of audio files are used as queries to retrieve similar audio files. The presented approach applies visualization tools to speed up retrieval compared to a simple KD-tree algorithm. Qualitative and quantitative results are presented, and the benefits and drawbacks of the approach are discussed. While the qualitative results show promise, they are deemed inconclusive. The quantitative results show that applying UMAP yields an order-of-magnitude speed-up at a loss of accuracy, and that the approach scales well with larger datasets.
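
As a rough illustration of the retrieval idea summarized in the abstract, the sketch below indexes feature vectors in a KD-tree and answers a nearest-neighbour query twice: once on the full-dimensional vectors and once on a low-dimensional embedding of them. It is not the thesis's implementation. A random linear projection stands in for UMAP, random numbers stand in for real audio descriptors, and every function name, dimension, and dataset size here is a hypothetical choice made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, depth=0):
    """Recursively build a KD-tree from a list of (index, vector) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][1])  # cycle through the coordinate axes
    points = sorted(points, key=lambda p: p[1][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return (squared_distance, index) of the stored vector closest to query."""
    if node is None:
        return best
    idx, vec = node["point"]
    cand = (sq_dist(vec, query), idx)
    if best is None or cand < best:
        best = cand
    diff = query[node["axis"]] - vec[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is within the best distance.
    if diff * diff <= best[0]:
        best = nearest(far, query, best)
    return best


def make_projection(in_dim, out_dim, seed=0):
    """Random linear projection: an assumed stand-in for UMAP, not UMAP itself."""
    rng = random.Random(seed)
    matrix = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(out_dim)]
    return lambda v: [sum(w * x for w, x in zip(row, v)) for row in matrix]


# Hypothetical data: 200 random 16-dimensional "audio feature" vectors.
rng = random.Random(42)
features = [[rng.random() for _ in range(16)] for _ in range(200)]
query = [rng.random() for _ in range(16)]

# Exact search in the original feature space.
full_tree = build_kdtree(list(enumerate(features)))
exact = nearest(full_tree, query)

# Approximate search in a 2-dimensional embedding of the same vectors.
project = make_projection(16, 2)
reduced_tree = build_kdtree([(i, project(v)) for i, v in enumerate(features)])
approx = nearest(reduced_tree, project(query))

print("nearest in full space:   ", exact[1])
print("nearest in reduced space:", approx[1])
```

The two searches may return different files. That gap reflects the trade-off the abstract reports: searching a lower-dimensional space can be faster, but the embedding discards information, so the nearest neighbour there is not guaranteed to be the nearest neighbour in the original feature space.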

Abstract [sv] (translated from Swedish)

Digital sound design and music composition often involve searching through large collections of audio files for the right sample. Traditionally, this means either text search via metadata such as filenames and tags, or manual searching through file structures. This report presents a fast, scalable solution in the form of a search engine that allows an audio file to be used for content-based search that finds similar audio files. The presented solution uses visualization tools to speed up retrieval time compared with simple KD-tree algorithms. Qualitative and quantitative results are presented, and the advantages and drawbacks of the solution are discussed. The qualitative results show potential but are judged to be incomplete. The quantitative results show retrieval times that are orders of magnitude shorter when UMAP is used, albeit with reduced accuracy as a consequence, and the solution proves to scale well with larger amounts of data.

Place, publisher, year, edition, pages
2019. p. 61
Series
TRITA-EECS-EX ; 2019:617
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-264514
OAI: oai:DiVA.org:kth-264514
DiVA, id: diva2:1373925
External cooperation
Plan8
Educational program
Master of Science in Engineering - Computer Science and Technology
Supervisors
Examiners
Available from: 2019-11-28 Created: 2019-11-28 Last updated: 2019-11-28 Bibliographically approved

Open Access in DiVA

fulltext (5411 kB), 5 downloads
File information
File name: FULLTEXT01.pdf
File size: 5411 kB
Checksum (SHA-512): ad659ad08e24c16ad529157795981bb51e5f0d90b194c060b6096fdf8572f645996ce9756a88699bb133e333f62a46d985fe4dd07db2d40615b864e0f99b66f8
Type: fulltext
Mimetype: application/pdf
