Scale-space theory for auditory signals
Lindeberg, Tony. KTH, School of Computer Science and Communication (CSC), Computational Biology, CB. ORCID iD: 0000-0002-9081-2170
Friberg, Anders. KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-2926-6518
2015 (English). In: Scale Space and Variational Methods in Computer Vision: 5th International Conference, SSVM 2015, Lège-Cap Ferret, France, May 31 - June 4, 2015, Proceedings / [ed] J.-F. Aujol et al., Springer, 2015, Vol. 9087, 3-15 p. Conference paper, Published paper (Refereed)
Abstract [en]

We show how the axiomatic structure of scale-space theory can be applied to the auditory domain and be used for deriving idealized models of auditory receptive fields via scale-space principles. For defining a time-frequency transformation of a purely temporal signal, it is shown that the scale-space framework allows for a new way of deriving the Gabor and Gammatone filters as well as a novel family of generalized Gammatone filters with additional degrees of freedom to obtain different trade-offs between the spectral selectivity and the temporal delay of time-causal window functions. Applied to the definition of a second layer of receptive fields from the spectrogram, it is shown that the scale-space framework leads to two canonical families of spectro-temporal receptive fields, using a combination of Gaussian filters over the logspectral domain with either Gaussian filters or a cascade of first-order integrators over the temporal domain. These spectro-temporal receptive fields can be either separable over the time-frequency domain or be adapted to local glissando transformations that represent variations in logarithmic frequencies over time. Such idealized models of auditory receptive fields respect auditory invariances, can be used for computing basic auditory features for audio processing and lead to predictions about auditory receptive fields with good qualitative similarity to biological receptive fields in the inferior colliculus (ICC) and the primary auditory cortex (A1).
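The link the abstract draws between a cascade of first-order integrators and the Gammatone filter can be illustrated numerically. The sketch below is not taken from the paper: it assumes the simplest case of n integrators sharing one time constant, and the names `mu`, `n`, `freq_hz`, `cascade_kernel`, `gamma_envelope` and `gammatone` are hypothetical. It relies on the standard result that convolving n truncated exponential kernels exp(-t/mu)/mu yields the envelope t^(n-1) exp(-t/mu) / (mu^n Gamma(n)), which is the envelope of a Gammatone impulse response.

```python
import math
import numpy as np

def exp_kernel(t, mu):
    """Impulse response of one first-order integrator (time-causal, unit area)."""
    return np.exp(-t / mu) / mu

def cascade_kernel(t, mu, n):
    """Numerically convolve n identical first-order integrators."""
    dt = t[1] - t[0]
    h = exp_kernel(t, mu)
    out = h.copy()
    for _ in range(n - 1):
        # Riemann-sum approximation of the continuous convolution,
        # truncated to the original time axis.
        out = np.convolve(out, h)[: len(t)] * dt
    return out

def gamma_envelope(t, mu, n):
    """Closed-form kernel of the cascade: a Gamma-shaped envelope."""
    return t ** (n - 1) * np.exp(-t / mu) / (mu ** n * math.gamma(n))

def gammatone(t, mu, n, freq_hz):
    """Gamma envelope modulated by a cosine carrier (real Gammatone form)."""
    return gamma_envelope(t, mu, n) * np.cos(2 * np.pi * freq_hz * t)

# Illustrative parameters only (assumptions, not values from the paper).
t = np.arange(0.0, 0.05, 1e-5)   # 50 ms at 10 microsecond steps
mu, n = 0.002, 4                 # 2 ms time constant, 4 integrators

numeric = cascade_kernel(t, mu, n)
closed = gamma_envelope(t, mu, n)
max_rel_err = np.max(np.abs(numeric - closed)) / np.max(closed)
area = np.sum(closed) * (t[1] - t[0])
g = gammatone(t, mu, n, freq_hz=1000.0)
```

With these settings the numerically convolved kernel should agree with the closed form up to discretization error, and the envelope integrates to approximately one. Increasing `n` at a fixed `mu` lengthens the temporal delay of the kernel, which gives a feel for the trade-off between spectral selectivity and temporal delay that the abstract mentions.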

Place, publisher, year, edition, pages
Springer, 2015. Vol. 9087, 3-15 p.
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 9087
Keyword [en]
Computation theory, Computer vision, Degrees of freedom (mechanics), Economic and social effects, Frequency domain analysis, Gammatone filters, Inferior colliculus, Log-spectral domain, Logarithmic frequency, Scale-space theory, Spectral selectivity, Time frequency domain, Time-frequency transformation
National Category
Computer Science; Bioinformatics (Computational Biology); Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-160481
DOI: 10.1007/978-3-319-18461-6_1
Scopus ID: 2-s2.0-84931078597
ISBN: 978-3-319-18461-6 (print)
OAI: oai:DiVA.org:kth-160481
DiVA: diva2:789795
Conference
SSVM 2015: Fifth International Conference on Scale Space and Variational Methods in Computer Vision, Lège Cap Ferret, France, 31 May - 4 June, 2015
Funder
Swedish Research Council, 2010-4766, 2012-4685, 2014-4083
EU, FP7, Seventh Framework Programme, FET-Open 618067
Note

QC 20150407

Available from: 2015-02-20. Created: 2015-02-20. Last updated: 2016-04-28. Bibliographically approved.

Open Access in DiVA

fulltext (1385 kB), 99 downloads
File information
File name: FULLTEXT01.pdf
File size: 1385 kB
Checksum (SHA-512): 347d4781492a323129fdcf4e803bafc0e42e986571a645339ef7567928fad5ad1873c5a137d2dc4ec3a830f95bc3f8097f9ef061845c6b4782a37fbfa0814c5c
Type: fulltext
Mimetype: application/pdf

Other links

Publisher's full text; Scopus; At authors' home page. The final publication is available at www.springerlink.com

Search in DiVA

By author/editor
Lindeberg, Tony; Friberg, Anders
By organisation
Computational Biology, CB; Speech, Music and Hearing, TMH
Computer Science; Bioinformatics (Computational Biology); Language Technology (Computational Linguistics)
