  • 1. Almansa, A.
    et al.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Fingerprint enhancement by shape adaptation of scale-space operators with automatic scale selection, 2000. In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 9, no. 12, pp. 2027-2042. Article in journal (Refereed)
    Abstract [en]

    This work presents two mechanisms for processing fingerprint images: shape-adapted smoothing based on second moment descriptors and automatic scale selection based on normalized derivatives. The shape adaptation procedure adapts the smoothing operation to the local ridge structures, which allows interrupted ridges to be joined without destroying essential singularities such as branching points and enforces continuity of their directional fields. The scale selection procedure estimates local ridge width and adapts the amount of smoothing to the local amount of noise. In addition, a ridgeness measure is defined, which reflects how well the local image structure agrees with a qualitative ridge model, and is used for spreading the results of shape adaptation into noisy areas. The combined approach makes it possible to resolve fine scale structures in clear areas while reducing the risk of enhancing noise in blurred or fragmented areas. The result is a reliable and adaptively detailed estimate of the ridge orientation field and ridge width, as well as a smoothed grey-level version of the input image. We propose that these general techniques should be of interest to developers of automatic fingerprint identification systems as well as in other applications of processing related types of imagery.
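    In scale-space terminology, the two mechanisms above are usually summarized as follows; this is a schematic restatement under standard scale-space conventions (with the normalization power set to one), not a verbatim copy of the paper's equations.

```latex
% Scale-normalized derivatives at scale t, used for automatic scale selection:
\partial_{\xi} = \sqrt{t}\,\partial_{x}, \qquad \partial_{\eta} = \sqrt{t}\,\partial_{y},
% e.g. selecting, at every point, the scale at which the normalized Laplacian
% assumes its maximum magnitude over scales:
\hat{t}(x) = \operatorname*{arg\,max}_{t}\, \bigl|\, t\,(L_{xx} + L_{yy})(x;\,t) \,\bigr| .

% Shape-adapted smoothing: the rotationally symmetric Gaussian is replaced by an
% anisotropic Gaussian whose covariance \Sigma is adapted to the local
% second-moment descriptor \mu (schematically \Sigma \propto \mu^{-1}),
% which elongates the smoothing along the local ridge direction.
```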

  • 2. Almansa, Andrés
    et al.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Enhancement of Fingerprint Images by Shape-Adapted Scale-Space Operators, 1996. In: Gaussian Scale-Space Theory. Part I: Proceedings of PhD School on Scale-Space Theory (Copenhagen, Denmark), May 1996 / [ed] J. Sporring, M. Nielsen, L. Florack, and P. Johansen, Springer Science+Business Media B.V., 1996, pp. 21-30. Chapter in book, part of anthology (Refereed)
    Abstract [en]

    This work presents a novel technique for preprocessing fingerprint images. The method is based on the measurements of second moment descriptors and shape adaptation of scale-space operators with automatic scale selection (Lindeberg 1994). This procedure, which has been successfully used in the context of shape-from-texture and shape from disparity gradients, has several advantages when applied to fingerprint image enhancement, as observed by Weickert (1995). For example, it is capable of joining interrupted ridges, and enforces continuity of their directional fields.

    In this work, these abovementioned general ideas are applied and extended in the following ways: Two methods for estimating local ridge width are explored and tuned to the problem of fingerprint enhancement. A ridgeness measure is defined, which reflects how well the local image structure agrees with a qualitative ridge model. This information is used for guiding a scale-selection mechanism, and for spreading the results of shape adaptation into noisy areas.

    The combined approach makes it possible to resolve fine scale structures in clear areas while reducing the risk of enhancing noise in blurred or fragmented areas. To a large extent, the scheme has the desirable property of joining interrupted lines without destroying essential singularities such as branching points. Thus, the result is a reliable and adaptively detailed estimate of the ridge orientation field and ridge width, as well as a smoothed grey-level version of the input image.

    A detailed experimental evaluation is presented, including a comparison with other techniques. We propose that the techniques presented provide mechanisms of interest to developers of automatic fingerprint identification systems.

  • 3. Björkman, Eva
    et al.
    Zagal, Juan Cristobal
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Roland, Per E.
    Evaluation of design options for the scale-space primal sketch analysis of brain activation images, 2000. In: HBM'00, published in Neuroimage, volume 11, number 5, 2000, Vol. 11, pp. 656-656. Conference paper (Refereed)
    Abstract [en]

    A key issue in brain imaging concerns how to detect the functionally activated regions from PET and fMRI images. In earlier work, it has been shown that the scale-space primal sketch provides a useful tool for such analysis [1]. The method includes presmoothing with different filter widths and automatic estimation of the spatial extent of the activated regions (blobs).

    The purpose is to present two modifications of the scale-space primal sketch, as well as a quantitative evaluation which shows that these modifications improve the performance, measured as the separation between blob descriptors extracted from PET images and from noise images. This separation is essential for future work of associating a statistical p-value with the scale-space blob descriptors.

  • 4.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Hand-gesture recognition using multi-scale colour features, hierarchical features and particle filtering, 2002. In: Fifth IEEE International Conference on Automatic Face and Gesture Recognition, 2002. Proceedings, IEEE conference proceedings, 2002, pp. 63-74. Conference paper (Refereed)
    Abstract [en]

    This paper presents algorithms and a prototype system for hand tracking and hand posture recognition. Hand postures are represented in terms of hierarchies of multi-scale colour image features at different scales, with qualitative inter-relations in terms of scale, position and orientation. In each image, detection of multi-scale colour features is performed. Hand states are then simultaneously detected and tracked using particle filtering, with an extension of layered sampling referred to as hierarchical layered sampling. Experiments are presented showing that the performance of the system is substantially improved by performing feature detection in colour space and including a prior with respect to skin colour. These components have been integrated into a real-time prototype system, applied to a test problem of controlling consumer electronics using hand gestures. In a simplified demo scenario, this system has been successfully tested by participants at two fairs during 2001.

  • 5.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Lenman, S.
    Sundblad, Y.
    A Prototype System for Computer Vision Based Human Computer Interaction, 2001. Report (Other academic)
  • 6.
    Bretzner, Lars
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Feature Tracking with Automatic Selection of Spatial Scales, 1998. In: Computer Vision and Image Understanding, ISSN 1077-3142, E-ISSN 1090-235X, Vol. 71, no. 3, pp. 385-393. Article in journal (Refereed)
    Abstract [en]

    When observing a dynamic world, the size of image structures may vary over time. This article emphasizes the need for including explicit mechanisms for automatic scale selection in feature tracking algorithms in order to: (i) adapt the local scale of processing to the local image structure, and (ii) adapt to the size variations that may occur over time. The problems of corner detection and blob detection are treated in detail, and a combined framework for feature tracking is presented. The integrated tracking algorithm overcomes some of the inherent limitations of exposing fixed-scale tracking methods to image sequences in which the size variations are large. It is also shown how the stability over time of scale descriptors can be used as a part of a multi-cue similarity measure for matching. Experiments on real-world sequences are presented showing the performance of the algorithm when applied to (individual) tracking of corners and blobs.

  • 7.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Feature tracking with automatic selection of spatial scales, 1998. Report (Other academic)
    Abstract [en]

    When observing a dynamic world, the size of image structures may vary over time. This article emphasizes the need for including explicit mechanisms for automatic scale selection in feature tracking algorithms in order to: (i) adapt the local scale of processing to the local image structure, and (ii) adapt to the size variations that may occur over time.

    The problems of corner detection and blob detection are treated in detail, and a combined framework for feature tracking is presented in which the image features at every time moment are detected at locally determined and automatically selected scales. A useful property of the scale selection method is that the scale levels selected in the feature detection step reflect the spatial extent of the image structures. Thereby, the integrated tracking algorithm has the ability to adapt to spatial as well as temporal size variations, and can in this way overcome some of the inherent limitations of exposing fixed-scale tracking methods to image sequences in which the size variations are large.

    In the composed tracking procedure, the scale information is used for two additional major purposes: (i) for defining local regions of interest for searching for matching candidates as well as for setting the window size for correlation when evaluating matching candidates, and (ii) the stability over time of the scale and significance descriptors produced by the scale selection procedure is used for formulating a multi-cue similarity measure for matching.

    Experiments on real-world sequences are presented showing the performance of the algorithm when applied to (individual) tracking of corners and blobs. Specifically, comparisons with fixed-scale tracking methods are included as well as illustrations of the increase in performance obtained by using multiple cues in the feature matching step.
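    As a concrete illustration of the scale-selection mechanism used for the blob features in this and the preceding entry, a minimal sketch is given below: blobs are detected as local extrema, over space and scale, of the scale-normalized Laplacian. The scale set, threshold and brute-force extremum search are illustrative choices, not the algorithm of the report.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def scale_selected_blobs(image, sigmas=(1, 2, 4, 8, 16), threshold=0.02):
    """Detect blobs as space-scale extrema of the scale-normalized Laplacian
    t * (Lxx + Lyy), with t = sigma**2; the selected sigma reflects blob size."""
    image = image.astype(float)
    # one scale-normalized Laplacian response per scale level
    stack = np.stack([(s ** 2) * gaussian_laplace(image, s) for s in sigmas])
    blobs = []
    for k in range(1, len(sigmas) - 1):
        for y in range(1, image.shape[0] - 1):
            for x in range(1, image.shape[1] - 1):
                v = stack[k, y, x]
                nbhd = stack[k - 1:k + 2, y - 1:y + 2, x - 1:x + 2]
                # keep points that are extrema over the 3x3x3 space-scale neighbourhood
                if abs(v) > threshold and (v == nbhd.max() or v == nbhd.min()):
                    blobs.append((y, x, sigmas[k]))
    return blobs
```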

  • 8.
    Bretzner, Lars
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    On the handling of spatial and temporal scales in feature tracking, 1997. In: Scale-Space Theory in Computer Vision: First International Conference, Scale-Space'97, Utrecht, The Netherlands, July 2–4, 1997, Proceedings, Springer Berlin/Heidelberg, 1997, pp. 128-139. Conference paper (Refereed)
  • 9.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Qualitative Multi-Scale Feature Hierarchies for Object Tracking, 2000. In: Journal of Visual Communication and Image Representation, ISSN 1047-3203, E-ISSN 1095-9076, Vol. 11, pp. 115-129. Article in journal (Refereed)
    Abstract [en]

    This paper shows how the performance of feature trackers can be improved by building a view-based object representation consisting of qualitative relations between image structures at different scales. The idea is to track all image features individually, and to use the qualitative feature relations for resolving ambiguous matches and for introducing feature hypotheses whenever image features are mismatched or lost. Compared to more traditional work on view-based object tracking, this methodology has the ability to handle semi-rigid objects and partial occlusions. Compared to trackers based on three-dimensional object models, this approach is much simpler and of a more generic nature. A hands-on example is presented showing how an integrated application system can be constructed from conceptually very simple operations.

  • 10.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Qualitative multiscale feature hierarchies for object tracking, 2000. Report (Refereed)
    Abstract [en]

    This paper shows how the performance of feature trackers can be improved by building a hierarchical view-based object representation consisting of qualitative relations between image structures at different scales. The idea is to track all image features individually and to use the qualitative feature relations for avoiding mismatches, for resolving ambiguous matches, and for introducing feature hypotheses whenever image features are lost. Compared to more traditional work on view-based object tracking, this methodology has the ability to handle semirigid objects and partial occlusions. Compared to trackers based on three-dimensional object models, this approach is much simpler and of a more generic nature. A hands-on example is presented showing how an integrated application system can be constructed from conceptually very simple operations.

  • 11.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Qualitative multi-scale feature hierarchies for object tracking, 1999. In: Proc. Scale-Space Theories in Computer Vision, Elsevier, 1999, pp. 117-128. Conference paper (Refereed)
    Abstract [en]

    This paper shows how the performance of feature trackers can be improved by building a view-based object representation consisting of qualitative relations between image structures at different scales. The idea is to track all image features individually, and to use the qualitative feature relations for resolving ambiguous matches and for introducing feature hypotheses whenever image features are mismatched or lost. Compared to more traditional work on view-based object tracking, this methodology has the ability to handle semi-rigid objects and partial occlusions. Compared to trackers based on three-dimensional object models, this approach is much simpler and of a more generic nature. A hands-on example is presented showing how an integrated application system can be constructed from conceptually very simple operations.

  • 12.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Structure and Motion Estimation using Sparse Point and Line Correspondences in Multiple Affine Views, 1999. Report (Other academic)
    Abstract [en]

    This paper addresses the problem of computing three-dimensional structure and motion from an unknown rigid configuration of points and lines viewed by an affine projection model. An algebraic structure, analogous to the trilinear tensor for three perspective cameras, is defined for configurations of three centered affine cameras. This centered affine trifocal tensor contains 12 non-zero coefficients and involves linear relations between point correspondences and trilinear relations between line correspondences. It is shown how the affine trifocal tensor relates to the perspective trilinear tensor, and how three-dimensional motion can be computed from this tensor in a straightforward manner. A factorization approach is developed to handle point features and line features simultaneously in image sequences, and degenerate feature configurations are analysed. This theory is applied to a specific problem in human-computer interaction of capturing three-dimensional rotations from gestures of a human hand. This application to quantitative gesture analysis illustrates the usefulness of the affine trifocal tensor in a situation where sufficient information is not available to compute the perspective trilinear tensor, while the geometry requires point correspondences as well as line correspondences over at least three views.

  • 13.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Use your hand as a 3-D mouse or relative orientation from extended sequences of sparse point and line correspondences using the affine trifocal tensor, 1998. In: Computer Vision — ECCV'98: 5th European Conference on Computer Vision, Freiburg, Germany, June 2–6, 1998, Proceedings, Volume I, Springer Berlin/Heidelberg, 1998, Vol. 1406, pp. 141-157. Conference paper (Refereed)
    Abstract [en]

    This paper addresses the problem of computing three-dimensional structure and motion from an unknown rigid configuration of points and lines viewed by an affine projection model. An algebraic structure, analogous to the trilinear tensor for three perspective cameras, is defined for configurations of three centered affine cameras. This centered affine trifocal tensor contains 12 coefficients and involves linear relations between point correspondences and trilinear relations between line correspondences. It is shown how the affine trifocal tensor relates to the perspective trilinear tensor, and how three-dimensional motion can be computed from this tensor in a straightforward manner. A factorization approach is also developed to handle point features and line features simultaneously in image sequences.

    This theory is applied to a specific problem of human-computer interaction of capturing three-dimensional rotations from gestures of a human hand. A qualitative model is presented, in which three fingers are represented by their position and orientation, and it is shown how three point correspondences (blobs at the finger tips) and three line correspondences (ridge features at the fingers) allow the affine trifocal tensor to be determined, from which the rotation is computed. Besides the obvious application, this test problem illustrates the usefulness of the affine trifocal tensor in a situation where sufficient information is not available to compute the perspective trilinear tensor, while the geometry requires point correspondences as well as line correspondences over at least three views.

  • 14.
    Brunnström, Kjell
    et al.
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Eklundh, Jan-Olof
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    On Scale and Resolution in the Analysis of Local Image Structure, 1990. In: Proc. 1st European Conf. on Computer Vision, 1990, Vol. 427, pp. 3-12. Conference paper (Refereed)
    Abstract [en]

    Focus-of-attention is extremely important in human visual perception. If computer vision systems are to perform tasks in a complex, dynamic world they will have to be able to control processing in a way that is analogous to visual attention in humans.

    In this paper we will investigate problems in connection with foveation, that is examining selected regions of the world at high resolution. We will especially consider the problem of finding and classifying junctions from this aspect. We will show that foveation as simulated by controlled, active zooming in conjunction with scale-space techniques allows robust detection and classification of junctions.

  • 15.
    Brunnström, Kjell
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Eklundh, Jan-Olof
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Scale and Resolution in Active Analysis of Local Image Structure, 1990. In: Image and Vision Computing, Vol. 8, pp. 289-296. Article in journal (Refereed)
    Abstract [en]

    Focus-of-attention is extremely important in human visual perception. If computer vision systems are to perform tasks in a complex, dynamic world they will have to be able to control processing in a way that is analogous to visual attention in humans. Problems connected to foveation (examination of selected regions of the world at high resolution) are examined. In particular, the problem of finding and classifying junctions from this aspect is considered. It is shown that foveation as simulated by controlled, active zooming in conjunction with scale-space techniques allows for robust detection and classification of junctions.

  • 16.
    Brunnström, Kjell
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Eklundh, Jan-Olof
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Active detection and classification of junctions by foveation with a head-eye system guided by the scale-space primal sketch, 1992. In: Computer Vision — ECCV'92: Second European Conference on Computer Vision, Santa Margherita Ligure, Italy, May 19–22, 1992, Proceedings / [ed] Giulio Sandini, Springer Berlin/Heidelberg, 1992, pp. 701-709. Conference paper (Refereed)
    Abstract [en]

    We consider how junction detection and classification can be performed in an active visual system. This is to exemplify that feature detection and classification in general can be done by both simple and robust methods, if the vision system is allowed to look at the world rather than at prerecorded images. We address issues on how to attract the attention to salient local image structures, as well as on how to characterize those.

  • 17.
    Ekeberg, Örjan
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Fransén, Erik
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Hellgren Kotaleski, Jeanette
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Herman, Pawel
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Kumar, Arvind
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Lansner, Anders
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Computational Brain Science at CST, CSC, KTH, 2016. Other (Other academic)
    Abstract [en]

    Mission and Vision - Computational Brain Science Lab at CST, CSC, KTH

    The scientific mission of the Computational Brain Science Lab at CSC is to be at the forefront of mathematical modelling, quantitative analysis and mechanistic understanding of brain function. We perform research on (i) computational modelling of biological brain function and on (ii) developing theory, algorithms and software for building computer systems that can perform brain-like functions. Our research answers scientific questions and develops methods in these fields. We integrate results from our science-driven brain research into our work on brain-like algorithms and likewise use theoretical results about artificial brain-like functions as hypotheses for biological brain research.

    Our research on biological brain function includes sensory perception (vision, hearing, olfaction, pain), cognition (action selection, memory, learning) and motor control at different levels of biological detail (molecular, cellular, network) and mathematical/functional description. Methods development for investigating biological brain function and its dynamics as well as dysfunction comprises biomechanical simulation engines for locomotion and voice, machine learning methods for analysing functional brain images, craniofacial morphology and neuronal multi-scale simulations. Projects are conducted in close collaborations with Karolinska Institutet and Karolinska Hospital in Sweden as well as other laboratories in Europe, U.S., Japan and India.

    Our research on brain-like computing concerns methods development for perceptual systems that extract information from sensory signals (images, video and audio), analysis of functional brain images and EEG data, learning for autonomous agents as well as development of computational architectures (both software and hardware) for neural information processing. Our brain-inspired approach to computing also applies more generically to other computer science problems such as pattern recognition, data analysis and intelligent systems. Recent industrial collaborations include analysis of patient brain data with MentisCura and the startup company 13 Lab bought by Facebook.

    Our long term vision is to contribute to (i) deeper understanding of the computational mechanisms underlying biological brain function and (ii) better theories, methods and algorithms for perceptual and intelligent systems that perform artificial brain-like functions by (iii) performing interdisciplinary and cross-fertilizing research on both biological and artificial brain-like functions. 

    On one hand, biological brains provide existence proofs for guiding our research on artificial perceptual and intelligent systems. On the other hand, applying Richard Feynman’s famous statement ”What I cannot create I do not understand” to brain science implies that we can only claim to fully understand the computational mechanisms underlying biological brain function if we can build and implement corresponding computational mechanisms on a computerized system that performs similar brain-like functions.

  • 18.
    Friberg, Anders
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Tal, musik och hörsel, TMH.
    Lindeberg, Tony
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Hellwagner, Martin
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Tal, musik och hörsel, TMH.
    Helgason, Pétur
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Tal, musik och hörsel, TMH.
    Salomão, Gláucia Laís
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Tal, musik och hörsel, TMH.
    Elovsson, Anders
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Tal, musik och hörsel, TMH.
    Lemaitre, Guillaume
    Institute for Research and Coordination in Acoustics and Music, Paris, France.
    Ternström, Sten
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Tal, musik och hörsel, TMH.
    Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields, 2018. In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 144, no. 3, pp. 1467-1483. Article in journal (Refereed)
    Abstract [en]

    Vocal sound imitations provide a new challenge for understanding the coupling between articulatory mechanisms and the resulting audio. In this study, we have modeled the classification of three articulatory categories, phonation, supraglottal myoelastic vibrations, and turbulence from audio recordings. Two data sets were assembled, consisting of different vocal imitations by four professional imitators and four non-professional speakers in two different experiments. The audio data were manually annotated by two experienced phoneticians using a detailed articulatory description scheme. A separate set of audio features was developed specifically for each category using both time-domain and spectral methods. For all time-frequency transformations, and for some secondary processing, the recently developed Auditory Receptive Fields Toolbox was used. Three different machine learning methods were applied for predicting the final articulatory categories. The result with the best generalization was found using an ensemble of multilayer perceptrons. The cross-validated classification accuracy was 96.8 % for phonation, 90.8 % for supraglottal myoelastic vibrations, and 89.0 % for turbulence using all the 84 developed features. A final feature reduction to 22 features yielded similar results.

  • 19.
    Gårding, Jonas
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    CanApp: The Candela Application Library, 1989. Report (Other academic)
    Abstract [en]

    This paper describes CanApp, the Candela Application Library. CanApp is a software package for image processing and image analysis. Most of the subroutines in CanApp are available both as stand-alone programs and C subroutines.

    CanApp currently comprises some 50 programs and 75 subroutines, and these numbers are expected to grow continuously as a result of joint efforts of the members of the CVAP group at the Royal Institute of Technology in Stockholm.

    CanApp is currently installed and running under UNIX on Sun workstations.

  • 20.
    Gårding, Jonas
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Direct computation of shape cues using scale-adapted spatial derivative operators, 1996. In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 17, no. 2, pp. 163-191. Article in journal (Refereed)
    Abstract [en]

    This paper addresses the problem of computing cues to the three-dimensional structure of surfaces in the world directly from the local structure of the brightness pattern of either a single monocular image or a binocular image pair. It is shown that starting from Gaussian derivatives of order up to two at a range of scales in scale-space, local estimates of (i) surface orientation from monocular texture foreshortening, (ii) surface orientation from monocular texture gradients, and (iii) surface orientation from the binocular disparity gradient can be computed without iteration or search, and by using essentially the same basic mechanism. The methodology is based on a multi-scale descriptor of image structure called the windowed second moment matrix, which is computed with adaptive selection of both scale levels and spatial positions. Notably, this descriptor comprises two scale parameters: a local scale parameter describing the amount of smoothing used in derivative computations, and an integration scale parameter determining over how large a region in space the statistics of regional descriptors is accumulated. Experimental results for both synthetic and natural images are presented, and the relation with models of biological vision is briefly discussed.
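    Written out, the windowed second moment matrix referred to above, with its two scale parameters (local scale t for the derivative computations and integration scale s for the window), takes the following standard form.

```latex
% Scale-space representation of the image f at local scale t:
L(\cdot;\,t) = g(\cdot;\,t) * f, \qquad
g(x;\,t) = \frac{1}{2\pi t}\, e^{-\|x\|^{2}/(2t)},

% windowed second moment matrix accumulated with a Gaussian window at
% integration scale s:
\mu(x;\,t,s) = \int_{\xi \in \mathbb{R}^{2}}
  (\nabla L)(\xi;\,t)\,(\nabla L)^{T}(\xi;\,t)\; g(x-\xi;\,s)\, d\xi
  = \begin{pmatrix}
      \langle L_{x}^{2} \rangle & \langle L_{x} L_{y} \rangle \\
      \langle L_{x} L_{y} \rangle & \langle L_{y}^{2} \rangle
    \end{pmatrix}.
```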

  • 21.
    Gårding, Jonas
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Direct estimation of local surface shape in a fixating binocular vision system, 1994. In: Computer Vision — ECCV '94: Third European Conference on Computer Vision, Stockholm, Sweden, May 2–6, 1994, Proceedings, Volume I, Springer Berlin/Heidelberg, 1994, pp. 365-376. Conference paper (Refereed)
    Abstract [en]

    This paper addresses the problem of computing cues to the three-dimensional structure of surfaces in the world directly from the local structure of the brightness pattern of a binocular image pair. The geometric information content of the gradient of binocular disparity is analyzed for the general case of a fixating vision system with symmetric or asymmetric vergence, and with either known or unknown viewing geometry. A computationally inexpensive technique which exploits this analysis is proposed. This technique allows a local estimate of surface orientation to be computed directly from the local statistics of the left and right image brightness gradients, without iterations or search. The viability of the approach is demonstrated with experimental results for both synthetic and natural gray-level images.

  • 22.
    Jansson, Ylva
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields, 2017. Report (Other academic)
    Abstract [en]

    This work presents a first evaluation of using spatio-temporal receptive fields from a recently proposed time-causal spatio-temporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition.

    The experimental evaluation demonstrates competitive performance compared to the state of the art. In particular, it is shown that binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives.
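    To give a flavour of the descriptor family referred to above (joint histograms of binarized receptive field responses), the sketch below computes such a histogram over a video volume. It uses ordinary, non-causal Gaussian derivatives from scipy as stand-ins; the paper's actual primitives are time-causal and time-recursive receptive fields, which are not reproduced here.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def binary_joint_histogram(video, sigma=2.0):
    """Illustrative joint binary histogram of derivative responses over a video
    volume with axes (y, x, t); returns a normalised descriptor vector."""
    video = video.astype(float)
    # first-order Gaussian derivatives along y, x and t (non-causal stand-ins)
    responses = [gaussian_filter(video, sigma, order=o)
                 for o in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]]
    # binarize each response at zero and pack the bits into one joint code per voxel
    codes = np.zeros(video.shape, dtype=int)
    for bit, r in enumerate(responses):
        codes += (r > 0).astype(int) << bit
    hist = np.bincount(codes.ravel(), minlength=2 ** len(responses)).astype(float)
    return hist / hist.sum()   # e.g. compared between videos with a chi-squared distance
```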

  • 23.
    Jansson, Ylva
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Lindeberg, Tony
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields, 2018. In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 60, no. 9, pp. 1369-1398. Article in journal (Refereed)
    Abstract [en]

    This work presents a first evaluation of using spatio-temporal receptive fields from a recently proposed time-causal spatiotemporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition. The experimental evaluation demonstrates competitive performance compared to state of the art. In particular, it is shown that binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives.

  • 24.
    Jansson, Ylva
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Dynamic texture recognition using time-causal spatio-temporal scale-space filters, 2017. In: Scale Space and Variational Methods in Computer Vision, Springer, 2017, Vol. 10302, pp. 16-28. Conference paper (Refereed)
    Abstract [en]

    This work presents an evaluation of using time-causal scale-space filters as primitives for video analysis. For this purpose, we present a new family of video descriptors based on regional statistics of spatiotemporal scale-space filter responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain. We evaluate one member in this family, constituting a joint binary histogram, on two widely used dynamic texture databases. The experimental evaluation shows competitive performance compared to previous methods for dynamic texture recognition, especially on the more complex DynTex database. These results support the descriptive power of time-causal spatio-temporal scale-space filters as primitives for video analysis.

  • 25. Laptev, I.
    et al.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    A distance measure and a feature likelihood map concept for scale-invariant model matching, 2003. Report (Refereed)
    Abstract [en]

    This paper presents two approaches for evaluating multi-scale feature-based object models. Within the first approach, a scale-invariant distance measure is proposed for comparing two image representations in terms of multi-scale features. Based on this measure, the maximisation of the likelihood of parameterised feature models allows for simultaneous model selection and parameter estimation. The idea of the second approach is to avoid an explicit feature extraction step and to evaluate models using a function defined directly from the image data. For this purpose, we propose the concept of a feature likelihood map, which is a function normalised to the interval [0, 1], and that approximates the likelihood of image features at all points in scale-space. To illustrate the applicability of both methods, we consider the area of hand gesture analysis and show how the proposed evaluation schemes can be integrated within a particle filtering approach for performing simultaneous tracking and recognition of hand models under variations in the position, orientation, size and posture of the hand. The experiments demonstrate the feasibility of the approach, and that real time performance can be obtained by pyramid implementations of the proposed concepts.

  • 26.
    Laptev, Ivan
    et al.
    IRISA/INRIA.
    Caputo, Barbara
    Schüldt, Christian
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Local velocity-adapted motion events for spatio-temporal recognition, 2007. In: Computer Vision and Image Understanding, ISSN 1077-3142, E-ISSN 1090-235X, Vol. 108, no. 3, pp. 207-229. Article in journal (Refereed)
    Abstract [en]

    In this paper, we address the problem of motion recognition using event-based local motion representations. We assume that similar patterns of motion contain similar events with consistent motion across image sequences. Using this assumption, we formulate the problem of motion recognition as a matching of corresponding events in image sequences. To enable the matching, we present and evaluate a set of motion descriptors that exploit the spatial and the temporal coherence of motion measurements between corresponding events in image sequences. As the motion measurements may depend on the relative motion of the camera, we also present a mechanism for local velocity adaptation of events and evaluate its influence when recognizing image sequences subjected to different camera motions. When recognizing motion patterns, we compare the performance of a nearest neighbor (NN) classifier with the performance of a support vector machine (SVM). We also compare event-based motion representations to motion representations in terms of global histograms. A systematic experimental evaluation on a large video database with human actions demonstrates that (i) local spatio-temporal image descriptors can be defined to carry important information of space-time events for subsequent recognition, and that (ii) local velocity adaptation is an important mechanism in situations when the relative motion between the camera and the interesting events in the scene is unknown. The particular advantage of event-based representations and velocity adaptation is further emphasized when recognizing human actions in unconstrained scenes with complex and non-stationary backgrounds.

  • 27.
    Laptev, Ivan
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    A Distance Measure and a Feature Likelihood Map Concept for Scale-Invariant Model Matching, 2003. In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 52, no. 2, pp. 97-120. Article in journal (Refereed)
    Abstract [en]

    This paper presents two approaches for evaluating multi-scale feature-based object models. Within the first approach, a scale-invariant distance measure is proposed for comparing two image representations in terms of multi-scale features. Based on this measure, the maximisation of the likelihood of parameterised feature models allows for simultaneous model selection and parameter estimation.

    The idea of the second approach is to avoid an explicit feature extraction step and to evaluate models using a function defined directly from the image data. For this purpose, we propose the concept of a feature likelihood map, which is a function normalised to the interval [0, 1], and that approximates the likelihood of image features at all points in scale-space.

    To illustrate the applicability of both methods, we consider the area of hand gesture analysis and show how the proposed evaluation schemes can be integrated within a particle filtering approach for performing simultaneous tracking and recognition of hand models under variations in the position, orientation, size and posture of the hand. The experiments demonstrate the feasibility of the approach, and that real time performance can be obtained by pyramid implementations of the proposed concepts.

  • 28.
    Laptev, Ivan
    et al.
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    A multi-scale feature likelihood map for direct evaluation of object hypotheses, 2001. In: Proc. Scale-Space and Morphology in Computer Vision, Springer Berlin/Heidelberg, 2001, Vol. 2106, pp. 98-110. Conference paper (Refereed)
    Abstract [en]

    This paper develops and investigates a new approach for evaluating feature based object hypotheses in a direct way. The idea is to compute a feature likelihood map (FLM), which is a function normalized to the interval [0, 1], and which approximates the likelihood of image features at all points in scale-space. In our case, the FLM is defined from Gaussian derivative operators and in such a way that it assumes its strongest responses near the centers of symmetric blob-like or elongated ridge-like structures and at scales that reflect the size of these structures in the image domain. While the FLM inherits several advantages of feature based image representations, it also (i) avoids the need for explicit search when matching features in object models to image data, and (ii) eliminates the need for thresholds present in most traditional feature based approaches. In an application presented in this paper, the FLM is applied to simultaneous tracking and recognition of hand models based on particle filtering. The experiments demonstrate the feasibility of the approach, and that real time performance can be obtained by a pyramid implementation of the proposed concept.

  • 29.
    Laptev, Ivan
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Interest point detection and scale selection in space-time, 2003. In: Scale Space Methods in Computer Vision: 4th International Conference, Scale Space 2003, Isle of Skye, UK, June 10–12, 2003, Proceedings, Springer Berlin/Heidelberg, 2003, Vol. 2695, pp. 372-387. Conference paper (Refereed)
    Abstract [en]

    Several types of interest point detectors have been proposed for spatial images. This paper investigates how this notion can be generalised to the detection of interesting events in space-time data. Moreover, we develop a mechanism for spatio-temporal scale selection and detect events at scales corresponding to their extent in both space and time. To detect spatio-temporal events, we build on the idea of the Harris and Forstner interest point operators and detect regions in space-time where the image structures have significant local variations in both space and time. In this way, events that correspond to curved space-time structures are emphasised, while structures with locally constant motion are disregarded. To construct this operator, we start from a multi-scale windowed second moment matrix in space-time, and combine the determinant and the trace in a similar way as for the spatial Harris operator. All space-time maxima of this operator are then adapted to characteristic scales by maximising a scale-normalised space-time Laplacian operator over both spatial scales and temporal scales. The motivation for performing temporal scale selection as a complement to previous approaches of spatial scale selection is to be able to robustly capture spatio-temporal events of different temporal extent. It is shown that the resulting approach is truly scale invariant with respect to both spatial scales and temporal scales. The proposed concept is tested on synthetic and real image sequences. It is shown that the operator responds to distinct and stable points in space-time that often correspond to interesting events. The potential applications of the method are discussed.
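    For orientation, the operator described above is commonly written in the following form; this is a schematic summary, and the scale normalization factors and the precise value of the constant k follow the paper and are omitted here.

```latex
% Spatio-temporal gradient of the scale-space representation L(x, y, t):
\nabla L = (L_{x},\, L_{y},\, L_{t})^{T},

% second-moment matrix integrated with a space-time Gaussian window at
% spatial scale \sigma_i^2 and temporal scale \tau_i^2:
\mu = g(\cdot;\,\sigma_i^{2},\tau_i^{2}) * \bigl((\nabla L)(\nabla L)^{T}\bigr),

% Harris-type interest operator combining determinant and trace (the trace
% enters with a cubic power for this 3x3 matrix), maximized over space-time:
H = \det(\mu) - k\,\operatorname{trace}^{3}(\mu).
```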

  • 30.
    Laptev, Ivan
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Local descriptors for spatio-temporal recognition, 2006. In: Spatial Coherence For Visual Motion Analysis: First International Workshop, SCVMA 2004, Prague, Czech Republic, May 15, 2004, Revised Papers / [ed] MacLean, WJ, Springer Berlin/Heidelberg, 2006, Vol. 3667, pp. 91-103. Conference paper (Refereed)
    Abstract [en]

    This paper presents and investigates a set of local space-time descriptors for representing and recognizing motion patterns in video. Following the idea of local features in the spatial domain, we use the notion of space-time interest points and represent video data in terms of local space-time events. To describe such events, we define several types of image descriptors over local spatio-temporal neighborhoods and evaluate these descriptors in the context of recognizing human activities. In particular, we compare motion representations in terms of spatio-temporal jets, position dependent histograms, position independent histograms, and principal component analysis computed for either spatio-temporal gradients or optic flow. An experimental evaluation on a video database with human actions shows that high classification performance can be achieved, and that there is a clear advantage of using local position dependent histograms, consistent with previously reported findings regarding spatial recognition.

  • 31.
    Laptev, Ivan
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    On Space-Time Interest Points, 2003. Report (Other academic)
    Abstract [en]

    Local image features or interest points provide compact and abstract representations of patterns in an image. In this paper, we extend the notion of spatial interest points into the spatio-temporal domain and show how the resulting features capture interesting events in video and can be used for a compact representation and for interpretation of video data.

    To detect spatio-temporal events, we build on the idea of the Harris and Forstner interest point operators and detect local structures in space-time where the image values have significant local variations in both space and time. We estimate the spatio-temporal extents of the detected events by maximizing a normalized spatio-temporal Laplacian operator over spatial and temporal scales. To represent the detected events we then compute local, spatio-temporal, scale-invariant N-jets and classify each event with respect to its jet descriptor. For the problem of human motion analysis, we illustrate how video representation in terms of local space-time features allows for detection of walking people in scenes with occlusions and dynamic cluttered backgrounds.

  • 32.
    Laptev, Ivan
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Space-time interest points, 2003. In: Proceedings of Ninth IEEE International Conference on Computer Vision, 2003: ICCV'03, IEEE conference proceedings, 2003, pp. 432-439. Conference paper (Refereed)
    Abstract [en]

    Local image features or interest points provide compact and abstract representations of patterns in an image. We propose to extend the notion of spatial interest points into the spatio-temporal domain and show how the resulting features often reflect interesting events that can be used for a compact representation of video data as well as for its interpretation. To detect spatio-temporal events, we build on the idea of the Harris and Forstner interest point operators and detect local structures in space-time where the image values have significant local variations in both space and time. We then estimate the spatio-temporal extents of the detected events and compute their scale-invariant spatio-temporal descriptors. Using such descriptors, we classify events and construct video representation in terms of labeled space-time points. For the problem of human motion analysis, we illustrate how the proposed method allows for detection of walking people in scenes with occlusions and dynamic backgrounds.

  • 33.
    Laptev, Ivan
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Tracking of multi-state hand models using particle filtering and a hierarchy of multi-scale image features, 2001. Report (Refereed)
    Abstract [en]

    This paper presents an approach for simultaneous tracking and recognition of hierarchical object representations in terms of multiscale image features. A scale-invariant dissimilarity measure is proposed for comparing scale-space features at different positions and scales. Based on this measure, the likelihood of hierarchical, parameterized models can be evaluated in such a way that maximization of the measure over different models and their parameters allows for both model selection and parameter estimation. Then, within the framework of particle filtering, we consider the area of hand gesture analysis, and present a method for simultaneous tracking and recognition of hand models under variations in the position, orientation, size and posture of the hand. In this way, qualitative hand states and quantitative hand motions can be captured, and be used for controlling different types of computerised equipment.

  • 34.
    Laptev, Ivan
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Tracking of multi-state hand models using particle filtering and a hierarchy of multi-scale image features, 2001. In: Scale-Space and Morphology in Computer Vision: Third International Conference, Scale-Space 2001, Vancouver, Canada, July 7–8, 2001, Proceedings, Springer Berlin/Heidelberg, 2001, Vol. 2106, pp. 63-74. Conference paper (Refereed)
    Abstract [en]

    This paper presents an approach for simultaneous tracking and recognition of hierarchical object representations in terms of multiscale image features. A scale-invariant dissimilarity measure is proposed for comparing scale-space features at different positions and scales. Based on this measure, the likelihood of hierarchical, parameterized models can be evaluated in such a way that maximization of the measure over different models and their parameters allows for both model selection and parameter estimation. Then, within the framework of particle filtering, we consider the area of hand gesture analysis, and present a method for simultaneous tracking and recognition of hand models under variations in the position, orientation, size and posture of the hand. In this way, qualitative hand states and quantitative hand motions can be captured, and be used for controlling different types of computerised equipment.
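    Since both this and the preceding report entry build on particle filtering, a minimal sketch of one sampling-importance-resampling step is included below for orientation. It is a generic SIR step: the hierarchical hand models, layered sampling and likelihood evaluation of the papers are not reproduced, and dynamics and likelihood are placeholder functions supplied by the caller.

```python
import numpy as np

def particle_filter_step(particles, weights, dynamics, likelihood, rng):
    """One predict-weight-resample step of a generic SIR particle filter.

    particles  : (N, D) array of state hypotheses (e.g. position, scale, orientation)
    weights    : (N,) array of current particle weights
    dynamics   : function(particles, rng) -> predicted particles (motion model + noise)
    likelihood : function(particles) -> unnormalised image likelihoods, shape (N,)
    """
    particles = dynamics(particles, rng)           # predict
    weights = weights * likelihood(particles)      # weight by image evidence
    weights = weights / weights.sum()
    # systematic resampling keeps the particle set focused on likely states
    positions = (rng.random() + np.arange(len(weights))) / len(weights)
    indices = np.searchsorted(np.cumsum(weights), positions)
    particles = particles[indices]
    weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights
```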

  • 35.
    Laptev, Ivan
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Velocity adaptation of space-time interest points, 2004. In: Proceedings of the 17th International Conference on Pattern Recognition, 2004, ICPR 2004 / [ed] Kittler, J; Petrou, M; Nixon, M, IEEE conference proceedings, 2004, pp. 52-56. Conference paper (Refereed)
    Abstract [en]

    The notion of local features in space-time has recently been proposed to capture and describe local events in video. When computing space-time descriptors, however, the result may strongly depend on the relative motion between the object and the camera. To compensate for this variation, we present a method that automatically adapts the features to the local velocity of the image pattern and, hence, results in a video representation that is stable with respect to different amounts of camera motion. Experimentally we show that the use of velocity adaptation substantially increases the repeatability of interest points as well as the stability of their associated descriptors. Moreover for an application to human action recognition we demonstrate how velocity adapted features enable recognition of human actions in situations with unknown camera motion and complex, nonstationary backgrounds.
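
    As a hedged summary of the mechanism, stated in generic notation rather than the paper's own, velocity adaptation amounts to shearing the spatio-temporal smoothing kernel along an estimated image velocity $v = (v_x, v_y)$,

    $$g_v(x, y, t) \;=\; g(x - v_x t,\; y - v_y t,\; t),$$

    so that descriptors computed with $v$ matched to the locally estimated velocity transform consistently under a constant relative motion (a Galilean transformation) between the camera and the scene, which is what makes the resulting representation stable with respect to camera motion.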

  • 36.
    Laptev, Ivan
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Velocity adaptation of spatio-temporal receptive fields for direct recognition of activities: an experimental study2004Inngår i: Image and Vision Computing, ISSN 0262-8856, E-ISSN 1872-8138, Vol. 22, nr 2, s. 105-116Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

    This article presents an experimental study of the influence of velocity adaptation when recognizing spatio-temporal patterns using a histogram-based statistical framework. The basic idea consists of adapting the shapes of the filter kernels to the local direction of motion, so as to allow the computation of image descriptors that are invariant to the relative motion in the image plane between the camera and the objects or events that are studied. Based on a framework of recursive spatio-temporal scale-space, we first outline how a straightforward mechanism for local velocity adaptation can be expressed. Then, for a test problem of recognizing activities, we present an experimental evaluation, which shows the advantages of using velocity-adapted spatio-temporal receptive fields, compared to directional derivatives or regular partial derivatives for which the filter kernels have not been adapted to the local image motion.

  • 37.
    Laptev, Ivan
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Velocity-adapted spatio-temporal receptive fields for direct recognition of activities2002Inngår i: Proc. ECCV’02 Workshop on Statistical Methods in Video Processing, 2002, s. 61-66Konferansepaper (Fagfellevurdert)
    Abstract [en]

    This article presents an experimental study of the influence of velocity adaptation when recognizing spatio-temporal patterns using a histogram-based statistical framework. The basic idea consists of adapting the shapes of the filter kernels to the local direction of motion, so as to allow the computation of image descriptors that are invariant to the relative motion in the image plane between the camera and the objects or events that are studied. Based on a framework of recursive spatio-temporal scale-space, we first outline how a straightforward mechanism for local velocity adaptation can be expressed. Then, for a test problem of recognizing activities, we present an experimental evaluation, which shows the advantages of using velocity-adapted spatio-temporal receptive fields, compared to directional derivatives or regular partial derivatives for which the filter kernels have not been adapted to the local image motion.

  • 38.
    Laptev, Ivan
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Mayer, H.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Eckstein, W.
    Steger, C.
    Baumgartner, A.
    Automatic extraction of roads from aerial images based on scale space and snakes2000Inngår i: Machine Vision and Applications, ISSN 0932-8092, E-ISSN 1432-1769, Vol. 12, nr 1, s. 23-31Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

    We propose a new approach for automatic road extraction from aerial imagery with a model and a strategy mainly based on the multi-scale detection of roads in combination with geometry-constrained edge extraction using snakes. A main advantage of our approach is that it allows, for the first time, bridging of shadows and partially occluded areas using the heavily disturbed evidence in the image. Additionally, it has only a few parameters to be adjusted. The road network is constructed after extracting crossings with varying shape and topology. We show the feasibility of the approach not only by presenting reasonable results but also by evaluating them quantitatively based on ground truth.
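
    For context, "snakes" refers to the standard active-contour model; in schematic form (the exact energy terms and weights used in the paper may differ), a road-border candidate $\mathbf{c}(s)$ is obtained by minimizing

    $$E(\mathbf{c}) \;=\; \int_0^1 \Big( \alpha\,\|\mathbf{c}'(s)\|^2 + \beta\,\|\mathbf{c}''(s)\|^2 \Big)\, ds \;-\; \lambda \int_0^1 P\big(\mathbf{c}(s)\big)\, ds,$$

    where the first integral penalizes stretching and bending of the contour and $P$ is an image-derived potential measuring evidence for road edges; the internal terms are what allow the contour to be carried across shadowed or partially occluded segments where the local image evidence is weak.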

  • 39.
    Linde, Oskar
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Composed Complex-Cue Histograms: An Investigation of the Information Content in Receptive Field Based Image Descriptors for Object Recognition2012Inngår i: Computer Vision and Image Understanding, ISSN 1077-3142, E-ISSN 1090-235X, Vol. 116, nr 4, s. 538-560Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

    Recent work has shown that effective methods for recognizing objects and spatio-temporal events can be constructed based on histograms of receptive field like image operations.

    This paper presents the results of an extensive study of the performance of different types of receptive field like image descriptors for histogram-based object recognition, based on different combinations of image cues in terms of Gaussian derivatives or differential invariants applied to either intensity information, colour-opponent channels or both. A rich set of composed complex-cue image descriptors is introduced and evaluated with respect to the problems of (i) recognizing previously seen object instances from previously unseen views, and (ii) classifying previously unseen objects into visual categories.

    It is shown that there exist novel histogram descriptors with significantly better recognition performance compared to previously used histogram features within the same class. Specifically, the experiments show that it is possible to obtain more discriminative features by combining lower-dimensional scale-space features into composed complex-cue histograms. Furthermore, different types of image descriptors have different relative advantages with respect to the problems of object instance recognition vs. object category classification. These conclusions are obtained from extensive experimental evaluations on two mutually independent data sets.

    For the task of recognizing specific object instances, combined histograms of spatial and spatio-chromatic derivatives are highly discriminative, and several image descriptors in terms of rotationally invariant (intensity and spatio-chromatic) differential invariants up to order two lead to very high recognition rates.

    For the task of category classification, primary information is contained in both first- and second-order derivatives, where second-order partial derivatives constitute the most discriminative cue.

    Dimensionality reduction by principal component analysis and variance normalization prior to training and recognition can in many cases lead to a significant increase in recognition or classification performance. Surprisingly high recognition rates can even be obtained with binary histograms that reveal the polarity of local scale-space features, and which can be expected to be particularly robust to illumination variations.

    An overall conclusion from this study is that compared to previously used lower-dimensional histograms, the use of composed complex-cue histograms of higher dimensionality reveals the co-variation of multiple cues and enables much better recognition performance, both with regard to the problems of recognizing previously seen objects from novel views and for classifying previously unseen objects into visual categories.
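
    The construction of a composed complex-cue histogram can be sketched as follows; this is a minimal, hedged illustration (the specific cues, scales and bin counts used in the paper are not reproduced here):

    ```python
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def composed_cue_histogram(rgb, sigma=2.0, bins=8):
        """rgb: (H, W, 3) float image; returns a flattened joint histogram."""
        intensity = rgb.mean(axis=2)
        red_green = rgb[..., 0] - rgb[..., 1]      # a simple colour-opponent channel
        cues = []
        for channel in (intensity, red_green):
            cues.append(gaussian_filter(channel, sigma, order=(0, 1)))  # d/dx
            cues.append(gaussian_filter(channel, sigma, order=(1, 0)))  # d/dy
        # Quantize each cue and accumulate one joint (composed) histogram.
        quantized = np.stack(
            [np.digitize(c, np.linspace(c.min(), c.max(), bins - 1)) for c in cues],
            axis=-1).reshape(-1, len(cues))
        hist, _ = np.histogramdd(quantized, bins=bins)
        return hist.ravel() / hist.sum()
    ```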

  • 40.
    Linde, Oskar
    et al.
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Object recognition using composed receptive field histograms of higher dimensionality2004Inngår i: Proceedings of the 17th International Conference on Pattern Recognition / [ed] Kittler, J; Petrou, M; Nixon, M, IEEE conference proceedings, 2004, s. 1-6Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Recent work has shown that effective methods for recognising objects or spatio-temporal events can be constructed based on receptive field responses summarised into histograms or other histogram-like image descriptors. This paper presents a set of composed histogram features of higher dimensionality, which give significantly better recognition performance compared to the histogram descriptors of lower dimensionality that were used in the original papers by Swain & Ballard (1991) or Schiele & Crowley (2000). The use of histograms of higher dimensionality is made possible by a sparse representation for efficient computation and handling of higher-dimensional histograms. Results of extensive experiments are reported, showing how the performance of histogram-based recognition schemes depends upon different combinations of cues, in terms of Gaussian derivatives or differential invariants applied to either intensity information, chromatic information or both. It is shown that there exist composed higher-dimensional histogram descriptors with much better performance for recognising known objects than previously used histogram features. Experiments are also reported on classifying unknown objects into visual categories.
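
    The sparse-representation idea mentioned above can be sketched as follows (a hedged illustration: only the occupied cells of the high-dimensional histogram are stored, keyed by the quantized cue vector; the histogram-intersection similarity shown is one common choice, not necessarily the one used in the paper):

    ```python
    from collections import Counter

    def sparse_histogram(quantized_cues):
        """quantized_cues: iterable of per-pixel tuples of bin indices."""
        counts = Counter(map(tuple, quantized_cues))
        total = sum(counts.values())
        return {cell: n / total for cell, n in counts.items()}

    def histogram_intersection(h1, h2):
        """Similarity between two sparse, normalized histograms."""
        return sum(min(p, h2.get(cell, 0.0)) for cell, p in h1.items())
    ```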

  • 41.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    A computational theory of visual receptive fields2013Inngår i: Biological Cybernetics, ISSN 0340-1200, E-ISSN 1432-0770, Vol. 107, nr 6, s. 589-635Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

    A receptive field constitutes a region in the visual field where a visual cell or a visual operator responds to visual stimuli. This paper presents a theory for what types of receptive field profiles can be regarded as natural for an idealized vision system, given a set of structural requirements on the first stages of visual processing that reflect symmetry properties of the surrounding world.

    These symmetry properties include (i) covariance properties under scale changes, affine image deformations, and Galilean transformations of space–time as occur for real-world image data as well as specific requirements of (ii) temporal causality implying that the future cannot be accessed and (iii) a time-recursive updating mechanism of a limited temporal buffer of the past as is necessary for a genuine real-time system. Fundamental structural requirements are also imposed to ensure (iv) mutual consistency and a proper handling of internal representations at different spatial and temporal scales.

    It is shown how a set of families of idealized receptive field profiles can be derived by necessity regarding spatial, spatio-chromatic, and spatio-temporal receptive fields in terms of Gaussian kernels, Gaussian derivatives, or closely related operators. Such image filters have been successfully used as a basis for expressing a large number of visual operations in computer vision, regarding feature detection, feature classification, motion estimation, object recognition, spatio-temporal recognition, and shape estimation. Hence, the associated so-called scale-space theory constitutes a both theoretically well-founded and general framework for expressing visual operations.

    There are very close similarities between receptive field profiles predicted from this scale-space theory and receptive field profiles found by cell recordings in biological vision. Among the family of receptive field profiles derived by necessity from the assumptions, idealized models with very good qualitative agreement are obtained for (i) spatial on-center/off-surround and off-center/on-surround receptive fields in the fovea and the LGN, (ii) simple cells with spatial directional preference in V1, (iii) spatio-chromatic double-opponent neurons in V1, (iv) space–time separable spatio-temporal receptive fields in the LGN and V1, and (v) non-separable space–time tilted receptive fields in V1, all within the same unified theory. In addition, the paper presents a more general framework for relating and interpreting these receptive fields conceptually and possibly predicting new receptive field profiles as well as for pre-wiring covariance under scaling, affine, and Galilean transformations into the representations of visual stimuli.

    This paper describes the basic structure of the necessity results concerning receptive field profiles regarding the mathematical foundation of the theory and outlines how the proposed theory could be used in further studies and modelling of biological vision. It is also shown how receptive field responses can be interpreted physically, as the superposition of relative variations of surface structure and illumination variations, given a logarithmic brightness scale, and how receptive field measurements will be invariant under multiplicative illumination variations and exposure control mechanisms.
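
    As a condensed illustration of the kernel family referred to above (a summary of the purely spatial case, not the paper's full derivation), the idealized spatial receptive fields are affine Gaussian kernels and their directional derivatives,

    $$g(x;\, \Sigma) \;=\; \frac{1}{2\pi\sqrt{\det \Sigma}}\, \exp\!\Big(-\tfrac{1}{2}\, x^{\mathsf T} \Sigma^{-1} x\Big), \qquad x \in \mathbb{R}^2, \qquad \partial_{\phi}^{m}\, g(x;\, \Sigma),$$

    where the covariance matrix $\Sigma$ spans scale and orientation/elongation preferences; spatio-chromatic receptive fields arise by applying such operators to colour-opponent channels, and spatio-temporal receptive fields by combining them with temporal smoothing kernels (time-causal and time-recursive in the genuine real-time case), possibly tilted along an image velocity.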

  • 42.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    A framework for invariant visual operations based on receptive field responses2013Inngår i: SSVM 2013: Fourth International Conference on Scale Space and Variational Methods in Computer Vision, June 2-6, Schloss Seggau, Graz region, Austria: Invited keynote address / [ed] Arjan Kuijper, 2013Konferansepaper (Annet vitenskapelig)
    Abstract [en]

    The brain is able to maintain a stable perception although the visual stimuli vary substantially on the retina due to geometric transformations and lighting variations in the environment. This talk presents a unified theory for achieving basic invariance properties of visual operations already at the level of receptive fields.

    This generalized framework for invariant receptive field responses comprises:

    • local scaling transformations caused by objects of different size and at different distances to the observer,
    • locally linearized image deformations caused by variations in the viewing direction in relation to the object,
    • locally linearized relative motions between the object and the observer and
    • local multiplicative intensity transformations caused by illumination variations.

    The receptive field model can be derived by necessity from symmetry properties of the environment and leads to predictions about receptive field profiles in good agreement with receptive field profiles measured by cell recordings in mammalian vision. Indeed, the receptive field profiles in the retina, LGN and V1 can be seen as close to ideal to what is motivated by the idealized requirements.

    By complementing receptive field measurements with selection mechanisms over the parameters in the receptive field families, it is shown how true invariance of receptive field responses can be obtained under scaling transformations, affine transformations and Galilean transformations. Thereby, the framework provides a mathematically well-founded and biologically plausible model for how basic invariance properties can be achieved already at the level of receptive fields and support invariant recognition of objects and events under variations in viewpoint, retinal size, object motion and illumination.

    The theory can explain the different shapes of receptive field profiles found in biological vision, which are tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time, from a requirement that the visual system should be invariant to the natural types of image transformations that occur in its environment.

    References:

    • T. Lindeberg (2011) "Generalized Gaussian scale-space axiomatics comprising linear scale-space, affine scale-space and spatio-temporal scale-space". Journal of Mathematical Imaging and Vision, volume 40, number 1, pages 36-81, May 2011.
    • T. Lindeberg (2013) “Invariance of visual operations at the level of receptive fields”, PLoS ONE 8(7): e66990, doi:10.1371/journal.pone.0066990, preprint available from arXiv:1210.0754.
    • T. Lindeberg (2013) "Generalized axiomatic scale-space theory", Advances in Imaging and Electron Physics, (P. Hawkes, ed.), Elsevier, volume 178, pages 1-96, Academic Press: Elsevier Inc., doi: 10.1016/B978-0-12-407701-0.00001-7
  • 43.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    A scale selection principle for estimating image deformations1998Inngår i: Image and Vision Computing, ISSN 0262-8856, E-ISSN 1872-8138, Vol. 16, s. 961-977Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

    A basic functionality of a vision system concerns the ability to compute deformation fields between different images of the same physical structure. This article advocates the need for incorporating explicit mechanisms for scale selection in this context, in algorithms for computing descriptors such as optic flow and for performing stereo matching. A basic reason why such a mechanism is essential is the fact that in a coarse-to-fine propagation of disparity or flow information, it is not necessarily the case that the most accurate estimates are obtained at the finest scales. The existence of interfering structures at fine scales may make it impossible to accurately match the image data at fine scales. The approach proposed here consists of selecting deformation estimates from the scales that minimize the (suitably normalized) uncertainty over scales. A specific implementation of this idea is presented for a region-based differential flow estimation scheme. It is shown that the integrated scale selection and flow estimation algorithm has the qualitative properties of leading to the selection of coarser scales for larger size image structures and increasing noise level, whereas it leads to the selection of finer scales in the neighbourhood of flow field discontinuities.
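
    In schematic form, as a paraphrase of the principle rather than the article's exact expressions, the mechanism selects, at each point $x$, the scale at which a suitably normalized measure $\tilde{E}(x;\, s)$ of the uncertainty of the local deformation estimate is minimal,

    $$\hat{s}(x) \;=\; \operatorname*{arg\,min}_{s}\; \tilde{E}(x;\, s),$$

    so that coarser scales are preferred where fine-scale structures are noisy or interfering, and finer scales are preferred near flow-field discontinuities, in line with the qualitative behaviour described above.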

  • 44.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Automatic scale selection as a pre-processing stage for interpreting the visual world1999Inngår i: Proc. Fundamental StructuralProperties in Image and Pattern Analysis FSPIPA'99 , (Budapest, Hungary), September 6-7, 1999, Österreichischen Computer Gesellschaft , 1999, Vol. 130, s. 9-23Konferansepaper (Fagfellevurdert)
    Abstract [en]

    This paper reviews a systematic methodology for formulating mechanisms for automatic scale selection when performing feature detection in scale-space. An important property of the proposed approach is that the notion of scale is already included in the definition of image features.

  • 45.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Automatic Scale Selection as a Pre-Processing Stage to Interpreting Real-World Data1996Inngår i: Proceedings Eighth IEEE International Conference on Tools with Artificial Intelligence (Toulouse, France): Invited keynote address, 1996, s. 490-490Konferansepaper (Annet vitenskapelig)
    Abstract [en]

    We perceive objects in the world as meaningful entities only over certain ranges of scale. A simple example is the concept of a branch of a tree, which makes sense only at a scale from, say, a few centimeters to at most a few meters. It is meaningless to discuss the tree concept at the nanometer or kilometer level. At those scales, it is more relevant to talk about the molecules that form the leaves of the tree, and the forest in which the tree grows, respectively.

    This fact that objects in the world appear in different ways depending on the scale of observation has important implications if one aims at describing them. It shows that the notion of scale is of utmost importance when processing unknown measurement data by automatic methods. In their seminal works, Witkin (1983) and Koenderink (1984) proposed to approach this problem by representing image structures at different scales in a so-called scale-space representation. Traditional scale-space theory building on this work, however, does not address the problem of how to select local appropriate scales for further analysis.

    After a brief review of the main ideas behind a scale-space representation, I will in this talk describe a recently developed systematic methodology for generating hypotheses about interesting scale levels in image data---based on a general principle stating that local extrema over scales of different combinations of normalized derivatives are likely candidates to correspond to interesting image structures. Specifically, it will be shown how this idea can be used for formulating feature detectors which automatically adapt their local scales of processing to the local image structure.

    Support for the proposed methodology will be presented in terms of a general study of the scale selection method under rescalings of the input data, as well as a more detailed analysis of how the scale selection method performs when integrated with various types of feature detection modules and then applied to characteristic image patterns. Moreover, it will be illustrated by a rich set of experiments how this scale selection approach applies to various types of feature detection problems in early vision.

    In many computer vision applications, the poor performance of the low-level vision modules constitutes a major bottle-neck. It will be argued that the inclusion of mechanisms for automatic scale selection is essential if we are to construct vision systems to analyse complex unknown environments.
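
    The canonical instance of this principle is scale selection from extrema over scale of the scale-normalized Laplacian; the following hedged sketch illustrates that special case (the scale range and the use of a dense per-pixel maximum are assumptions made for brevity):

    ```python
    import numpy as np
    from scipy.ndimage import gaussian_laplace

    def select_blob_scales(image, sigmas=(1, 2, 4, 8, 16)):
        """Return a per-pixel characteristic scale t = sigma^2 maximizing
        the magnitude of the scale-normalized Laplacian t * (Lxx + Lyy)."""
        responses = np.stack([
            (sigma ** 2) * gaussian_laplace(image.astype(float), sigma)
            for sigma in sigmas
        ])
        best = np.abs(responses).argmax(axis=0)
        return np.asarray(sigmas, dtype=float)[best] ** 2
    ```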

  • 46.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Corner detection2001Inngår i: Encyclopaedia of Mathematics / [ed] Michiel Hazewinkel, Springer , 2001Kapittel i bok, del av antologi (Fagfellevurdert)
  • 47.
    Lindeberg, Tony
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Dense scale selection over space, time and space-time2018Inngår i: SIAM Journal on Imaging Sciences, ISSN 1936-4954, E-ISSN 1936-4954, Vol. 11, nr 1, s. 407-441Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

    Scale selection methods based on local extrema over scale of scale-normalized derivatives have been primarily developed to be applied sparsely---at image points where the magnitude of a scale-normalized differential expression additionally assumes local extrema over the domain where the data are defined. This paper presents a methodology for performing dense scale selection, so that hypotheses about local characteristic scales in images, temporal signals, and video can be computed at every image point and every time moment. A critical problem when designing mechanisms for dense scale selection is that the scale at which scale-normalized differential entities assume local extrema over scale can be strongly dependent on the local order of the locally dominant differential structure. To address this problem, we propose a methodology where local extrema over scale of a quasi quadrature measure, involving scale-space derivatives up to order two, are detected, and we propose two independent mechanisms to reduce the phase dependency of the local scale estimates: (i) introducing a second layer of post-smoothing prior to the detection of local extrema over scale, and (ii) performing local phase compensation based on a model of the phase dependency of the local scale estimates depending on the relative strengths between first- and second-order differential structures. This general methodology is applied over three types of domains: (i) spatial images, (ii) temporal signals, and (iii) spatio-temporal video. Experiments demonstrate that the proposed methodology leads to intuitively reasonable results with local scale estimates that reflect variations in the characteristic scales of locally dominant structures over space and time.
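
    As a hedged illustration in the one-dimensional case (up to the scale normalization and the constant $C$ used in the paper), a quasi quadrature measure combines scale-normalized first- and second-order responses,

    $$\mathcal{Q} L \;=\; t\, L_x^{2} \;+\; C\, t^{2}\, L_{xx}^{2},$$

    whose local extrema over the scale parameter $t$ are considerably less dependent on the local phase of the signal than the extrema of either term alone, which is what makes dense, per-point scale estimates feasible.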

  • 48.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Detecting salient blob-like image structures and their scales with a scale-space primal sketch: A method for focus-of-attention1993Inngår i: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 11, nr 3, s. 283-318Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

    This article presents: (i) a multiscale representation of grey-level shape called the scale-space primal sketch, which makes explicit both features in scale-space and the relations between structures at different scales, (ii) a methodology for extracting significant blob-like image structures from this representation, and (iii) applications to edge detection, histogram analysis, and junction classification demonstrating how the proposed method can be used for guiding later-stage visual processes. The representation gives a qualitative description of image structure, which allows for detection of stable scales and associated regions of interest in a solely bottom-up data-driven way. In other words, it generates coarse segmentation cues, and can hence be seen as preceding further processing, which can then be properly tuned. It is argued that once such information is available, many other processing tasks can become much simpler. Experiments on real imagery demonstrate that the proposed theory gives intuitive results.

  • 49.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Direct estimation of affine image deformations using visual front-end operations with automatic scale selection1995Inngår i: Proc. 5th International Conference on Computer Vision: ICCV'95 (Boston, MA), IEEE Computer Society, 1995, s. 134-141Konferansepaper (Fagfellevurdert)
    Abstract [en]

    This article deals with the problem of estimating deformations of brightness patterns using visual front-end operations. Estimating such deformations constitutes an important subtask in several computer vision problems relating to image correspondence and shape estimation. The following subjects are treated: The problem of decomposing affine flow fields into simpler components is analysed in detail. A canonical parametrization is presented based on singular value decomposition, which naturally separates the rotationally invariant components of the flow field from the rotationally variant ones. A novel mechanism is presented for automatic selection of scale levels when estimating local affine deformations. This mechanism is expressed within a multi-scale framework where disparity estimates are computed in a hierarchical coarse-to-fine manner and corrected using iterative techniques. Then, deformation estimates are selected from the scales that minimize a certain normalized residual over scales. Finally, the descriptors so obtained serve as initial data for computing refined estimates of the local deformations.
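
    The canonical parametrization based on singular value decomposition can be made concrete with a short sketch; the derived quantities shown (area change and stretch ratio) are illustrative of the rotationally invariant components, not necessarily the exact descriptors of the paper:

    ```python
    import numpy as np

    def decompose_affine_flow(A):
        """A: 2x2 linear part of a local affine flow field, A = U diag(s) Vt."""
        U, s, Vt = np.linalg.svd(A)
        expansion = float(np.sqrt(s[0] * s[1]))   # isotropic (area) change
        anisotropy = float(s[0] / s[1])           # rotationally invariant stretch
        return {"singular_values": s, "expansion": expansion,
                "anisotropy": anisotropy, "U": U, "Vt": Vt}
    ```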

  • 50.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Discrete approximations of affine Gaussian receptive fields2017Rapport (Annet vitenskapelig)
    Abstract [en]

    This paper presents a theory for discretizing the affine Gaussian scale-space concept so that scale-space properties hold also for the discrete implementation.

    Two ways of discretizing spatial smoothing with affine Gaussian kernels are presented: (i) by solving a semi-discretized affine diffusion equation, as derived by necessity from the requirement of a semi-group structure over a continuum of scale parameters, parameterized by a family of spatial covariance matrices, and obeying non-creation of new structures from any finer to any coarser scale, as formalized by the requirement of non-enhancement of local extrema; and (ii) by a set of parameterized 3x3-kernels, as derived from an additional discretization of the above theory along the scale direction, with the parameters of the kernels having a direct interpretation in terms of the covariance matrix of the composed discrete smoothing operation.

    We show how convolutions with the first family of kernels can be implemented in terms of a closed form expression for the Fourier transform and analyse how a remaining degree of freedom in the theory can be explored to ensure a positive discretization and optionally also achieve higher-order discrete approximation of the angular dependency of the shapes of the affine Gaussian kernels.

    We do also show how discrete directional derivative approximations can be efficiently implemented to approximate affine Gaussian derivatives as constituting a canonical model for receptive fields over a purely spatial image domain and with close relations to receptive fields in biological vision.
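
    For comparison with the discretizations above, the most straightforward (but less principled) alternative is direct sampling of the continuous affine Gaussian kernel; the following hedged sketch shows that baseline only, not the paper's semi-discretized or 3x3-kernel constructions:

    ```python
    import numpy as np

    def sampled_affine_gaussian(Sigma, radius):
        """Sigma: 2x2 spatial covariance matrix; returns a (2r+1)x(2r+1) kernel."""
        ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
        X = np.stack([xs.ravel(), ys.ravel()])        # 2 x N coordinate matrix
        quad = np.einsum('in,ij,jn->n', X, np.linalg.inv(Sigma), X)
        g = np.exp(-0.5 * quad).reshape(xs.shape)
        return g / g.sum()                             # normalize to unit mass
    ```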
