Due to the huge variability of image information under natural image transformations, the receptive field responses of the local image operations that serve as input to higher level visual processes will in general be strongly dependent on the geometric and illumination conditions in the image formation process. To obtain robustness of a vision system, it is natural to require the receptive field families underlying the image operators to be either invariant or covariant under the relevant families of natural image transformations.
This talk presents an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by combining Gaussian receptive fields over the spatial domain with first-order integrators, or equivalently truncated exponential filters, coupled in cascade over the temporal domain. This model inherits the theoretically attractive properties of the Gaussian scale-space model over a spatial domain in terms of (i) invariance or covariance of receptive field responses under scaling transformations and affine transformations over the spatial domain, combined with (ii) non-creation of new image structures from finer to coarser scales. When complemented by velocity adaptation, the receptive field responses can be made (iii) Galilean covariant or invariant, to account for unknown or variable relative motions between objects in the world and the observer. Additionally, when expressed over a logarithmic distribution of the temporal scale levels, this model allows for (iv) scale invariance and self-similarity over the temporal domain while simultaneously being expressed over a time-causal and time-recursive temporal domain, which is a theoretically new type of construction.
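The temporal part of such a model can be illustrated by a minimal sketch of a cascade of discrete first-order recursive filters with logarithmically distributed temporal scale levels. The parametrization below is an assumption made for illustration and is not stated in this abstract: temporal scale levels (variances) tau_k = c^(2(k-K)) * tau_max, and a per-stage update out[t] = out[t-1] + (in[t] - out[t-1]) / (1 + mu_k), whose impulse response has mean mu_k and variance mu_k^2 + mu_k. The names `tau_max`, `c`, `K` and all function names are hypothetical.

```python
import math


def log_scale_levels(tau_max, c, K):
    """Temporal scale levels (variances) tau_k = c**(2*(k-K)) * tau_max, k = 1..K.

    Assumed parametrization: a geometric series of variances with ratio c**2.
    """
    return [tau_max * c ** (2 * (k - K)) for k in range(1, K + 1)]


def time_constants(taus):
    """Time constant mu_k of each stage so that stage k adds variance
    tau_k - tau_{k-1} (with tau_0 = 0).

    A discrete first-order integrator with time constant mu has an impulse
    response with variance mu**2 + mu, so we solve mu**2 + mu = delta_tau.
    """
    mus = []
    previous = 0.0
    for tau in taus:
        delta_tau = tau - previous
        mus.append((math.sqrt(1.0 + 4.0 * delta_tau) - 1.0) / 2.0)
        previous = tau
    return mus


def cascade(signal, mus):
    """Feed `signal` through first-order integrators coupled in cascade.

    Time-causal: each output sample depends only on present and past input.
    Time-recursive: only one state value per stage is kept in memory.
    Returns one output sequence per stage (i.e. per temporal scale level).
    """
    states = [0.0] * len(mus)
    outputs = [[] for _ in mus]
    for sample in signal:
        stage_input = sample
        for k, mu in enumerate(mus):
            states[k] += (stage_input - states[k]) / (1.0 + mu)
            stage_input = states[k]  # output of stage k feeds stage k + 1
            outputs[k].append(states[k])
    return outputs


if __name__ == "__main__":
    taus = log_scale_levels(tau_max=16.0, c=2.0, K=4)
    mus = time_constants(taus)
    impulse = [1.0] + [0.0] * 499
    kernel = cascade(impulse, mus)[-1]  # impulse response at the coarsest scale
    delay = sum(n * h for n, h in enumerate(kernel))
    print("scale levels:", taus)
    print("temporal delay (mean of kernel):", delay)
```

In this sketch the temporal delay at the coarsest scale equals the sum of the time constants, which is why the choice of the distribution of the scale levels affects the temporal dynamics.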
We propose this axiomatically derived theory as the natural extension of the Gaussian scale-space paradigm for local image operations from a spatial domain to a time-causal spatio-temporal domain, to be used as a general framework for expressing spatial and spatio-temporal image operators for a computer vision system. The theory leads to (v) predictions about spatial and spatio-temporal receptive fields with good qualitative similarity to biological receptive fields measured by cell recordings in the retina, the lateral geniculate nucleus (LGN) and the primary visual cortex (V1). Specifically, this framework allows for (vi) computationally efficient real-time operations and leads to (vii) much better temporal dynamics (shorter temporal delays) than previously formulated time-causal temporal scale-space models.
Lindeberg (2016) "Time-causal and time-recursive spatio-temporal receptive fields", Journal of Mathematical Imaging and Vision, 55(1): 50-88.
First European Machine Vision Forum, Heidelberg, Germany, September 8-9, 2016.