Concepedia

Publication | Open Access

Saliency estimation using a non-parametric low-level vision model

Citations: 348
References: 16
Year: 2011

TLDR

Attention-prediction models typically rely on filter convolution, a center-surround mechanism, and spatial pooling, yet integrating spatial information and justifying parameter choices remain open problems. This work extends a principled model of color appearance in human vision to saliency estimation: scales are integrated via an inverse wavelet transform of scale-weighted center-surround responses, with the ECSF scale-weighting function optimized against psychophysical color-appearance data and the center-surround inhibition window sizes learned from eye-fixation data using a Gaussian Mixture Model. The extended model outperforms state-of-the-art saliency methods, supporting the hypothesis of a shared low-level visual front-end across visual tasks.
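The TLDR mentions learning inhibition-window sizes from eye-fixation data with a Gaussian Mixture Model. As a loose illustration only (not the paper's code), the sketch below fits a 1-D two-component GMM by expectation-maximization with NumPy; the function name, quantile-based initialization, and synthetic data are assumptions for demonstration, standing in for whatever fixation-derived statistic the model is actually fit to.

```python
import numpy as np

def fit_gmm_1d(x, k=2, iters=100):
    """Fit a 1-D Gaussian mixture with EM (illustrative sketch only).

    Initializes means at spread-out quantiles of the data for stability,
    then alternates E-steps (responsibilities) and M-steps (parameter
    updates). Returns mixture weights, means, and variances.
    """
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))   # spread initial means
    var = np.full(k, x.var())                        # start with global variance
    pi = np.full(k, 1.0 / k)                         # uniform mixture weights
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        d = x[:, None] - mu[None, :]
        p = pi * np.exp(-0.5 * d**2 / var) / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, then variances
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu[None, :])**2).sum(axis=0) / nk + 1e-9
    return pi, mu, var
```

In the paper's setting, the fitted component parameters would then determine the sizes of the center-surround inhibition windows rather than being hand-picked.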

Abstract

Many successful models for predicting attention in a scene involve three main steps: convolution with a set of filters, a center-surround mechanism and spatial pooling to construct a saliency map. However, integrating spatial information and justifying the choice of various parameter values remain open problems. In this paper we show that an efficient model of color appearance in human vision, which contains a principled selection of parameters as well as an innate spatial pooling mechanism, can be generalized to obtain a saliency model that outperforms state-of-the-art models. Scale integration is achieved by an inverse wavelet transform over the set of scale-weighted center-surround responses. The scale-weighting function (termed ECSF) has been optimized to better replicate psychophysical data on color appearance, and the appropriate sizes of the center-surround inhibition windows have been determined by training a Gaussian Mixture Model on eye-fixation data, thus avoiding ad-hoc parameter selection. Additionally, we conclude that the extension of a color appearance model to saliency estimation adds to the evidence for a common low-level visual front-end for different visual tasks.
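To make the three-step recipe in the abstract concrete, here is a minimal NumPy sketch of a multi-scale center-surround saliency map. It is not the authors' model: the box-filter local means, the specific center/surround radii, and the fixed scale weights are all illustrative assumptions, with the weights standing in for the optimized ECSF weighting and a plain weighted sum standing in for the inverse wavelet transform.

```python
import numpy as np

def local_mean(img, r):
    """Mean over a (2r+1)x(2r+1) window, edge-padded, via an integral image."""
    pad = np.pad(img, r, mode='edge')
    ii = np.zeros((pad.shape[0] + 1, pad.shape[1] + 1))
    ii[1:, 1:] = pad.cumsum(axis=0).cumsum(axis=1)
    w = 2 * r + 1
    sums = ii[w:, w:] - ii[:-w, w:] - ii[w:, :-w] + ii[:-w, :-w]
    return sums / (w * w)

def saliency(img, scales=((1, 3), (2, 6), (4, 12)), weights=(0.5, 0.3, 0.2)):
    """Weighted sum of multi-scale center-surround responses (sketch).

    `scales` pairs (center radius, surround radius) and `weights` are
    hypothetical stand-ins for the learned inhibition-window sizes and
    the ECSF scale weighting described in the paper.
    """
    sal = np.zeros_like(img, dtype=float)
    for (rc, rs), w in zip(scales, weights):
        # center-surround: small local mean minus larger local mean
        sal += w * np.abs(local_mean(img, rc) - local_mean(img, rs))
    return sal / (sal.max() + 1e-12)   # normalize to [0, 1]
```

Running `saliency` on an image containing a small bright patch yields high values around the patch and near-zero values on the uniform background, which is the qualitative behavior a center-surround saliency map should show.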
