Publication | Open Access
Recurrent Models of Visual Attention
Citations: 2.3K | References: 19 | Year: 2014
Applying convolutional neural networks to large images is computationally expensive because the amount of computation scales linearly with the number of image pixels. We present a novel recurrent neural network model that is capable of extracting information from an image or video by adaptively selecting a sequence of regions or locations and only processing the selected regions at high resolution. Like convolutional neural networks, the proposed model has a degree of translation invariance built-in, but the amount of computation it performs can be controlled independently of the input image size. While the model is non-differentiable, it can be trained using reinforcement learning methods to learn task-specific policies. We evaluate our model on several image classification tasks, where it significantly outperforms a convolutional neural network baseline on cluttered images, and on a dynamic visual control problem, where it learns to track a simple object without an explicit training signal for doing so.
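The claim that computation can be decoupled from image size can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `extract_glimpse`, the normalized coordinate convention, the patch size of 8, and the random location choice (a stand-in for a learned policy) are all assumptions made for illustration, and the recurrent network and reinforcement learning training are omitted.

```python
import numpy as np


def extract_glimpse(image, center, size):
    """Crop a size x size patch centered at `center` (hypothetical helper).

    `center` is (y, x) in normalized coordinates, where (-1, -1) is the
    top-left corner and (1, 1) the bottom-right corner of the image.
    Pixels that fall outside the image are filled with zeros.
    """
    height, width = image.shape
    # Map normalized coordinates to pixel indices.
    cy = int(round((center[0] + 1.0) / 2.0 * (height - 1)))
    cx = int(round((center[1] + 1.0) / 2.0 * (width - 1)))
    half = size // 2
    # Zero-pad by `size` on every side so crops near the border stay in bounds.
    padded = np.pad(image, size, mode="constant")
    top = cy + size - half
    left = cx + size - half
    return padded[top:top + size, left:left + size]


rng = np.random.default_rng(0)
for side in (64, 512):
    image = rng.random((side, side))
    # Assumption: random locations stand in for a learned attention policy.
    locations = rng.uniform(-1.0, 1.0, size=(6, 2))
    glimpses = [extract_glimpse(image, loc, 8) for loc in locations]
    pixels_processed = sum(g.size for g in glimpses)
    print(side, pixels_processed)  # 6 glimpses * 64 pixels = 384 for both sizes
```

With six 8x8 glimpses, the sketch touches 384 pixels whether the image is 64x64 or 512x512, whereas a pass over every pixel would grow from 4,096 to 262,144 pixels.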