Input-Cell Attention Reduces Vanishing Saliency of Recurrent Neural Networks

Abstract

Recent efforts to improve the interpretability of deep neural networks use saliency to characterize the importance of input features to predictions made by models. Work on interpretability using saliency-based methods on Recurrent Neural Networks (RNNs) has mostly targeted language tasks, and their applicability to time series data is less well understood. In this work we analyze saliency-based methods for RNNs, covering both classical and gated cell architectures. We show that RNN saliency vanishes over time, biasing detection of salient features toward later time steps; these models are therefore incapable of reliably detecting important features at arbitrary time intervals. To address this vanishing saliency problem, we propose a novel RNN cell structure (input-cell attention), which can extend any RNN cell architecture. At each time step, instead of only looking at the current input vector, input-cell attention uses a fixed-size matrix embedding, with each row of the matrix attending to different inputs from the current or previous time steps. Using synthetic data, we show that the saliency map produced by the input-cell attention RNN is able to faithfully detect important features regardless of when they occur in time. We also apply the input-cell attention RNN to a neuroscience task, analyzing functional Magnetic Resonance Imaging (fMRI) data from human subjects performing a variety of tasks. In this case, we use saliency to characterize the brain regions (input features) whose activity is important for distinguishing between tasks. We show that standard RNN architectures are only capable of detecting important brain regions in the last few time steps of the fMRI data, while the input-cell attention model is able to detect important brain region activity across time without a bias toward later time steps.
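The fixed-size matrix embedding described above can be illustrated with a minimal sketch. It assumes a structured self-attention form, A = softmax(W2 tanh(W1 X^T)) and M = A X, which is one plausible reading of the abstract and not necessarily the exact formulation used in the paper. The names input_cell_attention, w1, w2, and the dimensions d_a and r are hypothetical, the weights are random rather than learned, and how M is then consumed by the RNN cell is left out.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def input_cell_attention(inputs, w1, w2):
    """Return (M, A) for the inputs seen so far (illustrative sketch).

    inputs: t x n list of input vectors x_1..x_t (current and previous steps)
    w1:     d_a x n projection (hypothetical name)
    w2:     r x d_a projection, one row per attention row (hypothetical name)

    A is r x t and each of its rows sums to 1. M = A @ inputs is r x n,
    so its size does not depend on the number of time steps t.
    """
    x_t = [list(col) for col in zip(*inputs)]               # n x t
    hidden = [[math.tanh(v) for v in row]
              for row in matmul(w1, x_t)]                   # d_a x t
    scores = matmul(w2, hidden)                             # r x t
    attn = [softmax(row) for row in scores]                 # r x t
    return matmul(attn, inputs), attn                       # r x n, r x t


if __name__ == "__main__":
    random.seed(0)
    n, d_a, r = 4, 3, 2  # assumed sizes, chosen only for the demo
    w1 = [[random.gauss(0, 1) for _ in range(n)] for _ in range(d_a)]
    w2 = [[random.gauss(0, 1) for _ in range(d_a)] for _ in range(r)]
    sequence = [[random.gauss(0, 1) for _ in range(n)] for _ in range(6)]
    for t in range(1, len(sequence) + 1):
        M, A = input_cell_attention(sequence[:t], w1, w2)
        # M stays r x n at every step, while A grows to r x t
        print(t, len(M), len(M[0]), len(A[0]))
```

Because each row of A is a distribution over all time steps seen so far, every row of M can draw on early inputs as well as recent ones. That direct path from early inputs to the cell is what the sketch is meant to show.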
