Modeling the relationship between acoustic stimulus and EEG with a dilated convolutional neural network

Abstract

Current tests of whether a person can understand speech require behavioral responses, which are not always obtainable in practice (e.g. from young children). There is therefore a need for objective measures of speech intelligibility. Recently, it has been shown that speech intelligibility can be measured by letting a person listen to natural speech, recording the electroencephalogram (EEG), and decoding the speech envelope from the EEG signal. These methods use linear decoders, which is sub-optimal because the human brain is a complex non-linear system that cannot easily be modeled by a linear decoder. We therefore propose an approach based on deep learning, which can model complex non-linear relationships. Our approach is based on dilated convolutions, as used in WaveNet, to maximize the receptive field relative to the number of tunable parameters. Comparison with a model based on a state-of-the-art linear decoder and with a convolutional baseline model shows that our proposed model significantly improves on both (from 62.3% to 90.6% (p < 0.001) and from 78.8% to 90.6% (p < 0.001), respectively). The best results are achieved with a receptive field size between 250 and 500 ms, which is longer than the optimal integration window for a linear decoder.
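To make the trade-off between receptive field and parameter count concrete, the following minimal sketch computes the receptive field of a stack of 1-D dilated convolutions. The kernel size, the dilation factors, and the 64 Hz sampling rate are illustrative assumptions, not values stated in the abstract, and the helper names are hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in samples, of stacked stride-1 dilated 1-D convolutions.

    Each layer with dilation d extends the field by (kernel_size - 1) * d samples.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def weights_per_channel_pair(kernel_size, n_layers):
    """Kernel weights per input/output channel pair, summed over layers."""
    return kernel_size * n_layers


# Assumed settings, for illustration only (not taken from the abstract).
kernel_size = 3
dilations = [kernel_size ** i for i in range(3)]  # 1, 3, 9
fs_hz = 64  # assumed sampling rate of the downsampled EEG and envelope

rf_samples = receptive_field(kernel_size, dilations)
rf_ms = 1000 * rf_samples / fs_hz

# The same number of layers without dilation covers far fewer samples.
rf_undilated = receptive_field(kernel_size, [1, 1, 1])

print(f"dilated:   {rf_samples} samples ({rf_ms:.1f} ms), "
      f"{weights_per_channel_pair(kernel_size, len(dilations))} weights per channel pair")
print(f"undilated: {rf_undilated} samples")
```

Under these assumptions, three dilated layers span 27 samples with the same number of kernel weights that would span only 7 samples without dilation, which is the sense in which dilation enlarges the receptive field at a fixed parameter budget.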
