Concepedia

Publication | Closed Access

Neural Network Based Pitch Tracking in Very Noisy Speech

93

Citations

37

References

2014

Year

Abstract

Pitch determination is a fundamental problem in speech processing, which has been studied for decades. However, it is challenging to determinate pitch in strong noise because the harmonic structure is corrupted. In this paper, we estimate pitch using supervised learning, where the probabilistic pitch states are directly learned from noisy speech data. We investigate two alternative neural networks modeling pitch state distribution given observations. The first one is a feedforward deep neural network (DNN), which is trained on static frame-level acoustic features. The second one is a recurrent deep neural network (RNN) which is trained on sequential frame-level features and capable of learning temporal dynamics. Both DNNs and RNNs produce accurate probabilistic outputs of pitch states, which are then connected into pitch contours by Viterbi decoding. Our systematic evaluation shows that the proposed pitch tracking algorithms are robust to different noise conditions and can even be applied to reverberant speech. The proposed approach also significantly outperforms other state-of-the-art pitch tracking algorithms.

References

YearCitations

Page 1