Publication | Closed Access
Word-dependent acoustic-labial weights in HMM-based speech recognition.
Year: 1997 | Citations: 17 | References: 0 | Venue: unknown
This paper describes a novel approach for weighting the contribution of the acoustic and visual sources of information in a bimodal connected speech recognition system. We attach a different acoustic-labial weight to each recognition unit. The values of the weighting vector are optimised to minimise the error rate on a learning set. Experiments are performed on a two-speaker audio-visual database, composed of connected letters, with two different acoustic-labial speech recognition systems. For both speakers and both systems, weight optimisation increases the recognition rate of our bimodal system.

1 INTRODUCTION

Under normal conditions, the acoustic signal contains more information about the oral message or the speaker's identity than the visual information about the lips. Nevertheless, these two sources of information are not redundant: taking labial features into account may lead to an improvement of speech processing systems [8, 7, 3, 15, 9]....
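To make the idea of unit-dependent weights concrete, the following is a minimal sketch rather than the authors' method. It assumes a log-linear fusion in which each recognition unit `u` has its own weight `w[u]` on the acoustic log-likelihood and `1 - w[u]` on the labial one, and it tunes the weights by a simple coordinate-wise grid search that minimises the error count on a learning set. The fusion formula, the search procedure, and all names (`combine`, `recognise`, `error_count`, `optimise_weights`) are assumptions made for illustration; the sketch also classifies isolated units, whereas the paper addresses connected speech with HMMs.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation: unit label -> log-likelihood from one modality.
Scores = Dict[str, float]
# One learning-set item: (acoustic scores, labial scores, correct unit).
Sample = Tuple[Scores, Scores, str]


def combine(log_acoustic: Scores, log_labial: Scores,
            weights: Dict[str, float]) -> Scores:
    """Fuse both modalities with one weight per unit (assumed log-linear form)."""
    return {
        unit: weights[unit] * log_acoustic[unit]
        + (1.0 - weights[unit]) * log_labial[unit]
        for unit in log_acoustic
    }


def recognise(log_acoustic: Scores, log_labial: Scores,
              weights: Dict[str, float]) -> str:
    """Return the unit with the highest fused score."""
    fused = combine(log_acoustic, log_labial, weights)
    return max(fused, key=fused.get)


def error_count(samples: Sequence[Sample], weights: Dict[str, float]) -> int:
    """Number of learning-set items recognised incorrectly."""
    return sum(recognise(a, v, weights) != label for a, v, label in samples)


def optimise_weights(samples: Sequence[Sample], units: List[str],
                     grid: Sequence[float], n_passes: int = 2) -> Dict[str, float]:
    """Coordinate-wise grid search (an assumed procedure, not the paper's).

    Starting from equal weights, adjust one unit's weight at a time to the
    grid value that gives the fewest errors on the learning set.
    """
    weights = {u: 0.5 for u in units}
    for _ in range(n_passes):
        for u in units:
            weights[u] = min(
                grid,
                key=lambda w: error_count(samples, {**weights, u: w}),
            )
    return weights
```

Calling `optimise_weights(samples, units, grid)` on a labelled learning set returns one weight per unit, which can then be passed to `recognise` for new items. A coordinate-wise search is only one simple option and can stall in local minima, so the weights it returns are not guaranteed to be optimal.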