HMM clustering for connected word recognition

Abstract

The authors describe an HMM (hidden Markov model) clustering procedure and discuss its application to connected-word systems and to large-vocabulary recognition based on phonelike units. It is shown that the conventional approach of maximizing likelihood is easily implemented but does not work well in practice, as it tends to give improved models of tokens for which the initial model was generally quite good, but does not improve tokens which are poorly represented by the initial model. The authors have developed a splitting procedure which initializes each new cluster (statistical model) by splitting off all tokens in the training set which were poorly represented by the current set of models. This procedure is highly efficient and gives excellent recognition performance in connected-word tasks. In particular, for speaker-independent connected-digit recognition, using two HMM-clustered models, the recognition performance is as good as or better than previous results using 4-6 models/digit obtained from template-based clustering.
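The splitting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a single Gaussian over all frames stands in for each HMM, and the names `fit_model`, `score`, `split_clustering`, the likelihood `threshold`, and `refine_iters` are all hypothetical choices made for illustration. The sketch only shows the control flow: score every token against the current models, split off the poorly represented tokens to initialize a new model, then reassign and re-estimate.

```python
import math
from statistics import mean, pvariance


def fit_model(tokens):
    """Fit a stand-in model (one Gaussian over all frames).

    Assumption: a real system would train an HMM here instead.
    """
    frames = [x for tok in tokens for x in tok]
    mu = mean(frames)
    var = max(pvariance(frames, mu), 1e-3)  # floor avoids a zero variance
    return mu, var


def score(model, token):
    """Average per-frame log-likelihood of a token under a model."""
    mu, var = model
    return mean(
        -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        for x in token
    )


def assign(models, tokens):
    """Label each token with the index of its best-scoring model."""
    return [
        max(range(len(models)), key=lambda k: score(models[k], t))
        for t in tokens
    ]


def split_clustering(tokens, num_models, threshold, refine_iters=5):
    """Grow the model set by splitting off poorly represented tokens."""
    models = [fit_model(tokens)]  # start from one model of all tokens
    while len(models) < num_models:
        # Best score each token achieves under the current set of models.
        best = [max(score(m, t) for m in models) for t in tokens]
        poor = [t for t, s in zip(tokens, best) if s < threshold]
        if not poor:
            break  # every token is already represented well enough
        # Initialize the new cluster from the poorly represented tokens.
        models.append(fit_model(poor))
        # Reassign tokens and re-estimate each model a few times.
        for _ in range(refine_iters):
            labels = assign(models, tokens)
            for k in range(len(models)):
                members = [t for t, lab in zip(tokens, labels) if lab == k]
                if members:  # keep the old model if a cluster empties
                    models[k] = fit_model(members)
    return models, assign(models, tokens)
```

Called on tokens that form two well-separated groups, with `num_models=2` and a threshold lying between the two groups' scores under the initial model, the sketch should leave each group with its own model. The threshold is the main design choice here; the abstract does not say how "poorly represented" is decided, so a fixed cutoff is only one possible reading.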
