Minimum prediction residual principle applied to speech recognition

TLDR

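The system recognizes isolated words, spoken by a designated talker, by finding the reference word that yields the minimum prediction residual. Each reference word is stored as a time pattern of linear prediction coefficients (LPC), and dynamic programming registers that pattern onto the input autocorrelation coefficients so that the total log prediction residual is minimized. A sequential decision procedure reduces the dynamic programming computation, and normalization by the long-time spectral distribution reduces the effect of variations in telephone frequency response. Implemented on a DDP-516 computer for a 200-word experiment, the system reached a 97.3 percent recognition rate for a designated male talker over telephone input, at about 22 times real time.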

Abstract

A computer system is described in which isolated words, spoken by a designated talker, are recognized through calculation of a minimum prediction residual. A reference pattern for each word to be recognized is stored as a time pattern of linear prediction coefficients (LPC). The total log prediction residual of an input signal is minimized by optimally registering the reference LPC onto the input autocorrelation coefficients using the dynamic programming algorithm (DP). The input signal is recognized as the reference word which produces the minimum prediction residual. A sequential decision procedure is used to reduce the amount of computation in DP. A frequency normalization with respect to the long-time spectral distribution is used to reduce effects of variations in the frequency response of telephone connections. The system has been implemented on a DDP-516 computer for the 200-word recognition experiment. The recognition rate for a designated male talker is 97.3 percent for telephone input, and the recognition time is about 22 times real time.
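
The matching step described above can be illustrated with a minimal Python sketch. It is not the DDP-516 implementation: the function names, the frame layout, and the simple three-way DP step pattern are assumptions made for illustration, and the sequential decision procedure and frequency normalization are left out. Each reference frame is an LPC inverse filter; the local distance is the log of the residual energy that filter produces on an input frame, divided by that frame's own minimum residual energy.

```python
import math

def autocorr(frame, order):
    """Short-time autocorrelation r[0..order] of one frame of samples."""
    n = len(frame)
    return [sum(frame[t] * frame[t + k] for t in range(n - k))
            for k in range(order + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion. Returns (a, err): a[0] == 1 is the
    prediction-error filter, err is its minimum residual energy."""
    a, err = [1.0] + [0.0] * order, r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def residual(a, r):
    """Residual energy a^T R a of filter a on a signal with autocorrelation r."""
    p = len(a)
    return sum(a[i] * a[j] * r[abs(i - j)] for i in range(p) for j in range(p))

def frame_distance(a_ref, r_in, err_in):
    """Log prediction residual of a reference filter on one input frame,
    relative to the input frame's own optimum (about 0 for a perfect match)."""
    return math.log(residual(a_ref, r_in) / err_in)

def make_reference(frames, order):
    """Reference pattern: one LPC vector per frame of a training utterance."""
    return [levinson(autocorr(f, order), order)[0] for f in frames]

def dp_distance(ref_lpcs, in_feats):
    """Accumulated log residual along the best monotonic time alignment.
    Assumption: plain three-way step pattern, not the paper's path constraints."""
    inf = float("inf")
    prev = None
    for r_in, err_in in in_feats:
        row = []
        for j, a_ref in enumerate(ref_lpcs):
            d = frame_distance(a_ref, r_in, err_in)
            if prev is None:
                best = 0.0 if j == 0 else row[j - 1]
            else:
                best = min(prev[j],
                           prev[j - 1] if j else inf,
                           row[j - 1] if j else inf)
            row.append(d + best)
        prev = row
    return prev[-1] / len(in_feats)

def recognize(in_frames, references, order):
    """Return (best_word, scores): the reference with the minimum residual."""
    in_feats = []
    for f in in_frames:
        r = autocorr(f, order)
        in_feats.append((r, levinson(r, order)[1]))
    scores = {w: dp_distance(lpcs, in_feats) for w, lpcs in references.items()}
    return min(scores, key=scores.get), scores
```

Here `references` is a hypothetical dictionary mapping each vocabulary word to the output of `make_reference`, and `in_frames` is a list of equal-length sample frames of the unknown utterance. Only the autocorrelation coefficients of the input are needed at match time, which is why the reference is stored as LPC and the input as autocorrelation.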
