Automatic Correction of Stutter in Disfluent Speech

Abstract

This paper proposes a method for automatically correcting three types of stutter in disfluent speech (repetitions, prolongations, and long pauses) using signal processing techniques. Mel-Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coefficients (LPC) are used as features. Short-time energy and inter-frame correlation are the parameters used to remove repetitions and prolongations, respectively. To handle long pauses, the input speech is resampled to 22.05 kHz and the samples belonging to long pauses are removed, while the natural pauses between words are retained. Little work has been reported on automatic stutter correction using signal processing methods, and simultaneous correction of all three types of stutter has not been reported. The method achieves accuracies of 88.35%, 94.3%, and 97.5% for repetitions, prolongations, and long pauses, respectively, with an average correction time of 2 seconds on an Intel 8th-generation i5 system, making it suitable for time-critical applications.
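As an illustration of the two frame-level parameters named above, the following is a minimal sketch and not the paper's implementation. It computes short-time energy and the correlation between adjacent frames, and shortens long pauses in a signal assumed to be already sampled at 22.05 kHz. The function names, the 20 ms frame length, the energy threshold (`energy_ratio`), and the maximum retained pause (`max_pause_s`) are all assumptions chosen for the example; the abstract does not specify them.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into frames; trailing samples that do not fill a frame are dropped."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames.astype(float) ** 2, axis=1)

def adjacent_frame_correlation(frames):
    """Pearson correlation between each frame and the frame that follows it."""
    a = frames[:-1] - frames[:-1].mean(axis=1, keepdims=True)
    b = frames[1:] - frames[1:].mean(axis=1, keepdims=True)
    denom = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return np.sum(a * b, axis=1) / np.maximum(denom, 1e-12)

def shorten_long_pauses(x, sr=22050, frame_ms=20, energy_ratio=0.01, max_pause_s=0.3):
    """Drop low-energy frames beyond the first max_pause_s of each silent run.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    frame_len = int(round(sr * frame_ms / 1000))
    if len(x) < frame_len:
        return x
    frames = frame_signal(x, frame_len, frame_len)  # non-overlapping frames
    energy = short_time_energy(frames)
    silent = energy < energy_ratio * energy.max()
    max_silent_frames = int(round(max_pause_s * 1000 / frame_ms))

    keep, run = [], 0
    for i, is_silent in enumerate(silent):
        run = run + 1 if is_silent else 0
        if run <= max_silent_frames:  # keep speech and the start of each pause
            keep.append(i)
    return frames[np.array(keep, dtype=int)].reshape(-1)

if __name__ == "__main__":
    sr = 22050
    t = np.arange(sr // 2) / sr
    tone = np.sin(2 * np.pi * 220 * t)  # 0.5 s stand-in for speech
    x = np.concatenate([tone, np.zeros(sr), tone])  # 1 s pause in the middle
    y = shorten_long_pauses(x, sr=sr)
    print(len(x) / sr, "s in,", len(y) / sr, "s out")
```

In a sketch like this, a run of adjacent frames whose correlation stays close to 1 could be treated as a candidate prolongation, while the short-time energy contour could help locate segment boundaries when searching for repeated segments. How the paper actually applies these parameters and sets its thresholds is not stated in the abstract.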
