Publication | Open Access
Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques
Citations: 821
References: 4
Year: 2010
Engineering, Biometrics, Feature Extraction, Dynamic Time Warping, Speech Recognition, Speech Coding, Pattern Recognition, Phonetics, Robust Speech Recognition, Voice Recognition, Speech Signal Analysis, Health Sciences, Distant Speech Recognition, Speech Signal, Signal Processing, Speech Communication, Speech Technology, Voice, Speech Processing, Speech Input, Voice Recognition Algorithms, Speech Perception, Artificial Neural Network, Speaker Recognition
Digital speech processing and voice recognition are critical for fast, accurate automatic systems, yet the voice signal contains vast information that makes direct analysis challenging. The study evaluates feature extraction and matching techniques, focusing on MFCC for feature extraction and DTW for alignment, to identify a straightforward, effective method for voice recognition. After preprocessing, the authors extract features with Mel Frequency Cepstral Coefficients and match them using Dynamic Time Warping, comparing this approach against other models such as LPC, HMM, and ANN.
Digital processing of the speech signal and voice recognition algorithms are very important for fast and accurate automatic voice recognition technology. The voice is a signal carrying a practically unlimited amount of information. Direct analysis and synthesis of the complex voice signal is difficult because of the amount of information the signal contains. Therefore, digital signal processing steps such as feature extraction and feature matching are introduced to represent the voice signal. Several methods, such as Linear Predictive Coding (LPC), Hidden Markov Models (HMM), and Artificial Neural Networks (ANN), are evaluated with a view to identifying a straightforward and effective method for voice signals. The extraction and matching process is carried out right after the signal has been pre-processed or filtered. Mel Frequency Cepstral Coefficients (MFCCs), a non-parametric method for modelling the human auditory perception system, are used as the extraction technique. The non-linear sequence alignment known as Dynamic Time Warping (DTW), introduced by Sakoe and Chiba, is used as the feature matching technique. Since the voice signal tends to vary in temporal rate, the alignment is important for achieving better performance. This paper presents the viability of MFCC for extracting features and DTW for comparing the test patterns.
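
The matching step can be illustrated with a minimal sketch of DTW in Python. It is not the paper's implementation: it assumes the feature vectors (for example, MFCC frames) have already been computed elsewhere, uses Euclidean distance between frames, and applies the basic unconstrained step pattern without the window or slope constraints discussed by Sakoe and Chiba. The function name `dtw_distance` and the toy one-dimensional sequences are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    a, b: lists of equal-dimension vectors (lists of floats), e.g. MFCC frames.
    Assumption: Euclidean frame distance, no warping-window constraint.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame distance
            # extend the cheapest of: advance in a, advance in b, advance in both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy 1-D "features": the same contour spoken at two different rates,
# plus an unrelated flat contour.
template = [[0.0], [1.0], [2.0], [1.0], [0.0]]
slower = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0], [1.0], [0.0]]
other = [[2.0], [2.0], [2.0]]

print(dtw_distance(template, slower))
print(dtw_distance(template, other))
```

In this toy case the slower utterance aligns with the template at zero cost even though the two sequences differ in length, while the unrelated contour does not, which is the property that makes the alignment useful when speaking rate varies.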