Publication | Open Access
BaNa: A Noise Resilient Fundamental Frequency Detection Algorithm for Speech and Music
Year: 2014 | Citations: 27 | References: 39
Music, Fundamental Frequency, Engineering, Spectrum Estimation, Acoustic Modeling, Speech Recognition, Ocean Acoustics, Audio Signal Processing, Audio Analysis, Noise, Acoustical Engineering, Acoustic Signal Processing, Acoustic Analysis, Speech Signal Analysis, Health Sciences, Acoustic Methods, Computer Engineering, Acoustic Propagation, Audio Retrieval, Computer Science, BaNa Algorithm, Signal Processing, Speech Acoustics, dB SNR Scenarios, Speech Processing, Computational Acoustics
Fundamental frequency (F0) is one of the essential features in many acoustic-related applications. Although numerous F0 detection algorithms have been developed, their detection accuracy in noisy environments still needs improvement. We present a hybrid noise-resilient F0 detection algorithm named BaNa that combines the approaches of harmonic ratios and cepstrum analysis. A Viterbi algorithm with a cost function is used to identify the F0 value among several F0 candidates. Speech and music databases with eight different types of additive noise are used to evaluate the performance of the BaNa algorithm and of several classic and state-of-the-art F0 detection algorithms. Results show that, for almost all types of noise and signal-to-noise ratio (SNR) values investigated, BaNa achieves the lowest Gross Pitch Error (GPE) rate among all the algorithms. Moreover, in the 0 dB SNR scenarios, the BaNa algorithm achieves a GPE rate of 20% to 35% for speech and 12% to 39% for music. We also describe implementation issues that must be addressed to run the BaNa algorithm as a real-time application on a smartphone platform.
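The GPE rate used as the evaluation metric above can be illustrated with a minimal sketch. The abstract does not define the metric, so this assumes the common convention: a frame counts as a gross error when the detected F0 deviates from the reference F0 by more than 10%, and only frames that are voiced in the reference are scored. The function name `gross_pitch_error_rate`, the `tolerance` parameter, and the sample values are hypothetical and not taken from the paper.

```python
def gross_pitch_error_rate(detected_hz, reference_hz, tolerance=0.10):
    """Return the fraction of voiced reference frames with a gross F0 error.

    Assumptions (not from the paper): a reference value of 0 marks an
    unvoiced frame, and an error is "gross" when the detected F0 differs
    from the reference by more than `tolerance` times the reference.
    """
    if len(detected_hz) != len(reference_hz):
        raise ValueError("detected and reference tracks must have equal length")

    # Score only frames that are voiced in the reference track.
    voiced = [(d, r) for d, r in zip(detected_hz, reference_hz) if r > 0]
    if not voiced:
        return 0.0

    errors = sum(1 for d, r in voiced if abs(d - r) > tolerance * r)
    return errors / len(voiced)


# Hypothetical per-frame F0 tracks in Hz (0.0 marks an unvoiced frame).
reference = [100.0, 100.0, 0.0, 200.0, 200.0]
detected = [101.0, 200.0, 150.0, 195.0, 100.0]

rate = gross_pitch_error_rate(detected, reference)
print(f"GPE rate: {rate:.0%}")
```

In this toy example two of the four voiced frames are octave errors (a doubling and a halving), so they fall outside the 10% band, while the other two are within it.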