Publication | Closed Access
Wordspotter training using figure-of-merit back propagation
Citations: 12
References: 6
Year: 2002
Venue: Unknown
Topics: Engineering, Machine Learning, Spoken Language Processing, Corpus Linguistics, Figure-of-merit Back Propagation, Text Mining, Language Processing, Speech Recognition, Natural Language Processing, Ad Hoc Thresholds, Word Embeddings, Data Science, Computational Linguistics, Robust Speech Recognition, Automatic Recognition, Language Studies, Supervised Learning, Machine Translation, FOM Gradient, Computer Science, Speech Communication, Speech Analysis, Viterbi Alignment, Speech Processing, Speech Input, Linguistics
A new approach to wordspotter training is presented which directly maximizes the figure of merit (FOM), defined as the average detection rate over a specified range of false alarm rates. This systematic approach to discriminant training for wordspotters eliminates the need for ad hoc thresholds and tuning. It improves the FOM of wordspotters, tested using cross-validation on training conversations from the credit-card speech corpus, by 4 to 5 percentage points to roughly 70%. This improved performance requires little extra complexity during wordspotting and only two extra passes through the training data during training. The FOM gradient is computed analytically for each putative hit, back-propagated through HMM word models using the Viterbi alignment, and used to adjust RBF hidden node centers and state weights associated with every node in HMM keyword models.
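To make the criterion concrete, the following is a minimal Python sketch of an FOM of this kind and a differentiable surrogate for it. This is an illustration, not the paper's implementation: the discrete thresholding over the top-scoring false alarms, the sigmoid smoothing, and the `beta` sharpness parameter are all assumptions standing in for the paper's analytic per-hit gradient.

```python
import math

def figure_of_merit(hit_scores, fa_scores, n_fa=10):
    """Average detection rate as the decision threshold sweeps the
    top n_fa false-alarm scores (a discrete stand-in for "a specified
    range of false alarm rates")."""
    fa_sorted = sorted(fa_scores, reverse=True)
    rates = []
    for k in range(min(n_fa, len(fa_sorted))):
        thresh = fa_sorted[k]  # admits the k+1 highest-scoring false alarms
        detected = sum(1 for s in hit_scores if s > thresh)
        rates.append(detected / len(hit_scores))
    return sum(rates) / len(rates)

def smoothed_fom_and_grad(hit_scores, fa_scores, n_fa=10, beta=10.0):
    """Replace the hard step (score > thresh) with a sigmoid so the FOM
    becomes differentiable; returns the smoothed FOM and d(FOM)/d(score)
    for each putative hit -- the per-hit gradient that would then be
    back-propagated into the keyword models."""
    fa_sorted = sorted(fa_scores, reverse=True)[:n_fa]
    grads = [0.0] * len(hit_scores)
    fom = 0.0
    for thresh in fa_sorted:
        for i, s in enumerate(hit_scores):
            sig = 1.0 / (1.0 + math.exp(-beta * (s - thresh)))
            fom += sig
            grads[i] += beta * sig * (1.0 - sig)  # sigmoid derivative
    norm = len(fa_sorted) * len(hit_scores)
    return fom / norm, [g / norm for g in grads]
```

In the paper's scheme this gradient would be propagated through the HMM word models along the Viterbi alignment; here the sketch stops at the per-hit score level.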