Publication | Closed Access
Using n-best recognition output for extractive summarization and keyword extraction in meeting speech
Year: 2010
Citations: 25
References: 21
Engineering, Speech Corpus, Entity Summarization, Corpus Linguistics, Text Mining, Automatic Summarization, Speech Recognition, Natural Language Processing, Information Retrieval, Data Science, Recognition Confidence Measure, Computational Linguistics, Topic Segmentation, Language Studies, N-best Recognition Output, Machine Translation, Meeting Speech, NLP Task, Speech Communication, Speech Analysis, Multi-modal Summarization, Multi-speaker Speech Recognition, Keyword Extraction, Speech Processing, Speech Perception, Linguistics
There has been increasing interest recently in meeting understanding tasks such as summarization, browsing, action item detection, and topic segmentation. However, little work has used rich recognition output (e.g., recognition confidence measures or multiple recognition candidates) for these downstream tasks. This paper presents an initial study using n-best recognition hypotheses for two tasks: extractive summarization and keyword extraction. We extend the approaches used on 1-best output to n-best hypotheses: MMR (maximum marginal relevance) for summarization and TFIDF (term frequency, inverse document frequency) weighting for keyword extraction. Our experiments on the ICSI meeting corpus show promising improvements from using n-best hypotheses over 1-best output. These results suggest that n-best lists or lattices merit further study as the interface between speech recognition and downstream tasks.
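The abstract does not say how the n-best hypotheses are combined, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes each utterance comes with a list of (hypothesis, score) pairs, normalizes the scores into weights, and uses the resulting expected term counts in place of 1-best counts, both for TFIDF keyword ranking and as sentence vectors for a standard greedy MMR selection. The function names, the smoothed IDF formula, the cosine similarity, and the `lam` trade-off parameter are all illustrative assumptions.

```python
import math
from collections import Counter


def expected_term_counts(nbest):
    """nbest: list of (hypothesis_text, score) pairs for one utterance.
    Scores are normalized to sum to 1 (an assumption), and each term
    contributes its hypothesis weight instead of a hard count of 1."""
    total = sum(score for _, score in nbest)
    counts = Counter()
    for text, score in nbest:
        for term in text.lower().split():
            counts[term] += score / total
    return counts


def tfidf_keywords(meeting, doc_freq, num_docs, top_k=5):
    """meeting: list of n-best lists, one per utterance.
    doc_freq: term -> number of background documents containing it."""
    tf = Counter()
    for nbest in meeting:
        tf.update(expected_term_counts(nbest))  # adds the expected counts
    # Smoothed IDF so unseen terms do not divide by zero.
    scores = {
        term: count * math.log((1 + num_docs) / (1 + doc_freq.get(term, 0)))
        for term, count in tf.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(sent_vecs, doc_vec, k, lam=0.7):
    """Greedy MMR: repeatedly pick the sentence that is most similar to
    the whole document and least similar to sentences already chosen."""
    selected = []
    candidates = list(range(len(sent_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max(
                (cosine(sent_vecs[i], sent_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * cosine(sent_vecs[i], doc_vec) - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Under these assumptions, a sentence vector for MMR could be built by calling `expected_term_counts` on each utterance's n-best list, so that words appearing only in lower-ranked hypotheses still contribute in proportion to their weight.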