Publication | Open Access
Construction of a Test Collection for Spoken Document Retrieval from Lecture Audio Data
Citations: 15 | References: 12 | Year: 2009
Spoken Document Retrieval, Engineering, Speech Corpus, Corpus Linguistics, Speech Recognition, Natural Language Processing, Language Documentation, Information Retrieval, Language Testing, Spoken Lecture Retrieval, Language Studies, Linguistics, Retrieval Queries, Audio Retrieval, Spontaneous Japanese, Speech Communication, Speech Analysis, Audio Mining, Speech Processing, Speech Perception, Test Collection, Lecture Audio Data, Speaker Recognition
Lectures are valuable audiovisual data, yet evaluating spoken document processing is difficult because it requires subjective judgments and large evaluation datasets. The study reports a test collection designed to evaluate spoken lecture retrieval. The collection comprises 2,700 lectures (604 h) from the Corpus of Spontaneous Japanese, 39 queries, annotated relevant passages, and automatic transcriptions. Baseline retrieval performance was obtained using a standard spoken document retrieval method, establishing a benchmark for future research.
The lecture is one of the most valuable genres of audiovisual data. Though spoken document processing is a promising technology for utilizing lectures in various ways, it is difficult to evaluate because the evaluation requires subjective judgment and/or the verification of large quantities of evaluation data. In this paper, a test collection for the evaluation of spoken lecture retrieval is reported. The test collection consists of the target spoken documents of about 2,700 lectures (604 hours) taken from the Corpus of Spontaneous Japanese (CSJ), 39 retrieval queries, the relevant passages in the target documents for each query, and the automatic transcription of the target speech data. This paper also reports the retrieval performance obtained on the constructed test collection by applying a standard spoken document retrieval (SDR) method, which serves as a baseline for forthcoming SDR studies using the test collection.