Publication | Closed Access
Integration of multimodal features for video scene classification based on HMM
Citations: 97
References: 6
Year: 1999
Unknown Venue
Keywords: Engineering, Machine Learning, Product HMM, Multimedia Analysis, Video Retrieval, Video Interpretation, Speech Recognition, Image Analysis, Information Retrieval, Data Science, Pattern Recognition, Hidden Markov Model, Video Content Analysis, Machine Vision, Digital Video, Audio Retrieval, Computer Science, Video Understanding, Deep Learning, Video Scene Classification, Computer Vision, Multimodal Features, Audio Mining, Speech Processing
Along with the advances in multimedia and Internet technology, a huge amount of data, including digital video and audio, is generated daily. Tools for the efficient indexing and retrieval of such data are indispensable. Because the data carry multimodal information, effective integration is necessary and remains a challenging problem. In this paper, we present four different methods for integrating audio and visual information for video classification based on a hidden Markov model (HMM): direct concatenation, product HMM, two-stage HMM, and integration by neural network. Our results show significant improvements over using a single modality.
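To make two of the listed integration strategies concrete, here is a minimal sketch in plain Python. It is illustrative only and is not taken from the paper. It assumes discrete observation symbols, hand-picked toy parameters, and the hypothetical class labels `class_a` and `class_b`. It also assumes that "product HMM" means scoring each modality with its own HMM and multiplying the likelihoods (summing the log-likelihoods), which treats the modalities as independent given the class. The function names are likewise hypothetical.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | HMM) via the forward algorithm (discrete symbols).

    Assumes every probability used is strictly positive, so math.log is safe.
    """
    n = len(start)
    # alpha[i] = log P(o_1..o_t, state_t = i)
    alpha = [math.log(start[i]) + math.log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + math.log(trans[i][j]) for i in range(n)])
            + math.log(emit[j][o])
            for j in range(n)
        ]
    return log_sum_exp(alpha)


def concatenate(audio_frames, visual_frames):
    """Direct concatenation: join the per-frame feature vectors so that a
    single model can be trained on the combined vector."""
    return [a + v for a, v in zip(audio_frames, visual_frames)]


def classify_product(audio_obs, visual_obs, models):
    """Product-style fusion (assumed reading): one HMM per modality and class;
    the class score is the sum of the two log-likelihoods."""
    scores = {
        label: forward_log_likelihood(audio_obs, *m["audio"])
        + forward_log_likelihood(visual_obs, *m["visual"])
        for label, m in models.items()
    }
    return max(scores, key=scores.get), scores


# Toy parameters (assumptions): 2 hidden states, 2 observation symbols.
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.1, 0.9]]
EMIT_FAVOURS_0 = [[0.8, 0.2], [0.7, 0.3]]
EMIT_FAVOURS_1 = [[0.2, 0.8], [0.3, 0.7]]

MODELS = {
    "class_a": {
        "audio": (START, TRANS, EMIT_FAVOURS_0),
        "visual": (START, TRANS, EMIT_FAVOURS_0),
    },
    "class_b": {
        "audio": (START, TRANS, EMIT_FAVOURS_1),
        "visual": (START, TRANS, EMIT_FAVOURS_1),
    },
}

if __name__ == "__main__":
    label, scores = classify_product([0, 0, 0, 1], [0, 0, 0, 0], MODELS)
    print(label, scores)
    print(concatenate([[1.0, 2.0]], [[3.0]]))
```

In this sketch the concatenation route fuses the modalities before modelling, while the product route fuses them at the score level. The two-stage HMM and neural-network integration named in the abstract are not covered here.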