Publication | Closed Access
Movie segmentation into scenes and chapters using locally weighted bag of visual words
Citations: 39
References: 12
Year: 2009
Unknown Venue
Scene Analysis, Engineering, Video Summarization, Video Retrieval, Natural Language Processing, Image Analysis, Pattern Recognition, Text Recognition, Text Segmentation, Movies Segmentation, Machine Vision, Video Understanding, Deep Learning, Computer Vision, Visual Words, Semantic Gap, Scene Interpretation, Scene Understanding, Movie Segmentation
Segmenting movies into semantically correlated units is a tedious task because of the "semantic gap". Low-level features do not provide useful information about the semantic correlation between shots and usually fail to detect scenes with constantly dynamic content. In the method we propose herein, local invariant descriptors are used to represent the key-frames of video shots, and a visual vocabulary is created from these descriptors, resulting in a visual-word histogram representation (bag of visual words) for each shot. A key aspect of our method is that, based on an idea from text segmentation, the histogram of visual words for each shot is further smoothed temporally by taking into account the histograms of neighboring shots. In this way, valuable contextual information is preserved. The final scene and chapter boundaries are placed at the local maxima of the difference between successive smoothed histograms, using low values of the smoothing parameter for scenes and high values for chapters. Numerical experiments indicate that our method provides high detection rates while preserving a good tradeoff between recall and precision.
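The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the smoothing kernel, the histogram distance, or the vocabulary construction, so the Gaussian kernel, the L1 difference, the nearest-word assignment, and all function names (`bow_histogram`, `smooth_histograms`, `boundary_candidates`) and the parameter `sigma` are assumptions made here for illustration.

```python
import numpy as np


def bow_histogram(descriptors, vocabulary):
    """Normalized visual-word histogram for one shot.

    Assumption: each descriptor is assigned to its nearest vocabulary
    word (squared Euclidean distance); the abstract does not say how.
    """
    d = np.asarray(descriptors, dtype=float)
    v = np.asarray(vocabulary, dtype=float)
    dists = ((d[:, None, :] - v[None, :, :]) ** 2).sum(axis=-1)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(v)).astype(float)
    return hist / hist.sum()


def smooth_histograms(hists, sigma):
    """Temporally smooth per-shot histograms.

    Assumption: a Gaussian kernel over shot index with width `sigma`
    plays the role of the smoothing parameter.
    """
    hists = np.asarray(hists, dtype=float)
    idx = np.arange(len(hists))
    # weights[i, j]: contribution of shot j to the smoothed histogram of shot i
    weights = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / sigma) ** 2)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ hists


def boundary_candidates(hists, sigma):
    """Return (peaks, diffs): shot indices that start a new segment.

    diffs[i] is the L1 difference (an assumed metric) between the
    smoothed histograms of shots i and i + 1; a boundary is placed
    before shot i + 1 when diffs[i] is a strict local maximum.
    """
    smoothed = smooth_histograms(hists, sigma)
    diffs = np.abs(np.diff(smoothed, axis=0)).sum(axis=1)
    peaks = [
        i + 1
        for i in range(1, len(diffs) - 1)
        if diffs[i] > diffs[i - 1] and diffs[i] > diffs[i + 1]
    ]
    return peaks, diffs


# Toy data: six shots dominated by word 0, then six dominated by word 1.
toy_hists = [[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 6
toy_peaks, toy_diffs = boundary_candidates(toy_hists, sigma=1.0)
```

Under these assumptions, running `boundary_candidates` with a small `sigma` would yield the finer (scene-level) candidates and a larger `sigma` the coarser (chapter-level) ones, mirroring the low and high smoothing values mentioned in the abstract.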