Publication | Closed Access
Learning Hierarchical Self-Attention for Video Summarization
Citations: 60
References: 7
Year: 2019
Unknown Venue
Natural Language Processing, Image Analysis, Machine Learning, Machine Vision, Engineering, Video Summarization Task, Hierarchical Multi-attention Network, Video Summarization, Video Understanding, Deep Learning, Video Retrieval, Hierarchical Self-attention, Video Interpretation, Computer Vision, Machine Translation, Multi-modal Summarization
Video summarization remains a challenging task. Given the abundance of video data on the Internet, the task has drawn significant attention in the vision community and benefits a wide range of applications, such as video retrieval and search. To summarize a video effectively by extracting the keyframes that represent it, we propose a novel framework named Hierarchical Multi-Attention Network (H-MAN), which comprises a shot-level reconstruction model and a multi-head attention model. Our attention model has a two-stage hierarchical structure that produces various attention maps, and we are among the first to apply a multi-attention mechanism to video summarization, which improves performance. Quantitative and qualitative results demonstrate the effectiveness of our model, which performs favorably against state-of-the-art approaches.
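As a rough illustration of the general idea the abstract describes (several attention heads, each producing its own attention map over frames, with keyframes chosen from the result), the following is a minimal sketch only. It is not the H-MAN architecture: the shot-level reconstruction model and the two-stage hierarchy are omitted, the projection weights are random rather than learned, and every name here (`attention_map`, `multi_head_maps`, `frame_scores`, `select_keyframes`) is a hypothetical helper introduced for this example. Scoring a frame by the average attention it receives is likewise an assumption, not a detail taken from the paper.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_matrix(rows, cols, rng):
    """Random projection standing in for learned weights (assumption)."""
    scale = 1.0 / math.sqrt(rows)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def attention_map(features, w_q, w_k):
    """Scaled dot-product self-attention weights for one head (frames x frames)."""
    q = matmul(features, w_q)
    k = matmul(features, w_k)
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    return [softmax([s / math.sqrt(d) for s in row]) for row in scores]


def multi_head_maps(features, num_heads, head_dim, seed=0):
    """One attention map per head, each with its own random projections."""
    rng = random.Random(seed)
    dim = len(features[0])
    maps = []
    for _ in range(num_heads):
        w_q = random_matrix(dim, head_dim, rng)
        w_k = random_matrix(dim, head_dim, rng)
        maps.append(attention_map(features, w_q, w_k))
    return maps


def frame_scores(maps):
    """Average attention each frame receives, over all heads and query frames."""
    n = len(maps[0])
    scores = [0.0] * n
    for amap in maps:
        for row in amap:
            for j, w in enumerate(row):
                scores[j] += w
    return [s / (len(maps) * n) for s in scores]


def select_keyframes(scores, k):
    """Indices of the k highest-scoring frames, returned in temporal order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


# Toy input: 12 frames with 8-dimensional features (random placeholders
# for what would normally be CNN frame descriptors).
_rng = random.Random(42)
features = [[_rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(12)]

maps = multi_head_maps(features, num_heads=4, head_dim=4)
scores = frame_scores(maps)
keyframes = select_keyframes(scores, k=3)
```

With trained projections, each head could attend to different temporal patterns; here the maps differ only because each head draws its own random weights.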