Publication | Closed Access
Temporal Deformable Residual Networks for Action Segmentation in Videos
Citations: 198
References: 46
Year: 2018
Unknown Venue
Engineering, Machine Learning, Residual Stream, Video Interpretation, Image Analysis, Pattern Recognition, Video Content Analysis, Robot Learning, Temporal Segmentation, Video Transformer, Human Actions, Action Segmentation, Machine Vision, Dance, Video Understanding, Deep Learning, Computer Vision, Video Analysis, Eye Tracking, Video Hallucination
This paper addresses temporal segmentation of human actions in videos. We introduce a new model, the temporal deformable residual network (TDRN), which analyzes video intervals at multiple temporal scales to label video frames. TDRN computes two parallel temporal streams: i) a residual stream that analyzes video information at its full temporal resolution, and ii) a pooling/unpooling stream that captures long-range video information at different scales. The former facilitates local, fine-scale action segmentation, and the latter uses multiscale context to improve the accuracy of frame classification. The two streams are computed by a set of temporal residual modules with deformable convolutions and fused by temporal residuals at the full video resolution. Our evaluation on the University of Dundee 50 Salads, Georgia Tech Egocentric Activities, and JHU-ISI Gesture and Skill Assessment Working Set datasets demonstrates that TDRN outperforms the state of the art in frame-wise segmentation accuracy, segmental edit score, and segmental overlap F1 score.
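To make two of the evaluation metrics named in the abstract concrete, here is a minimal sketch of frame-wise accuracy and a segmental edit score. It assumes the definitions commonly used in the action segmentation literature (edit score as one minus the Levenshtein distance between segment label sequences, normalized by the longer sequence); the abstract itself does not spell these out, and the function names and the "cut"/"mix" labels are hypothetical.

```python
from itertools import groupby


def frame_accuracy(pred, gt):
    """Fraction of frames whose predicted label matches the ground truth."""
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(p == g for p, g in zip(pred, gt)) / len(gt)


def to_segments(labels):
    """Collapse each run of identical frame labels into a single segment label."""
    return [label for label, _ in groupby(labels)]


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def edit_score(pred, gt):
    """Assumed definition: 1 - normalized edit distance over segment labels."""
    p, g = to_segments(pred), to_segments(gt)
    denom = max(len(p), len(g))
    return 1.0 if denom == 0 else 1.0 - levenshtein(p, g) / denom


# Hypothetical labels: both predictions get 7 of 8 frames right,
# but the second one over-segments the video.
gt = ["cut"] * 4 + ["mix"] * 4
shifted = ["cut"] * 3 + ["mix"] * 5
fragmented = ["cut", "mix", "cut", "cut", "mix", "mix", "mix", "mix"]
```

The two predictions have the same frame-wise accuracy, yet the fragmented one is penalized by the edit score because it inserts spurious segments. This is one reason segmental metrics are reported alongside frame-wise accuracy.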