Publication | Closed Access
RoboSeg: Real-Time Semantic Segmentation on Computationally Constrained Robots
Citations: 22
References: 17
Year: 2020
Convolutional Neural Network, Engineering, Machine Learning, Real-time Semantic Segmentation, Image Analysis, Semantic Segmentation, Robot Learning, Video Transformer, Robotics Perception, Geometric Modeling, Machine Vision, Object Detection, Computer Science, Real-time Segmentation Model, Deep Learning, Computer Vision, Scene Understanding, High-performance Segmentation, Robotics, Scene Modeling, Image Segmentation, Spatial Information
Real-time, high-performance segmentation is a crucial but challenging perception task for computationally constrained robots, such as the humanoid NAO robot used in the RoboCup Soccer Standard Platform League. However, most existing convolutional neural network (CNN)-based models for semantic segmentation incur massive computational costs, which prevents them from running real-time inference on a NAO. In this article, we first publish meticulously annotated datasets for training and evaluating semantic segmentation models. Then, we propose a fast downsampling module that downsamples the image while preserving spatial information, and a novel dense learning module that learns high-level semantic information while recovering spatial details. Building on these modules and using a multiscale fusion method to recover the resolution, we propose an efficient real-time segmentation model called RoboSeg, aimed primarily at offering a better tradeoff between speed and accuracy. Finally, to accommodate practical engineering applications, we offer a deployment guideline describing how to deploy the CNN model on robots with limited computational resources and achieve real-time performance. The experimental results show that RoboSeg outperforms state-of-the-art networks in RoboCup scene segmentation: we attain a mean IoU of 87.35% and a pixel accuracy of 96.88% on our dataset using a model that contains only 0.29M parameters and performs just 0.73 GFLOPs. Under the proposed deployment strategies, the network can run at above 30 FPS on NAO robots with downsampled frames.
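The two accuracy figures quoted in the abstract, mean IoU and pixel accuracy, can both be derived from a per-class confusion matrix. The following is a minimal sketch of the standard definitions in plain Python, not the authors' evaluation code: the function names, the flattened label lists, and the choice to skip classes that never appear are illustrative assumptions, and the abstract does not say how the paper aggregates over images or handles absent classes.

```python
from typing import List, Sequence


def confusion_matrix(truth: Sequence[int], pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix: List[List[int]]) -> float:
    """Fraction of all pixels whose predicted class equals the true class."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def mean_iou(matrix: List[List[int]]) -> float:
    """Average of per-class intersection over union.

    Assumption: classes with an empty union (absent from both truth and
    prediction) are skipped rather than counted as zero.
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # true class c, predicted otherwise
        fp = sum(matrix[r][c] for r in range(n)) - tp  # predicted c, true class otherwise
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three hypothetical classes on six flattened pixels.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
m = confusion_matrix(truth, pred, num_classes=3)
print(f"pixel accuracy: {pixel_accuracy(m):.4f}")
print(f"mean IoU:       {mean_iou(m):.4f}")
```

Mean IoU penalizes errors on small classes more heavily than pixel accuracy does, which is one reason a segmentation result can report a pixel accuracy well above its mean IoU, as in the figures above.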