Publication | Open Access
TPH-YOLOv5++: Boosting Object Detection on Drone-Captured Scenarios with Cross-Layer Asymmetric Transformer
Citations: 105
References: 44
Year: 2023
Convolutional Neural Network, Engineering, Machine Learning, Transformer Prediction Heads, Field Robotics, Boosting Object Detection, Cross-layer Asymmetric Transformer, Image Analysis, Pattern Recognition, Drone-captured Images, Object Tracking, Robot Learning, Video Transformer, Vision Recognition, Machine Vision, Automatic Target Recognition, Object Detection, Computer Engineering, Computer Science, Deep Learning, Computer Vision, Drone-captured Scenarios, Object Recognition
Object detection in drone-captured images has become a popular task in recent years. Because drones navigate at different altitudes, object scale varies considerably, which burdens model optimization. Moreover, high-speed, low-altitude flight causes motion blur on densely packed objects, which makes detection very challenging. To address these two issues, we build on YOLOv5 by adding an additional prediction head to detect tiny-scale objects and replacing the CNN-based prediction heads with transformer prediction heads (TPH), constructing the TPH-YOLOv5 model. TPH-YOLOv5++ is then proposed to significantly reduce the computational cost and improve the detection speed of TPH-YOLOv5. In TPH-YOLOv5++, a cross-layer asymmetric transformer (CA-Trans) is designed to replace the additional prediction head while maintaining the knowledge of this head. By using a sparse local attention (SLA) module, the asymmetric information between the additional head and the other heads can be captured efficiently, enriching the features of the other heads. In the VisDrone Challenge 2021, TPH-YOLOv5 won 4th place and achieved results closely matching the 1st place model (AP 39.43%). Building on TPH-YOLOv5 and the CA-Trans module, TPH-YOLOv5++ further increases efficiency while achieving comparable or better results.
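The abstract does not spell out how sparse local attention works, so the following is only a rough illustration of the general idea of local cross-layer attention, not the authors' SLA or CA-Trans implementation. It assumes a lower-resolution feature map supplies the queries, a higher-resolution map supplies the keys and values, and each query attends only to the small patch covering the same spatial location. The function name `local_cross_attention`, the `window` parameter, and the residual sum are all hypothetical choices made for this sketch.

```python
import numpy as np

def local_cross_attention(low, high, window=2):
    """Illustrative local cross-layer attention (an assumption, not the paper's code).

    low:  (H, W, C) lower-resolution features, used as queries.
    high: (H*window, W*window, C) higher-resolution features, used as keys/values.
    Each low-res position attends only to its own window x window patch of `high`.
    """
    H, W, C = low.shape
    assert high.shape == (H * window, W * window, C)

    # Group the high-res map into one patch per low-res position:
    # (H, window, W, window, C) -> (H, W, window*window, C)
    patches = (
        high.reshape(H, window, W, window, C)
        .transpose(0, 2, 1, 3, 4)
        .reshape(H, W, window * window, C)
    )

    # Scaled dot-product scores between each query and its own patch only.
    scores = np.einsum("hwc,hwkc->hwk", low, patches) / np.sqrt(C)

    # Numerically stable softmax over the patch positions.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # Weighted sum of patch features, added back to the queries (residual).
    attended = np.einsum("hwk,hwkc->hwc", weights, patches)
    return low + attended
```

In this sketch each query is compared with only `window * window` keys rather than with every position of the higher-resolution map, which is the kind of saving a local attention scheme aims for.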