Publication | Closed Access
Making You Only Look Once Faster: Toward Real-Time Intelligent Transportation Detection
Citations: 14
References: 35
Year: 2022
Automotive Tracking, Albany Detection, Engineering, Machine Learning, Advanced Driver-assistance System, Autonomous Systems, Intelligent Systems, Intelligent Traffic Management, Image Analysis, Focus Module, Object Tracking, Transportation Engineering, Machine Vision, Object Detection, Computer Engineering, Moving Object Tracking, Computer Science, Traffic Monitoring, Computer Vision, Eye Tracking, Honor V20, Transportation Systems
We present a simple yet efficient algorithm named you only look once: dynamic and stem (YOLO-DS) for real-time intelligent transportation detection. YOLO-DS is built on YOLOv5s with the following primary modifications. First, we apply a dynamic mechanism in the backbone, which makes the model a multibranch model during training and a single-path model during inference and deployment. Hence, the model enjoys fast speed and economical memory use while maintaining excellent performance. Second, because the original focus module in the backbone is inefficient, we replace it with a stem module built entirely with standard convolution. Finally, we modify the width and depth of the model and design a new, lighter detection head to improve the model further. Compared to the original YOLOv5s, YOLO-DS improves the mean average precision (AP) by 2.7 and 0.2 on the University at Albany DETection and tRACking (UA-DETRAC) dataset and the high-speed train fault dataset, respectively. In addition, experiments on various devices show that YOLO-DS is fast, far exceeding the speed of previous lightweight neural networks. Specifically, YOLO-DS runs inference twice as fast as the original YOLOv5s, and up to 7.98 times as fast. Moreover, our YOLO-DS-Tiny processes a 640 × 640 image in barely 35 ms on average on a cellphone (the Honor V20).
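The abstract does not spell out how the dynamic mechanism turns a multibranch training model into a single-path inference model. One common way to get that behavior is to merge parallel linear branches into one kernel after training, since convolution is linear. The sketch below illustrates that idea only, under this assumption, on a 1-D signal in pure Python; the function names, the three-branch layout (3-tap, 1-tap, identity), and the numbers are all hypothetical and are not taken from YOLO-DS.

```python
def conv1d_same(signal, kernel):
    """Correlate `signal` with an odd-length `kernel`, zero-padding the borders."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def multibranch_forward(signal, k3, k1):
    """Training-time form (assumed): 3-tap branch + 1-tap branch + identity shortcut."""
    a = conv1d_same(signal, k3)
    b = conv1d_same(signal, k1)
    return [x + y + z for x, y, z in zip(a, b, signal)]


def fuse_branches(k3, k1):
    """Fold the 1-tap branch and the identity into the centre tap of the 3-tap kernel."""
    fused = list(k3)
    fused[1] += k1[0] + 1.0  # identity acts like a 1-tap kernel with weight 1
    return fused


# Hypothetical weights and input, chosen only for illustration.
signal = [1.0, 2.0, -1.0, 0.5, 3.0]
k3 = [0.25, -0.5, 0.75]
k1 = [2.0]

train_out = multibranch_forward(signal, k3, k1)   # three branches
fused_kernel = fuse_branches(k3, k1)              # one kernel
deploy_out = conv1d_same(signal, fused_kernel)    # single path

# Largest difference between the multibranch and single-path outputs.
max_gap = max(abs(a - b) for a, b in zip(train_out, deploy_out))
```

In this toy setting the single fused kernel reproduces the multibranch output while doing one pass instead of three, which is the kind of speed and memory saving the abstract attributes to the single-path inference model. A real detector would also have to fold normalization layers and handle 2-D, multi-channel kernels, which this sketch leaves out.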