Concepedia

TLDR

Object detection has recently achieved notable success, yet unlike humans, computers still find it difficult to identify multiple objects accurately and quickly. The paper proposes a modified YOLOv1 neural network for object detection. The model modifies YOLOv1 by replacing the margin-style loss with a proportion-style loss, adding a spatial pyramid pooling layer, and incorporating a 1×1 inception module to reduce parameters. Experiments on Pascal VOC 2007/2012 show that the modified network achieves better performance, and the new loss function is reported to be more flexible and effective.

Abstract

Object detection has seen tremendous recent progress, but detecting and identifying objects both accurately and quickly remains a very challenging task. Humans can detect and recognize multiple objects in images or videos with ease, regardless of an object's appearance, whereas computers still find it challenging to identify and distinguish between objects. This paper proposes a modified YOLOv1-based neural network for object detection. The new model improves on YOLOv1 in three ways. First, the loss function of the YOLOv1 network is modified: the margin style is replaced with a proportion style. Compared to the old loss function, the new one is more flexible and more reasonable in optimizing the network error. Second, a spatial pyramid pooling layer is added. Third, an inception module with 1×1 convolution kernels is added, which reduces the number of weight parameters in the layers. Extensive experiments on the Pascal VOC 2007/2012 datasets show that the proposed method achieves better performance.
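
Two of the modifications named in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the paper's implementation: the pyramid levels (1, 2, 4), the max-pooling choice, the bin boundaries, and the channel counts (256 and 64) are assumptions chosen for illustration, and the function names `spatial_pyramid_pool` and `conv_params` are hypothetical. The first function shows why spatial pyramid pooling yields a fixed-length vector regardless of feature-map size; the second counts convolution weights to show how a 1×1 convolution placed before a larger kernel can reduce parameters.

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a C x H x W feature map (nested lists) into a fixed-length list.

    For each pyramid level n, every channel is split into an n x n grid of
    bins and the maximum of each bin is kept. The output length is
    C * sum(n * n for n in levels), independent of H and W.
    The levels and the use of max pooling are illustrative assumptions.
    """
    pooled = []
    for channel in feature_map:
        h, w = len(channel), len(channel[0])
        for n in levels:
            for i in range(n):
                # Row range of bin i: floor(i*h/n) .. ceil((i+1)*h/n)
                r0 = (i * h) // n
                r1 = ((i + 1) * h + n - 1) // n
                for j in range(n):
                    # Column range of bin j, computed the same way
                    c0 = (j * w) // n
                    c1 = ((j + 1) * w + n - 1) // n
                    pooled.append(
                        max(channel[r][c]
                            for r in range(r0, r1)
                            for c in range(c0, c1))
                    )
    return pooled


def conv_params(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution (bias terms ignored)."""
    return kernel * kernel * c_in * c_out


# Assumed channel counts, not taken from the paper.
direct = conv_params(3, 256, 256)                              # one 3x3 convolution
bottleneck = conv_params(1, 256, 64) + conv_params(3, 64, 256)  # 1x1 reduction, then 3x3
```

Under these assumed channel counts, the direct 3×3 convolution has 589,824 weights, while the 1×1 reduction followed by the 3×3 convolution has 163,840. With levels (1, 2, 4), `spatial_pyramid_pool` returns 21 values per channel whether the input feature map is 5×7 or 13×13.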
