CENet: A Channel-Enhanced Spatiotemporal Network With Sufficient Supervision Information for Recognizing Industrial Smoke Emissions

Abstract

Vision-based industrial smoke emission recognition technology can identify smoke emissions and provide visual evidence for people pursuing environmental justice. However, existing methods still suffer from low detection rates (DRs) and high false alarm rates (FARs) due to insufficient supervision information and limited feature representation capability. To address these issues, this article presents a channel-enhanced spatiotemporal network (CENet) with sufficient supervision information for recognizing industrial smoke emissions. First, to provide sufficient supervision information for learning discriminative feature representations, we propose a new loss function that jointly uses the binary category, pixel-level smoke density, and background information as supervision signals, and we apply it at the final layer of the network as well as at its intermediate layers to guide model training. Second, to address the deficiencies of max/average pooling and convolution operations in feature extraction, we propose channel-enhanced modules, namely channel-enhanced pooling (CEPool), channel-enhanced convolution (CEConv), and channel-enhanced upsampling (CEUpsample), which learn the high-response values of bright features as well as the low-response values that carry discriminative features. The channel-enhanced modules selectively enhance learned features that carry a large amount of information and suppress uninformative ones. Third, we propose a two-stage method based on a spatiotemporal information extraction module (SIEM) and a smoke recognition module (SRM), which are designed to learn the spatiotemporal information between input frames and between smoke density frames, respectively. Extensive experiments show that CENet achieves the best performance among existing smoke recognition methods.
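
As an illustration of the first contribution, the sketch below combines a binary category term, a pixel-level smoke density term, and a background term, and sums them over the final layer and the intermediate layers. The abstract does not give the actual formulation, so everything here is an assumption made for illustration: the choice of binary cross-entropy and mean squared error, the function names, the per-layer weights, and the simplification that every supervised layer predicts maps of the same size as the targets.

```python
import math

def binary_cross_entropy(p, y, eps=1e-7):
    """Cross-entropy for one predicted smoke probability p and a 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def mean_squared_error(pred, target):
    """Mean squared error between two equally sized flat lists of pixel values."""
    return sum((a - b) ** 2 for a, b in zip(pred, target)) / len(pred)

def supervision_loss(outputs, label, density, background,
                     layer_weights, w_density=1.0, w_background=1.0):
    """Hypothetical combined loss (not the paper's formula).

    outputs: one (class_prob, density_pred, background_pred) tuple per
             supervised layer; by convention here the last one is the final layer.
    label: 0/1 smoke category; density, background: flat target maps.
    layer_weights: assumed scalar weight for each supervised layer.
    """
    total = 0.0
    for (prob, dens, bg), w in zip(outputs, layer_weights):
        term = (binary_cross_entropy(prob, label)
                + w_density * mean_squared_error(dens, density)
                + w_background * mean_squared_error(bg, background))
        total += w * term
    return total

# Toy usage: one intermediate layer and the final layer, 4-pixel maps.
density_gt = [0.0, 0.5, 1.0, 0.0]
background_gt = [1.0, 0.5, 0.0, 1.0]
outputs = [
    (0.7, [0.1, 0.4, 0.8, 0.1], [0.9, 0.6, 0.2, 0.9]),  # intermediate layer
    (0.9, [0.0, 0.5, 0.9, 0.0], [1.0, 0.5, 0.1, 1.0]),  # final layer
]
loss = supervision_loss(outputs, 1, density_gt, background_gt, [0.5, 1.0])
```

Weighting the intermediate layer below the final layer is one common way to apply deep supervision; the weights used in CENet itself are not stated in the abstract.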
