
Abstract

In remote sensing scene classification (RSSC), features can be extracted at different spatial frequencies: high-frequency features usually represent detailed information, while low-frequency features usually represent global structures. However, it is difficult to extract meaningful semantic information for RSSC tasks from high- or low-frequency features alone. The spatial composition of remote sensing images (RSIs) is more complex than that of natural images, and object scales vary significantly. In this article, a multiscale feature fusion covariance network (MF²CNet) with octave convolution (Oct Conv) is proposed to extract multifrequency and multiscale features from RSIs. First, the multifrequency feature extraction (MFE) module uses Oct Conv to obtain fine-grained frequency features. Then, the multiscale feature fusion (MF²) module fuses the features from different layers of MF²CNet. Finally, global covariance pooling (GCP) replaces global average pooling (GAP) to extract high-order information from RSIs and capture richer statistics of the deep features. The resulting multifrequency and multiscale features effectively improve the performance of CNNs. Experimental results on four public RSI datasets show that MF²CNet has advantages in RSSC over current state-of-the-art methods. The source code for this method is available at https://github.com/liuqingxin-chd/MF2CNet.
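To make the contrast between GAP and GCP concrete, the following is a minimal NumPy sketch of the two pooling operations on a single feature map. It is an illustration under stated assumptions, not the MF²CNet implementation: the function names, the (C, H, W) layout, and the random input are hypothetical, and the sketch computes a plain channel covariance matrix without any normalization steps that a full GCP layer may apply.

```python
import numpy as np


def global_average_pooling(features):
    """First-order pooling: mean over spatial positions, (C, H, W) -> (C,)."""
    c = features.shape[0]
    return features.reshape(c, -1).mean(axis=1)


def global_covariance_pooling(features):
    """Second-order pooling: channel covariance, (C, H, W) -> (C, C).

    Assumption: plain covariance normalized by the number of spatial
    positions; no matrix normalization is applied afterwards.
    """
    c = features.shape[0]
    x = features.reshape(c, -1)                    # (C, N) with N = H * W
    n = x.shape[1]
    centered = x - x.mean(axis=1, keepdims=True)   # remove per-channel mean
    return centered @ centered.T / n


# Hypothetical deep feature map: 8 channels on a 4 x 4 spatial grid.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 4, 4))

gap = global_average_pooling(feature_map)      # one value per channel
gcp = global_covariance_pooling(feature_map)   # pairwise channel statistics

print(gap.shape, gcp.shape)
```

GAP keeps one mean per channel, whereas the covariance matrix also records how every pair of channels varies together across spatial positions, which is the sense in which GCP captures higher-order statistics.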
