Publication | Closed Access
A Lightweight Multi-Scale Crossmodal Text-Image Retrieval Method in Remote Sensing
Citations: 106
References: 35
Year: 2021
Engineering, Machine Learning, Image Retrieval, Biometrics, Multimodal Learning, Image Search, Semantic Localization, Image Analysis, Information Retrieval, Data Science, Text-to-image Retrieval, Pattern Recognition, Machine Vision, Geography, Computer Science, Deep Learning, Computer Vision, RS Image, Remote Sensing, Content-based Image Retrieval, Multimedia Search
Remote sensing (RS) crossmodal text-image retrieval has become a research hotspot in recent years because of its application in semantic localization. However, since semantic localization demands multiple inferences over image slices, designing a crossmodal retrieval model that requires less computation yet still performs well has become a pressing and challenging task. In this article, considering the multi-scale and target-redundancy characteristics of RS imagery, a concise but effective crossmodal retrieval model (LW-MCR) is designed. The proposed model incorporates multi-scale information and dynamically filters out redundant features when encoding RS images, while text features are obtained via lightweight group convolution. To improve the retrieval performance of LW-MCR, we propose a novel hidden supervised optimization method based on knowledge distillation. This method enables the proposed model to acquire dark knowledge from the multi-level layers and representation layers of the teacher network, which significantly improves the accuracy of our lightweight model. Finally, on the basis of contrastive learning, we present a method that employs unlabeled data to further boost the performance of the RS retrieval model. Experimental results on four RS image-text datasets demonstrate the efficiency of LW-MCR in RS crossmodal retrieval (RSCR) tasks. We have released some of the code for semantic localization and made it openly accessible at https://github.com/xiaoyuan1996/retrievalSystem.
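The abstract does not spell out the distillation objective, so the following is only a minimal sketch of how a lightweight student might be pulled toward a teacher at the representation layer. It is not the authors' formulation. The function names, the temperature value, and the choice of a squared-distance term plus a KL term over softened image-to-text similarities are all assumptions made for illustration.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to (approximately) unit length.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def softmax(z, axis=1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def representation_distill_loss(student_emb, teacher_emb):
    # Hypothetical representation-level term: mean squared distance
    # between L2-normalized student and teacher embeddings.
    s = l2_normalize(student_emb)
    t = l2_normalize(teacher_emb)
    return float(np.mean(np.sum((s - t) ** 2, axis=1)))

def similarity_distill_loss(s_img, s_txt, t_img, t_txt, temperature=0.1):
    # Hypothetical "dark knowledge" term: KL(teacher || student) over
    # temperature-softened image-to-text similarity rows.
    p_t = softmax(l2_normalize(t_img) @ l2_normalize(t_txt).T / temperature)
    p_s = softmax(l2_normalize(s_img) @ l2_normalize(s_txt).T / temperature)
    kl = p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))
    return float(np.mean(np.sum(kl, axis=1)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t_img, t_txt = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
    s_img, s_txt = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
    print(representation_distill_loss(s_img, t_img))
    print(similarity_distill_loss(s_img, s_txt, t_img, t_txt))
```

Both terms vanish when the student reproduces the teacher's embeddings exactly. In practice, terms like these would typically be added to the student's ordinary retrieval loss with weighting factors that are not specified here.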