Abstract

Remote sensing spatiotemporal prediction, which aims to infer future trends from historical spatiotemporal data such as videos and time-series images, has broad application prospects in many fields. Foundation models are a promising research direction for spatiotemporal information mining because of their robust feature extraction capability, and they have made rapid progress on natural scenes. Nevertheless, because remote sensing data are multi-scale in both space and time, these methods still encounter bottlenecks when applied to remote sensing. Therefore, we propose RingMo-Sense, a foundation model for remote sensing spatiotemporal prediction via spatiotemporal evolution decoupling. Considering spatial affinity, temporal continuity, and spatiotemporal interaction, we construct triple-branch prediction networks comprising spatial, temporal, and spatiotemporal branches. Specifically, we use parameter-sharing and progressive joint training strategies to achieve stable long-range prediction and parameter reduction simultaneously. In addition, we build a remote sensing spatiotemporal dataset by collecting various remote sensing videos and time-series images. Experimental results on six downstream spatiotemporal tasks demonstrate that the proposed model yields competitive performance.
