Spatial-Temporal Transformer Networks for Traffic Flow Forecasting

TLDR

Traffic forecasting is essential for intelligent transportation systems, yet accurate long‑term prediction remains difficult because of highly nonlinear, dynamic spatial‑temporal dependencies. This work introduces Spatial‑Temporal Transformer Networks (STTNs) that exploit directed spatial and long‑range temporal dependencies to enhance long‑term traffic forecasting accuracy. STTNs combine a spatial transformer GNN that models directed spatial relations with self‑attention and multi‑head attention for diverse patterns, and a temporal transformer that captures bidirectional long‑range temporal dependencies, forming a joint spatial‑temporal block. Experiments on PeMS‑Bay and PeMSD7(M) show that STTNs train quickly and scalably, achieving competitive performance, especially for long‑term traffic flow prediction.

Abstract

Traffic forecasting has emerged as a core component of intelligent transportation systems. However, timely accurate traffic forecasting, especially long-term forecasting, still remains an open challenge due to the highly nonlinear and dynamic spatial-temporal dependencies of traffic flows. In this paper, we propose a novel paradigm of Spatial-Temporal Transformer Networks (STTNs) that leverages dynamical directed spatial dependencies and long-range temporal dependencies to improve the accuracy of long-term traffic forecasting. Specifically, we present a new variant of graph neural networks, named spatial transformer, by dynamically modeling directed spatial dependencies with self-attention mechanism to capture realtime traffic conditions as well as the directionality of traffic flows. Furthermore, different spatial dependency patterns can be jointly modeled with multi-heads attention mechanism to consider diverse relationships related to different factors (e.g. Similarity, connectivity and covariance). On the other hand, the temporal transformer is utilized to model long-range bidirectional temporal dependencies across multiple time steps. Finally, they are composed as a block to jointly model the spatial-temporal dependencies for accurate traffic prediction. Compared to existing works, the proposed model enables fast and scalable training over a long range spatial-temporal dependencies. Experiment results demonstrate that the proposed model achieves competitive results compared with the state-of-the-arts, especially forecasting long-term traffic flows on real-world PeMS-Bay and PeMSD7(M) datasets.

References

Page 1

	Year	Citations

Page 1