SmartCC: A Reinforcement Learning Approach for Multipath TCP Congestion Control in Heterogeneous Networks

TLDR

Multipath TCP extends conventional TCP to allow devices to use multiple paths simultaneously, but its congestion control suffers from bufferbloat and suboptimal bandwidth usage in heterogeneous networks. This work introduces SmartCC, a learning‑based congestion control scheme designed to adapt to the diverse characteristics of multiple communication paths in heterogeneous networks. SmartCC employs an asynchronous reinforcement‑learning framework that learns congestion rules via hierarchical tile coding for state aggregation and Q‑learning function estimation, decoupling training from execution to avoid additional delay or overhead. Experimental results demonstrate that SmartCC significantly increases aggregate throughput and outperforms existing state‑of‑the‑art mechanisms across multiple performance metrics.

Abstract

The Multipath TCP (MPTCP) protocol has been standardized by the IETF as an extension of conventional TCP that enables multi-homed devices to establish multiple paths for simultaneous data transmission. Congestion control is a fundamental mechanism in the design and implementation of MPTCP. Because heterogeneous links have diverse QoS characteristics, existing multipath congestion control mechanisms suffer from performance problems such as bufferbloat and suboptimal bandwidth usage. In this paper, we propose a learning-based multipath congestion control approach called SmartCC to deal with the diversity of multiple communication paths in heterogeneous networks. SmartCC adopts an asynchronous reinforcement learning framework to learn a set of congestion rules, which allows the sender to observe the environment and adaptively adjust the subflows' congestion windows to fit different network situations. To deal with the problem of infinite states in a high-dimensional space, we propose a hierarchical tile coding algorithm for state aggregation and a function estimation approach for Q-learning, which together derive the optimal policy efficiently. Because SmartCC is asynchronous, model training and execution are decoupled, so the learning process introduces no extra delay or overhead in the decision-making process of MPTCP congestion control. We conduct extensive experiments for performance evaluation, which show that SmartCC improves aggregate throughput significantly and outperforms state-of-the-art mechanisms on a variety of performance metrics.
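To make the combination of state aggregation and Q-learning function estimation more concrete, the following is a minimal sketch, not the authors' implementation. It uses plain (non-hierarchical) tile coding with a linear Q-value estimate; the abstract does not specify the hierarchical variant, so it is not reproduced here. The class names, the two-dimensional state (RTT in ms, congestion window in packets), the three window-adjustment actions, the reward value, and all hyperparameters are assumptions chosen for illustration only.

```python
import random


class TileCoder:
    """Plain tile coding over a bounded continuous state (illustrative only)."""

    def __init__(self, lows, highs, tiles_per_dim=8, num_tilings=4):
        self.lows, self.highs = lows, highs
        self.tiles_per_dim = tiles_per_dim
        self.num_tilings = num_tilings
        # One extra tile per dimension leaves room for the tiling offset.
        self.tiles_per_tiling = (tiles_per_dim + 1) ** len(lows)
        self.num_features = num_tilings * self.tiles_per_tiling

    def active_tiles(self, state):
        """Return one active feature index per tiling for the given state."""
        indices = []
        for t in range(self.num_tilings):
            offset = t / self.num_tilings  # shift, as a fraction of a tile width
            index = 0
            for lo, hi, x in zip(self.lows, self.highs, state):
                x = min(max(x, lo), hi)  # clamp to the assumed bounds
                scaled = (x - lo) / (hi - lo) * self.tiles_per_dim + offset
                coord = min(int(scaled), self.tiles_per_dim)
                index = index * (self.tiles_per_dim + 1) + coord
            indices.append(t * self.tiles_per_tiling + index)
        return indices


class QAgent:
    """Q-learning with a linear estimate over tile-coded features."""

    def __init__(self, coder, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.coder = coder
        self.actions = actions
        self.alpha = alpha / coder.num_tilings  # step size shared across tilings
        self.gamma = gamma
        self.epsilon = epsilon
        self.weights = [[0.0] * coder.num_features for _ in actions]

    def q_value(self, state, a):
        return sum(self.weights[a][i] for i in self.coder.active_tiles(state))

    def choose(self, state, rng=random):
        """Epsilon-greedy choice; returns an index into self.actions."""
        if rng.random() < self.epsilon:
            return rng.randrange(len(self.actions))
        values = [self.q_value(state, a) for a in range(len(self.actions))]
        return values.index(max(values))

    def update(self, state, a, reward, next_state):
        """One Q-learning step; returns the temporal-difference error."""
        best_next = max(
            self.q_value(next_state, b) for b in range(len(self.actions))
        )
        td_error = reward + self.gamma * best_next - self.q_value(state, a)
        for i in self.coder.active_tiles(state):
            self.weights[a][i] += self.alpha * td_error
        return td_error


# Hypothetical state: (RTT in ms, congestion window in packets).
coder = TileCoder(lows=[0.0, 1.0], highs=[500.0, 100.0])
# Hypothetical actions: change the subflow window by -1, 0, or +1 packet.
agent = QAgent(coder, actions=[-1, 0, 1])

state, next_state = (40.0, 10.0), (42.0, 11.0)
action = agent.choose(state)
agent.update(state, action, reward=1.0, next_state=next_state)
```

In a decoupled design of the kind the abstract describes, calls like `update` would run in a separate training process, while the sender would only evaluate the already-learned weights when adjusting a window. How SmartCC actually partitions this work is not stated in the abstract.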
