Publication | Closed Access
RDMA over Commodity Ethernet at Scale
Citations: 433
References: 31
Year: 2016
Unknown Venue
RDMA Transport Livelock, Engineering, High Performance Computer Network, Edge Computing, Network Traffic Control, Cloud Computing, Computer Engineering, Computer Architecture, Systems Engineering, Commodity Ethernet, Network Management, PFC-induced Deadlock, Data Center Network, Sure RDMA, Advanced Networking, Network Interface Architecture
Over the past one and a half years, Microsoft has used RDMA over commodity Ethernet (RoCEv2) to support highly reliable, latency‑sensitive services. This paper describes the challenges encountered and the solutions devised to address them. We designed a DSCP‑based priority flow‑control mechanism to scale RoCEv2 beyond VLAN and built monitoring and management systems to ensure reliable operation. We resolved PFC‑induced deadlock, RDMA transport livelock, and NIC pause‑frame storms, showing that RoCEv2 can replace TCP for intra‑data‑center traffic with low latency, low CPU overhead, and high throughput.
Over the past one and a half years, we have been using RDMA over commodity Ethernet (RoCEv2) to support some of Microsoft's highly reliable, latency-sensitive services. This paper describes the challenges we encountered during the process and the solutions we devised to address them. To scale RoCEv2 beyond VLAN, we have designed a DSCP-based priority flow-control (PFC) mechanism that enables large-scale deployment. We have addressed the safety challenges brought by PFC-induced deadlock (yes, it happened!), RDMA transport livelock, and the NIC PFC pause frame storm problem. We have also built the monitoring and management systems to make sure RDMA works as expected. Our experiences show that the safety and scalability issues of running RoCEv2 at scale can all be addressed, and that RDMA can replace TCP for intra-data-center communications and achieve low latency, low CPU overhead, and high throughput.
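The idea behind DSCP-based PFC can be illustrated with a small sketch. In VLAN-based PFC the packet priority travels in the layer-2 VLAN tag, whereas the DSCP field sits in the IP header and therefore survives layer-3 routing. The Python below is only an illustration under stated assumptions, not the paper's implementation: the `DSCP_TO_PRIORITY` table, the default class, and all function names are hypothetical. The pause-frame layout follows the IEEE 802.1Qbb MAC-control format (opcode `0x0101`, a class-enable vector, and eight per-class timers).

```python
import struct

# Hypothetical mapping for illustration only: DSCP value -> PFC priority class (0-7).
# A real deployment would configure this table on every switch and NIC.
DSCP_TO_PRIORITY = {3: 3, 4: 4}
DEFAULT_PRIORITY = 0  # assumed fallback class for unmapped DSCP values


def dscp_from_tos(tos: int) -> int:
    """Return the DSCP value: the upper six bits of the IPv4 ToS byte."""
    return (tos >> 2) & 0x3F


def priority_for_packet(tos: int) -> int:
    """Pick the PFC priority class from the IP header instead of a VLAN tag."""
    return DSCP_TO_PRIORITY.get(dscp_from_tos(tos), DEFAULT_PRIORITY)


def build_pfc_pause(priority: int, quanta: int) -> bytes:
    """Build the MAC-control payload of a PFC frame that pauses one class.

    Layout: 2-byte opcode (0x0101), 2-byte class-enable vector,
    then eight 2-byte pause timers, one per priority class.
    """
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in 0..7")
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("quanta must fit in 16 bits")
    timers = [0] * 8
    timers[priority] = quanta
    return struct.pack("!HH8H", 0x0101, 1 << priority, *timers)
```

For example, `priority_for_packet(0x0C)` reads DSCP 3 from the ToS byte and, under the assumed table, selects class 3; `build_pfc_pause(3, 0xFFFF)` then yields a 20-byte payload that pauses only that class.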