GPUvm: GPU Virtualization at the Hypervisor

TLDR

GPUs provide massively parallel compute and high throughput, yet the tradeoffs of hypervisor‑level virtualization for discrete GPUs remain unclear due to limited designs and quantitative evaluations. The study aims to clarify the tradeoffs and technical requirements of hypervisor‑level GPU virtualization to guide solution development. GPUvm is an open hypervisor‑level GPU virtualization architecture built on Xen that offers full, naive para‑, and high‑performance para‑virtualization modes and exposes low‑ and high‑level interfaces such as memory‑mapped I/O and DRM APIs to guest VMs. Experiments on a commodity GPU revealed that overhead varies with the exposed interface level, and that coarse‑grained GPU fairness among multiple VMs can be achieved through scheduling.

Abstract

Graphics processing units (GPUs) provide massively parallel computational power and encourage the use of general-purpose computing on GPUs (GPGPU). The distinctive design of discrete GPUs helps them provide the high throughput, scalability, and energy efficiency needed for GPGPU applications. Despite previous studies on GPU virtualization, the tradeoffs between the virtualization approaches remain unclear because of a lack of designs for, and quantitative evaluations of, hypervisor-level virtualization for discrete GPUs. Shedding light on these tradeoffs and on the technical requirements for hypervisor-level virtualization would facilitate the development of an appropriate GPU virtualization solution. This paper presents GPUvm, an open architecture for hypervisor-level GPU virtualization with a particular emphasis on using the Xen hypervisor. GPUvm offers three virtualization modes: full virtualization, naive para-virtualization, and high-performance para-virtualization. GPUvm exposes low- and high-level interfaces, such as memory-mapped I/O and DRM APIs, to the guest virtual machines (VMs). Our experiments using a relevant commodity GPU showed that GPUvm incurs different overheads as the level of the exposed interfaces is changed. The results also showed that coarse-grained fairness on the GPU among multiple VMs can be achieved using GPU scheduling.
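
The fairness result can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the scheduler described in the paper: the `VM` class, the `schedule` function, and the "least accumulated GPU time first" policy are hypothetical choices made for illustration. It assumes GPU commands run to completion once dispatched (no preemption), which is why the resulting fairness is coarse-grained rather than exact.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class VM:
    """Hypothetical guest VM with a queue of pending GPU command durations."""
    name: str
    used: float = 0.0                           # accumulated GPU time so far
    queue: deque = field(default_factory=deque)  # pending command costs


def schedule(vms):
    """Dispatch commands one at a time, always from the ready VM that has
    consumed the least GPU time. Commands are never preempted, so a VM can
    overshoot its share by up to one command's duration."""
    order = []
    while True:
        ready = [vm for vm in vms if vm.queue]
        if not ready:
            return order
        vm = min(ready, key=lambda v: v.used)  # ties go to the earlier VM
        vm.used += vm.queue.popleft()          # run the command to completion
        order.append(vm.name)
```

With one VM submitting a few long commands and another submitting many short ones, the long-command VM briefly runs ahead after each dispatch, and the other VM catches up before it is scheduled again. Accumulated GPU time stays balanced only at the granularity of a whole command.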
