Virtual Worlds as Proxy for Multi-object Tracking Analysis

TLDR

Modern computer vision algorithms require expensive data acquisition and manual labeling. The study instead generates fully labeled, dynamic, photo-realistic proxy virtual worlds and releases the Virtual KITTI dataset to support computer vision research. The authors develop an efficient real-to-virtual cloning pipeline that produces the Virtual KITTI dataset with automatically generated ground truth for object detection, tracking, scene and instance segmentation, depth, and optical flow. Experiments suggest that models pre-trained on real data behave similarly in real and virtual worlds and that pre-training on virtual data improves performance. Because the gap between real and virtual worlds is small, virtual worlds can be used to measure how weather and imaging conditions affect recognition performance, and these conditions can drastically affect otherwise high-performing tracking models.

Abstract

Modern computer vision algorithms typically require expensive data acquisition and accurate manual labeling. In this work, we instead leverage the recent progress in computer graphics to generate fully labeled, dynamic, and photo-realistic proxy virtual worlds. We propose an efficient real-to-virtual world cloning method, and validate our approach by building and publicly releasing a new video dataset, called "Virtual KITTI", automatically labeled with accurate ground truth for object detection, tracking, scene and instance segmentation, depth, and optical flow. We provide quantitative experimental evidence suggesting that (i) modern deep learning algorithms pre-trained on real data behave similarly in real and virtual worlds, and (ii) pre-training on virtual data improves performance. As the gap between real and virtual worlds is small, virtual worlds enable measuring the impact of various weather and imaging conditions on recognition performance, all other things being equal. We show that these factors may drastically affect otherwise high-performing deep models for tracking.
