EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras

TLDR

Event-based cameras excel in high-speed and high-dynamic-range scenarios where frame-based cameras struggle, but processing their output has so far required new hand-crafted algorithms, and the scarcity of labeled event data hampers supervised deep learning. This work introduces EV-FlowNet, a self-supervised deep-learning pipeline that estimates optical flow directly from event streams. The approach encodes events into an image-based representation that is the network's sole input, while grayscale images captured by the same camera at the same time serve as the supervisory signal for the training loss. Experiments show that the network predicts accurate dense optical flow from events alone across a variety of scenes, with performance competitive with image-based networks, and that the approach offers a framework for transferring other self-supervised methods to the event-based domain.

Abstract

Event-based cameras have shown great promise in a variety of situations where frame-based cameras suffer, such as high-speed motions and high-dynamic-range scenes. However, developing algorithms for event measurements requires a new class of hand-crafted algorithms. Deep learning has shown great success in providing model-free solutions to many problems in the vision community, but existing networks have been developed with frame-based images in mind, and events lack the wealth of labeled data that images have for supervised training. To address these points, we present EV-FlowNet, a novel self-supervised deep learning pipeline for optical flow estimation for event-based cameras. In particular, we introduce an image-based representation of a given event stream, which is fed into a self-supervised neural network as the sole input. The corresponding grayscale images, captured from the same camera at the same time as the events, are then used as a supervisory signal to provide a loss function at training time, given the estimated flow from the network. We show that the resulting network is able to accurately predict optical flow from events only in a variety of different scenes, with performance competitive with image-based networks. This method not only allows for accurate estimation of dense optical flow, but also provides a framework for the transfer of other self-supervised methods to the event-based domain.
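The two ideas in the abstract, an image-based encoding of the event stream and a loss computed from the accompanying grayscale frames, can be illustrated with a small NumPy sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the abstract does not specify the encoding, so the four-channel layout (per-pixel event counts and latest normalised timestamps, per polarity), the function names `events_to_image`, `bilinear_sample`, and `photometric_loss`, and the Charbonnier parameters `eps` and `alpha` are all hypothetical choices. A smoothness regulariser, which is commonly added to losses of this kind, is omitted.

```python
import numpy as np


def events_to_image(x, y, t, p, height, width):
    """Encode events as a 4-channel image (assumed layout).

    Channels: [positive count, negative count,
               latest positive timestamp, latest negative timestamp],
    with timestamps normalised to [0, 1] over the event window.
    """
    x = np.asarray(x, dtype=np.int64)
    y = np.asarray(y, dtype=np.int64)
    t = np.asarray(t, dtype=np.float64)
    p = np.asarray(p)
    img = np.zeros((4, height, width), dtype=np.float64)
    if t.size == 0:
        return img
    span = t.max() - t.min()
    t_norm = (t - t.min()) / span if span > 0 else np.zeros_like(t)
    for channel, mask in enumerate((p > 0, p <= 0)):
        # np.add.at accumulates correctly when a pixel index repeats.
        np.add.at(img[channel], (y[mask], x[mask]), 1.0)
        # np.maximum.at keeps the most recent timestamp at each pixel.
        np.maximum.at(img[2 + channel], (y[mask], x[mask]), t_norm[mask])
    return img


def bilinear_sample(img, xs, ys):
    """Sample a (H, W) image at float coordinates, clamping to the border."""
    h, w = img.shape
    xs = np.clip(xs, 0, w - 1)
    ys = np.clip(ys, 0, h - 1)
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = xs - x0
    wy = ys - y0
    top = (1 - wx) * img[y0, x0] + wx * img[y0, x1]
    bottom = (1 - wx) * img[y1, x0] + wx * img[y1, x1]
    return (1 - wy) * top + wy * bottom


def photometric_loss(img0, img1, flow, eps=1e-3, alpha=0.45):
    """Charbonnier photometric loss between img0 and img1 warped by flow.

    flow has shape (2, H, W): flow[0] is the x displacement and flow[1]
    the y displacement, both in pixels, from img0 to img1.
    """
    h, w = img0.shape
    gy, gx = np.mgrid[0:h, 0:w].astype(np.float64)
    # Warp the second frame back onto the first using the predicted flow.
    warped = bilinear_sample(img1, gx + flow[0], gy + flow[1])
    diff = warped - img0
    return float(np.mean((diff ** 2 + eps ** 2) ** alpha))
```

In a training loop, the network would receive only the output of `events_to_image`, and its predicted flow would be scored by `photometric_loss` against the two grayscale frames that bracket the event window. No ground-truth flow is needed, which is what makes the scheme self-supervised.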
