Publication | Closed Access
Fast Feature Pyramids for Object Detection
Citations: 2K
References: 66
Year: 2014
Image Classification, General Object Detection, Machine Vision, Feature Detection, Image Analysis, Machine Learning, Pattern Recognition, Object Detection, Object Recognition, Data Science, Biometrics, Multi-resolution Image Features, Computer Science, Engineering, Deep Learning, Fast Feature Pyramids, Vision Recognition, Computer Vision
Multi-resolution image features can be approximated by extrapolation from nearby scales, which avoids computing features at every scale of a finely-sampled image pyramid, a bottleneck in many modern detectors. This insight enables object-detection algorithms that are as accurate as the state of the art and considerably faster. The authors compute feature pyramids at octave-spaced scales and extrapolate to the finely-sampled scales in between, an inexpensive alternative to direct computation. They integrate fast feature pyramids into three visual-recognition systems and report results on pedestrian and general-object detection datasets. The approximation yields considerable speedups with negligible loss in detection accuracy, and the approach applies broadly to vision algorithms that require fine-grained multi-scale analysis.
Multi-resolution image features may be approximated via extrapolation from nearby scales, rather than being computed explicitly. This fundamental insight allows us to design object detection algorithms that are as accurate as, and considerably faster than, the state-of-the-art. The computational bottleneck of many modern detectors is the computation of features at every scale of a finely-sampled image pyramid. Our key insight is that one may compute finely-sampled feature pyramids at a fraction of the cost, without sacrificing performance: for a broad family of features we find that features computed at octave-spaced scale intervals are sufficient to approximate features on a finely-sampled pyramid. Extrapolation is inexpensive compared to direct feature computation. As a result, our approximation yields considerable speedups with negligible loss in detection accuracy. We modify three diverse visual recognition systems to use fast feature pyramids and show results on both pedestrian detection (measured on the Caltech, INRIA, TUD-Brussels and ETH data sets) and general object detection (measured on the PASCAL VOC). The approach is general and is widely applicable to vision algorithms requiring fine-grained multi-scale analysis. Our approximation is valid for images with broad spectra (most natural images) and fails for images with narrow band-pass spectra (e.g., periodic textures).
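The octave-plus-extrapolation idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the extrapolation rule, so the sketch assumes a power-law relation between scales: a channel at an intermediate scale is taken to be the nearest octave's channel, resampled and multiplied by `ratio ** -lam`. The function names (`resize`, `grad_mag`, `fast_pyramid`), the gradient-magnitude feature, the nearest-neighbour resampler, and the exponent `lam=0.1` are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def resize(img, scale):
    """Nearest-neighbour resample of a 2-D array (simple stand-in for a real resampler)."""
    h, w = img.shape
    nh = max(1, int(round(h * scale)))
    nw = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(nh) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(nw) / scale).astype(int), w - 1)
    return img[np.ix_(rows, cols)]


def grad_mag(img):
    """Gradient magnitude, used here as an example feature channel."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)


def fast_pyramid(img, n_octaves=3, per_octave=4, lam=0.1):
    """Return {scale: channel}; features are computed directly only at octave scales.

    `lam` is a hypothetical power-law exponent; in practice it would have to be
    estimated empirically for each feature type.
    """
    pyramid = {}
    for o in range(n_octaves):
        s_oct = 2.0 ** -o
        real = grad_mag(resize(img, s_oct))  # expensive step: direct computation
        pyramid[s_oct] = real
        for k in range(1, per_octave):
            s = 2.0 ** -(o + k / per_octave)
            ratio = s / s_oct
            # cheap step: resample the octave channel and apply the assumed power law
            pyramid[s] = resize(real, ratio) * ratio ** -lam
    return pyramid
```

With the defaults, a 64x64 image yields 12 pyramid levels, of which only 3 require direct feature computation. For simplicity the sketch always extrapolates downward from the octave above; a fuller version might instead approximate each intermediate scale from whichever octave is nearest.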