Scalable Person Re-identification: A Benchmark

TLDR

Existing person re-identification datasets are limited in scale, rely on hand-drawn bounding boxes, and provide only one ground-truth and one query image per identity, which makes them unrealistic for large-scale settings. This work introduces the Market-1501 dataset to address these limitations and, treating re-identification as a special case of image search, proposes an unsupervised Bag-of-Words (BoW) descriptor. Market-1501 contains over 32,000 annotated bounding boxes plus a distractor set of over 500K images. Its bounding boxes are produced by a Deformable Part Model (DPM) pedestrian detector rather than drawn by hand, and it is collected in an open system in which each identity has multiple images under each camera. The authors describe it as the largest person re-identification dataset to date, and their experiments show that the proposed descriptor achieves competitive accuracy on VIPeR, CUHK03, and Market-1501 and scales to the 500K distractor set.

Abstract

This paper contributes a new high-quality dataset for person re-identification, named "Market-1501". Generally, current datasets: 1) are limited in scale, 2) consist of hand-drawn bboxes, which are unavailable under realistic settings, and 3) have only one ground truth and one query image for each identity (closed environment). To tackle these problems, the proposed Market-1501 dataset has three distinguishing features. First, it contains over 32,000 annotated bboxes, plus a distractor set of over 500K images, making it the largest person re-id dataset to date. Second, images in the Market-1501 dataset are produced using the Deformable Part Model (DPM) as the pedestrian detector. Third, our dataset is collected in an open system, where each identity has multiple images under each camera. As a minor contribution, inspired by recent advances in large-scale image search, this paper proposes an unsupervised Bag-of-Words descriptor. We view person re-identification as a special task of image search. In experiments, we show that the proposed descriptor yields competitive accuracy on the VIPeR, CUHK03, and Market-1501 datasets, and is scalable on the large-scale 500K dataset.
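
To make the "re-identification as image search" view concrete, the following is a minimal sketch of a generic Bag-of-Words retrieval pipeline, not the descriptor proposed in the paper. It assumes each person image has already been reduced to a set of local feature vectors. The function names (build_codebook, bow_histogram, rank_gallery), the plain k-means codebook, and the cosine-similarity ranking are all illustrative assumptions, since the abstract does not specify any of these details.

```python
import numpy as np


def build_codebook(features, k, iters=10, seed=0):
    """Learn k visual words with plain k-means (Lloyd's algorithm).

    Unsupervised: only unlabeled local features are used.
    Illustrative assumption, not the paper's codebook procedure.
    """
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct training features.
    centers = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every feature to every center.
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def bow_histogram(local_features, codebook):
    """Quantize local features to visual words; return an L2-normalized histogram."""
    local_features = np.asarray(local_features, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    dists = ((local_features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist


def rank_gallery(query_hist, gallery_hists):
    """Return gallery indices sorted by decreasing cosine similarity to the query."""
    scores = np.asarray(gallery_hists) @ np.asarray(query_hist)
    return np.argsort(-scores, kind="stable")
```

In this sketch a query image and every gallery image are each passed through bow_histogram with a shared codebook, and rank_gallery then orders the gallery for that query. Because ranking reduces to dot products between fixed-length vectors, the same code applies unchanged when distractor images are appended to the gallery, which is the sense in which image-search techniques are attractive at scale.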
