Publication | Open Access
Bottom-Up Foreground-Aware Feature Fusion for Person Search
Citations: 11
References: 25
Year: 2020
Unknown Venue
Convolutional Neural Network, Engineering, Machine Learning, Human Pose Estimation, Biometrics, Foreground Regions, Image Analysis, Pattern Recognition, Foreground Attention Module, Video Transformer, Vision Recognition, Machine Vision, Feature Learning, Object Detection, Computer Science, Deep Learning, Feature Fusion, Computer Vision, Human Identification, Eye Tracking, Person Search
The key to efficient person search is jointly localizing pedestrians and learning discriminative representations for person re-identification (re-ID). Some recently developed task-joint models are built with separate detection and re-ID branches on top of a shared region feature extraction network, where the large receptive field of the neurons introduces redundant background information into the subsequent re-ID task. Our diagnostic analysis indicates that the task-joint model suffers a considerable performance drop when the background is replaced or removed. In this work, we propose a subnet, termed the bottom-up fusion (BUF) network, that fuses the bounding box features pooled from multiple ConvNet stages in a bottom-up manner. While introducing only a few parameters, BUF leverages multi-level features with different receptive field sizes to mitigate the background-bias problem. Moreover, a newly introduced segmentation head generates a foreground probability map that guides the network to focus on the foreground regions. The resulting foreground attention module (FAM) enhances the foreground features. Extensive experiments on PRW and CUHK-SYSU validate the effectiveness of the proposals. Our Bottom-Up Foreground-Aware Feature Fusion (BUFF) network achieves considerable gains over state-of-the-art methods on PRW and competitive performance on CUHK-SYSU.
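The two ideas in the abstract, reweighting features by a foreground probability map and fusing box features from shallow to deep stages, can be illustrated with a minimal pure-Python sketch. This is not the authors' architecture: the function names are hypothetical, the stage features are assumed to be already pooled to the same shape, attention is assumed to be applied at every stage, and fusion is assumed to be plain summation with no learned parameters, whereas the BUF subnet described above does introduce parameters.

```python
from typing import List

Map2D = List[List[float]]   # H x W grid of values
Feature = List[Map2D]       # C channels, each H x W


def foreground_attention(feature: Feature, fg_prob: Map2D) -> Feature:
    """Scale every channel elementwise by the foreground probability map."""
    return [
        [[v * p for v, p in zip(row, prob_row)]
         for row, prob_row in zip(channel, fg_prob)]
        for channel in feature
    ]


def add_features(a: Feature, b: Feature) -> Feature:
    """Elementwise sum of two features with identical shape."""
    return [
        [[x + y for x, y in zip(row_a, row_b)]
         for row_a, row_b in zip(chan_a, chan_b)]
        for chan_a, chan_b in zip(a, b)
    ]


def bottom_up_fuse(stage_features: List[Feature], fg_prob: Map2D) -> Feature:
    """Fuse per-stage box features from the shallowest stage to the deepest.

    Assumption: summation stands in for the learned fusion in the paper.
    """
    fused = foreground_attention(stage_features[0], fg_prob)
    for feat in stage_features[1:]:
        fused = add_features(fused, foreground_attention(feat, fg_prob))
    return fused


# Toy input: one channel, 2 x 2 spatial grid, two stages (all values made up).
shallow = [[[1.0, 2.0], [3.0, 4.0]]]
deep = [[[10.0, 20.0], [30.0, 40.0]]]
fg_prob = [[1.0, 0.0], [0.5, 1.0]]  # 0.0 marks a pure-background cell

fused = bottom_up_fuse([shallow, deep], fg_prob)
```

In this toy case the cell with foreground probability 0.0 contributes nothing to the fused feature, which is the intuition behind using the probability map to suppress background in the re-ID representation.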