Unsupervised Homography Estimation with Coplanarity-Aware GAN

Abstract

Estimating a homography from an image pair is a fundamental problem in image alignment. Unsupervised learning methods have received increasing attention in this field due to their promising performance and label-free training. However, existing methods do not explicitly account for plane-induced parallax, so the predicted homography becomes a compromise across multiple planes. In this work, we propose HomoGAN, a novel method that guides unsupervised homography estimation to focus on the dominant plane. First, we design a multi-scale transformer network that predicts the homography from the feature pyramids of the input images in a coarse-to-fine fashion. Second, we propose an unsupervised GAN that imposes a coplanarity constraint on the predicted homography: a generator predicts a mask of aligned regions, and a discriminator checks whether the two masked feature maps are induced by a single homography. To validate the effectiveness of HomoGAN and its components, we conduct extensive experiments on a large-scale dataset, and the results show that our matching error is 22% lower than that of the previous state-of-the-art method. Code is available at https://github.com/megvii-research/HomoGAN
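
To make the parallax problem concrete, the following is a minimal NumPy sketch, not the HomoGAN implementation. It assumes a toy scene in which two planes are stood in for by two different, arbitrarily chosen homographies (H_a for the dominant plane, H_b for a secondary one), fits a single homography with a plain DLT least-squares solve, and compares a fit over all correspondences with a fit restricted by a mask. The mask here is set by hand for illustration; in the method described above it would be predicted by the generator. All function and variable names are hypothetical.

```python
import numpy as np


def apply_homography(H, pts):
    """Map an (N, 2) array of points through a 3x3 homography."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


def fit_homography_dlt(src, dst):
    """Least-squares DLT fit of one homography from >= 4 correspondences."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]


def mean_error(H, src, dst):
    """Mean Euclidean distance between warped source points and targets."""
    return float(np.mean(np.linalg.norm(apply_homography(H, src) - dst, axis=1)))


# Toy data (assumed values): plane A is dominant, plane B moves differently.
rng = np.random.default_rng(0)
H_a = np.array([[1.00, 0.02, 0.05],
                [-0.01, 1.00, 0.03],
                [0.01, 0.02, 1.00]])
H_b = np.array([[1.0, 0.0, 0.20],
                [0.0, 1.0, -0.10],
                [0.0, 0.0, 1.0]])
src_a = rng.uniform(0.0, 1.0, size=(30, 2))
src_b = rng.uniform(0.0, 1.0, size=(10, 2))
dst_a = apply_homography(H_a, src_a)
dst_b = apply_homography(H_b, src_b)

src = np.vstack([src_a, src_b])
dst = np.vstack([dst_a, dst_b])
# Hand-set mask that keeps only the dominant-plane correspondences.
mask = np.arange(len(src)) < len(src_a)

H_all = fit_homography_dlt(src, dst)                  # fit across both planes
H_masked = fit_homography_dlt(src[mask], dst[mask])   # fit on the dominant plane

err_all_on_a = mean_error(H_all, src_a, dst_a)
err_masked_on_a = mean_error(H_masked, src_a, dst_a)
err_masked_on_b = mean_error(H_masked, src_b, dst_b)
print("all points, error on plane A:   ", err_all_on_a)
print("masked fit, error on plane A:   ", err_masked_on_a)
print("masked fit, error on plane B:   ", err_masked_on_b)
```

In this toy setup the fit over all points is pulled away from plane A by the plane B correspondences, while the masked fit recovers H_a and leaves a residual only on plane B, which is the behavior a coplanarity constraint is meant to encourage.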
