Publication | Closed Access
Deep correlation for matching images and text
Citations: 426
References: 51
Year: 2015
Unknown Venue
Keywords: Engineering, Machine Learning, Natural Language Processing, Multimodal LLM, Image Analysis, Text-to-image Retrieval, Data Science, Pattern Recognition, Text Recognition, Visual Question Answering, Machine Vision, Feature Learning, DCCA Framework, Vision Language Model, Computer Science, Image Similarity, Deep Learning, DCCA Approach, Computer Vision, Deep Neural Networks, Deep Correlation
This paper addresses the problem of matching images and captions in a joint latent space learnt with deep canonical correlation analysis (DCCA). The image and caption data are represented by the outputs of vision-based and text-based deep neural networks. The high dimensionality of these features poses a major challenge in terms of memory and speed when they are used in the DCCA framework. We address these problems with a GPU implementation and propose methods to deal with overfitting. This makes it possible to evaluate the DCCA approach on popular caption-image matching benchmarks. We compare our approach to other recently proposed techniques and present state-of-the-art results on three datasets.
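To make the correlation objective behind the joint latent space more concrete, the following is a minimal NumPy sketch of the total canonical correlation between two views, the quantity that DCCA is generally described as maximizing over the outputs of its two networks. It is not the paper's GPU implementation. The function names `inv_sqrt` and `total_correlation`, the ridge term `reg`, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def total_correlation(x, y, reg=1e-4):
    """Sum of canonical correlations between views x (n, dx) and y (n, dy).

    `reg` is a hypothetical ridge term added to each covariance matrix to
    keep it invertible; it is an illustrative choice, not a value taken
    from the paper.
    """
    n = x.shape[0]
    xc = x - x.mean(axis=0)  # centre each view
    yc = y - y.mean(axis=0)
    sxx = xc.T @ xc / (n - 1) + reg * np.eye(x.shape[1])
    syy = yc.T @ yc / (n - 1) + reg * np.eye(y.shape[1])
    sxy = xc.T @ yc / (n - 1)
    # The singular values of T are the canonical correlations.
    t = inv_sqrt(sxx) @ sxy @ inv_sqrt(syy)
    return np.linalg.svd(t, compute_uv=False).sum()


# Toy usage: a linearly related pair of views versus an unrelated pair.
rng = np.random.default_rng(0)
image_feats = rng.normal(size=(2000, 4))    # stand-in for image network outputs
caption_feats = 2.0 * image_feats + 1.0     # perfectly correlated stand-in view
unrelated_feats = rng.normal(size=(2000, 4))

related_score = total_correlation(image_feats, caption_feats)
unrelated_score = total_correlation(image_feats, unrelated_feats)
```

In this toy setting the related pair scores close to the view dimensionality (4) and the unrelated pair scores close to zero. With high-dimensional network outputs, the covariance matrices grow quadratically with the feature dimension, which is consistent with the memory and speed concerns the abstract raises.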