Publication | Closed Access
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Citations: 284
References: 53
Year: 2016
Natural Language Processing, Multimodal LLM, Image Analysis, Engineering, Data Science, Text-to-image Retrieval, Vision Language Model, Flickr30k Entities, Deep Learning, Computer Vision, Machine Translation