Publication | Open Access
Microsoft COCO Captions: Data Collection and Evaluation Server
Citations: 1.6K | References: 43 | Year: 2015
Keywords: Artificial Intelligence, Engineering, Machine Learning, Natural Language Processing, Candidate Captions, Multimodal LLM, Image Analysis, Text-to-image Retrieval, Data Science, Visual Grounding, Microsoft COCO Captions, Machine Translation, Vision Language Model, Computer Science, Validation Images, Deep Learning, Image Captioning, Computer Vision, Evaluation Server, Automatic Annotation
The paper introduces the Microsoft COCO Caption dataset and its evaluation server. When completed, the dataset will contain over 1.5 million captions describing over 330,000 images, with five independent human-generated captions for each training and validation image. The evaluation server scores submitted candidate captions using BLEU, METEOR, ROUGE, and CIDEr.
In this paper we describe the Microsoft COCO Caption dataset and evaluation server. When completed, the dataset will contain over one and a half million captions describing over 330,000 images. For the training and validation images, five independent human-generated captions will be provided. To ensure consistency in the evaluation of automatic caption generation algorithms, an evaluation server is used. The evaluation server receives candidate captions and scores them using several popular metrics, including BLEU, METEOR, ROUGE, and CIDEr. Instructions for using the evaluation server are provided.
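To make concrete what scoring a candidate caption against several human references involves, the following is a minimal sketch of the clipped n-gram precision that underlies BLEU. It is an illustration only, not the evaluation server's code: the function names `ngrams` and `modified_precision` are hypothetical, tokenization is assumed to be lowercasing plus whitespace splitting, and the brevity penalty and the averaging over n-gram orders that a full BLEU score requires are omitted.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n=1):
    """Clipped n-gram precision of one candidate against several references.

    Assumption: captions are tokenized by lowercasing and splitting on
    whitespace, which is simpler than what a real evaluation pipeline does.
    """
    cand_counts = Counter(ngrams(candidate.lower().split(), n))
    if not cand_counts:
        return 0.0

    # For each n-gram, record the largest count seen in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        ref_counts = Counter(ngrams(ref.lower().split(), n))
        for gram, count in ref_counts.items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)

    # Clip each candidate count so repeated words cannot inflate the score.
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


# Hypothetical captions, written here for illustration only.
references = [
    "a man rides a horse on a beach",
    "a person riding a horse",
]
candidate = "a man riding a horse"

unigram_p = modified_precision(candidate, references, n=1)
bigram_p = modified_precision(candidate, references, n=2)
print(unigram_p, bigram_p)
```

Having several references per image matters in this sketch: the bigram "riding a" is matched only by the second reference, so a single-reference evaluation would score the same candidate lower.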