Publication | Closed Access
Multimodal location estimation on Flickr videos
Citations: 25
References: 14
Year: 2011
Unknown Venue
Keywords: Engineering, Video Processing, Multimedia Analysis, Video Retrieval, Localization, MediaEval 2010, Text Mining, Image Analysis, Information Retrieval, Data Science, Pattern Recognition, Video Content Analysis, Content Analysis, Machine Vision, Multimodal Location Estimation, Flickr Videos, Video Understanding, Computer Vision, Eye Tracking, Motion Analysis, Multimedia Search, Textual Metadata
This article describes an approach for determining the geo-coordinates of the recording location of Flickr videos based on both textual metadata and visual cues. The system is tested on the MediaEval 2010 Placing Task evaluation data, which consists of 5091 unfiltered test videos. The system presented in this article is less complex, uses less training data, and is at the same time more accurate than the best system presented in the evaluation in August 2010. At its best, the system places 14% of the videos with an error of less than 10 m. The article describes the implementation of the system and analyzes the different uses of multimodal cues and gazetteer information.
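To make the accuracy figure concrete, the following is a minimal sketch of how a "fraction of videos placed within a given radius" score could be computed from predicted and true coordinates. It is not taken from the article: the function names `haversine_m` and `fraction_within`, the use of the haversine great-circle distance, and the mean Earth radius constant are all assumptions made for illustration, and the official Placing Task scoring may differ in detail.

```python
import math

# Assumed constant: mean Earth radius in meters (not specified in the article).
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def fraction_within(predicted, truth, radius_m):
    """Share of predictions whose error is at most radius_m meters.

    predicted and truth are equal-length sequences of (lat, lon) pairs.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not predicted:
        return 0.0
    hits = sum(
        1
        for (plat, plon), (tlat, tlon) in zip(predicted, truth)
        if haversine_m(plat, plon, tlat, tlon) <= radius_m
    )
    return hits / len(predicted)
```

Under these assumptions, calling `fraction_within(predicted, truth, 10.0)` on the test set would yield the kind of figure quoted above, and repeating the call with larger radii gives an accuracy-versus-distance curve.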