Concepedia

Publication | Closed Access

Confounded Expectations: Informedia at TRECVID 2004.

Year: 2004

Citations: 46

References: 5

Abstract

For TRECVID 2004, CMU participated in the semantic feature extraction task and in the manual, interactive and automatic search tasks. For the semantic feature classifiers, we tried unimodal, multi-modal and multi-concept classifiers. In interactive search, we compared a visual-only system against a complete video retrieval system using both visual and text data, and also contrasted expert and novice users. The manual runs were similar to those of 2003, but they did not perform comparably to last year. Additionally, we shared our low-level features with the TRECVID community.

Overview

We first describe the low-level features, which formed the input for all our analysis and which were distributed to other participants. Then we sketch the experiments done for the semantic features, followed by the search task experiments (manual, automatic and interactive).

Low-level ‘raw’ features

Low-level features are extracted for each shot. “Low-level” means features that are extracted directly from the source videos. We use the term ‘low-level’ to distinguish them from the TRECVID high-level semantic feature extraction task. The low-level features are derived from several different sources: visual, audio, text and ‘semantic’ detectors, such as face detection and video OCR detection. In TRECVID 2004, we extract 16 low-level raw features for the whole data set. This data was provided to all participating groups to encourage other researchers to use a standardized feature set or to compare their approaches against it.

Image features

A shot is the basic unit in our system; therefore, we extract one key-frame from each shot as a representative image. Image features are then computed from that representative image. There are 3 different types of image features: color histograms, textures and edges. For all image features, we split the image into a 5 by 5 grid to capture some spatial locality of information.
The distributed data lists the features for each grid cell by rows, starting at the top.
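
The grid layout described above can be illustrated with a minimal sketch. It is not the Informedia implementation: the function names, the choice of RGB, the 8 bins per channel and the per-cell normalization are all assumptions made for illustration, and plain Python lists are used only to keep the example free of dependencies. It splits a key-frame into a 5 by 5 grid and emits one color histogram per cell, ordered by rows starting at the top.

```python
def cell_bounds(length, grid=5):
    """Integer boundaries splitting `length` pixels into `grid` near-equal spans."""
    return [(i * length) // grid for i in range(grid + 1)]


def grid_color_histograms(image, grid=5, bins=8):
    """Compute one normalized color histogram per grid cell.

    `image` is a list of rows, each row a list of (r, g, b) tuples with
    values in 0..255. The bin count and RGB color space are assumptions,
    not details taken from the source. Returns grid * grid feature vectors
    of length 3 * bins, ordered by rows starting at the top.
    """
    height, width = len(image), len(image[0])
    if height < grid or width < grid:
        raise ValueError("image is smaller than the grid")

    rows = cell_bounds(height, grid)
    cols = cell_bounds(width, grid)
    features = []
    for r in range(grid):          # top row of cells first
        for c in range(grid):      # left to right within a row
            pixels = [
                image[y][x]
                for y in range(rows[r], rows[r + 1])
                for x in range(cols[c], cols[c + 1])
            ]
            hist = [0.0] * (3 * bins)
            for pixel in pixels:
                for channel, value in enumerate(pixel):
                    # Map 0..255 onto bin indices 0..bins-1.
                    hist[channel * bins + value * bins // 256] += 1.0
            # Normalize so each channel's bins sum to 1 within the cell.
            features.append([count / len(pixels) for count in hist])
    return features
```

Under these assumptions, `grid_color_histograms(keyframe)[0]` describes the top-left cell and index 24 the bottom-right cell, matching the row-by-row ordering of the distributed data.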
