Concepedia

Publication | Closed Access

Segmentation, categorization, and identification of commercial clips from TV streams using multimodal analysis

83

Citations

30

References

2006

Year

Abstract

TV advertising is ubiquitous, perseverant, and economically vital. Millions of people's living and working habits are affected by TV commercials. In this paper, we present a multimodal ("visual + audio + text") commercial video digest scheme to segment individual commercials and carry out semantic content analysis within a detected commercial segment from TV streams.Two challenging issues are addressed. Firstly, we propose a multimodal approach to robustly detect the boundaries of individual commercials. Secondly, we attempt to classify a commercial with respect to advertised products/services. For the first, the boundary detection of individual commercials is reduced to the problem of binary classification of shot boundaries via the mid-level features derived from two concepts: Image Frames Marked with Product Information (FMPI) and Audio Scene Change Indicator (ASCI). Moreover, the accurate individual boundary enables us to perform commercial identification by clip matching via a spatial-temporal signature. For the second, commercial classification is formulated as the task of text categorization by expanding sparse texts from ASR/OCR with external knowledge. Our boundary detection has achieved a good result of F1 = 93.7% on the dataset comprising 499 individual commercials from TRECVID'05 video corpus. Commercial classification has obtained a promising accuracy of 80.9% on 141 distinct ones. Based on these achievements, various applications such as an intelligent digital TV set-top box can be accomplished to enhance the TV viewer's capabilities in monitoring and managing commercials from TV streams.

References

YearCitations

Page 1