The MediaMill TRECVID 2009 Semantic Video Search Engine
Snoek, C, Sande, K, Rooij, O, Huurnink, B, Uijlings, J, Liempt, M, Bugalho, M, Trancoso, I, Yan, F, Tahir, M, Mikolajczyk, K, Kittler, J, Rijke, M, Geusebroek, J, Gevers, T, Worring, M, Koelma, D and Smeulders, A (2009) The MediaMill TRECVID 2009 Semantic Video Search Engine
Text: mediamill-TRECVID2009-final.pdf, available under licence (see the attached licence file). Download (1MB)
Abstract
In this paper we describe our TRECVID 2009 video retrieval experiments. The MediaMill team participated in three tasks: concept detection, automatic search, and interactive search. The starting point for the MediaMill concept detection approach is our top-performing bag-of-words system of last year, which uses multiple color descriptors, codebooks with soft-assignment, and kernel-based supervised learning. We improve upon this baseline system by exploring two novel research directions. Firstly, we study a multimodal extension by including 20 audio concepts and fusion using two novel multi-kernel supervised learning methods. Secondly, with the help of recently proposed algorithmic refinements of bag-of-word representations, a GPU implementation, and compute clusters, we scale up the amount of visual information analyzed by an order of magnitude, to a total of 1,000,000 i-frames. Our experiments evaluate the merit of these new components, ultimately leading to 64 robust concept detectors for video retrieval. For retrieval, a robust but limited set of concept detectors justifies the need to rely on as many auxiliary information channels as possible. For automatic search we therefore explore how we can learn to rank various information channels simultaneously to maximize video search results for a given topic. To further improve the video retrieval results, our interactive search experiments investigate the roles of visualizing preview results for a certain browse-dimension and relevance feedback mechanisms that learn to solve complex search topics by analysis of user browsing behavior. The 2009 edition of the TRECVID benchmark has again been a fruitful participation for the MediaMill team, resulting in the top ranking for both concept detection and interactive search. Again a lot has been learned during this year's TRECVID campaign; we highlight the most important lessons at the end of this paper.
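The concept-detection pipeline the abstract summarises (soft-assignment codebooks over local colour descriptors, followed by kernel-based supervised learning) can be illustrated with a minimal sketch. The code below is an illustrative toy under stated assumptions, not the MediaMill implementation: random vectors stand in for colour descriptors, and the codebook size, the soft-assignment bandwidth `sigma`, and the chi-square kernel are assumed choices for the example only.

```python
# Minimal sketch (assumptions, not the authors' system): soft-assignment
# bag-of-words encoding of local descriptors, then a kernel SVM concept detector.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.metrics.pairwise import chi2_kernel

def soft_assign_bow(descriptors, codebook, sigma=0.2):
    """Encode local descriptors as a soft-assignment histogram.

    Each descriptor votes for every codeword with a Gaussian weight on its
    squared distance, instead of only for its nearest codeword (hard assignment).
    """
    # squared distances: (n_descriptors, n_codewords)
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)      # normalise votes per descriptor
    hist = w.sum(axis=0)
    return hist / hist.sum()               # L1-normalised histogram

# Toy usage: random data stands in for pooled colour descriptors.
rng = np.random.default_rng(0)
train_desc = rng.random((2000, 64))
codebook = KMeans(n_clusters=32, n_init=5, random_state=0).fit(train_desc).cluster_centers_

# One histogram per (toy) keyframe, with binary concept labels.
X = np.stack([soft_assign_bow(rng.random((100, 64)), codebook) for _ in range(40)])
y = rng.integers(0, 2, size=40)

# Chi-square kernel SVM, a common choice for histogram features.
K_train = chi2_kernel(X, X, gamma=0.5)
clf = SVC(kernel="precomputed").fit(K_train, y)
scores = clf.decision_function(chi2_kernel(X, X, gamma=0.5))  # concept scores per keyframe
```

In this sketch the detector's decision values can be used to rank keyframes per concept; in a real system the same encoding would be computed per i-frame and per descriptor channel before fusion.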
| Item Type: | Conference or Workshop Item (Conference Paper) |
| --- | --- |
| Divisions: | Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing |
| Authors: | Snoek, C; Sande, K; Rooij, O; Huurnink, B; Uijlings, J; Liempt, M; Bugalho, M; Trancoso, I; Yan, F; Tahir, M; Mikolajczyk, K; Kittler, J; Rijke, M; Geusebroek, J; Gevers, T; Worring, M; Koelma, D; Smeulders, A |
| Date: | 2009 |
| Depositing User: | Symplectic Elements |
| Date Deposited: | 14 Dec 2012 10:23 |
| Last Modified: | 31 Oct 2017 14:49 |
| URI: | http://epubs.surrey.ac.uk/id/eprint/733282 |