The MediaMill TRECVID 2009 Semantic Video Search Engine
Snoek, C, Sande, K, Rooij, O, Huurnink, B, Uijlings, J, Liempt, M, Bugalho, M, Trancoso, I, Yan, F, Tahir, M, Mikolajczyk, K, Kittler, J, Rijke, M, Geusebroek, J, Gevers, T, Worring, M, Koelma, D and Smeulders, A (2009) The MediaMill TRECVID 2009 Semantic Video Search Engine
Available under License : See the attached licence file.
In this paper we describe our TRECVID 2009 video retrieval experiments. The MediaMill team participated in three tasks: concept detection, automatic search, and interactive search. The starting point for the MediaMill concept detection approach is our top-performing bag-of-words system of last year, which uses multiple color descriptors, codebooks with soft-assignment, and kernel-based supervised learning. We improve upon this baseline system by exploring two novel research directions. Firstly, we study a multi-modal extension by including 20 audio concepts and fusion using two novel multi-kernel supervised learning methods. Secondly, with the help of recently proposed algorithmic refinements of bag-of-word representations, a GPU implementation, and compute clusters, we scale-up the amount of visual information analyzed by an order of magnitude, to a total of 1,000,000 i-frames. Our experiments evaluate the merit of these new components, ultimately leading to 64 robust concept detectors for video retrieval. For retrieval, a robust but limited set of concept detectors justifies the need to rely on as many auxiliary information channels as possible. For automatic search we therefore explore how we can learn to rank various information channels simultaneously to maximize video search results for a given topic. To further improve the video retrieval results, our interactive search experiments investigate the roles of visualizing preview results for a certain browse-dimension and relevance feedback mechanisms that learn to solve complex search topics by analysis from user browsing behavior. The 2009 edition of the TRECVID benchmark has again been a fruitful participation for the MediaMill team, resulting in the top ranking for both concept detection and interactive search. Again a lot has been learned during this year's TRECVID campaign; we highlight the most important lessons at the end of this paper.
Item Type: Conference or Workshop Item (Conference Paper)
Divisions: Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Vision Speech and Signal Processing
Depositing User: Symplectic Elements
Date Deposited: 14 Dec 2012 10:23
Last Modified: 09 Jun 2014 13:14