University of Surrey


A picture is worth a thousand tags: Automatic web based image tag expansion

Gilbert, Andrew and Bowden, Richard (2013) A picture is worth a thousand tags: Automatic web based image tag expansion. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7725 L (PART 2). pp. 447-460.

Available under License : See the attached licence file.



We present an approach to automatically expand the annotation of images using the internet as an additional information source. The novelty of the work lies in expanding image tags by automatically introducing new, previously unseen, complex linguistic labels, which are collected without supervision from associated webpages. Taking a small subset of an image's existing tags, a web-based search retrieves additional textual information. A textual bag-of-words model and a visual bag-of-words model are then combined and symbolised for data mining. Association rule mining is used to identify rules that relate words to visual content, and unseen images that fit these rules are re-tagged. This approach allows a large number of additional annotations to be added to unseen images: on average 12.8 new tags per image, with an 87.2% true positive rate. Results are shown on two datasets, including a new 2800-image annotation dataset of landmarks. On this dataset, pictures of buildings are tagged with the architect, the year of construction and even events that have taken place there, which widens the impact of tag annotations and their use in retrieval. The dataset is made available along with its tags and the 1970 webpages and additional images that form the information corpus. In addition, results on MIRFlickr25000, a common state-of-the-art dataset, are presented to compare the learning framework against previous work. © 2013 Springer-Verlag.
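The rule-mining and re-tagging step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the symbolisation scheme, the mining algorithm or any thresholds, so the function names (`mine_rules`, `retag`), the symbol names (`v1`, `v2`, ...), the limit of two visual symbols per rule and the support and confidence thresholds are all assumptions made for illustration. Each training image is treated as a transaction holding a set of visual-word symbols and a set of text words, and a rule "visual symbols imply tag" is kept when it is both frequent and reliable.

```python
from collections import Counter
from itertools import combinations


def mine_rules(transactions, min_support=0.4, min_confidence=0.8):
    """Mine rules of the form {visual symbols} -> tag.

    transactions: list of (visual_symbols, tags) pairs, both sets of str.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(transactions)
    antecedent_counts = Counter()
    joint_counts = Counter()
    for visual, tags in transactions:
        # Candidate antecedents: every 1- and 2-element subset of the
        # image's visual symbols (a deliberately small search space).
        antecedents = [frozenset(c) for k in (1, 2)
                       for c in combinations(sorted(visual), k)]
        for a in antecedents:
            antecedent_counts[a] += 1
            for t in tags:
                joint_counts[(a, t)] += 1

    rules = []
    for (a, t), joint in joint_counts.items():
        support = joint / n                      # P(antecedent and tag)
        confidence = joint / antecedent_counts[a]  # P(tag | antecedent)
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, t, support, confidence))
    return rules


def retag(visual, rules):
    """Return every tag whose rule antecedent is contained in `visual`."""
    return {t for a, t, _, _ in rules if a <= visual}


# Hypothetical toy corpus: visual symbols paired with words from webpages.
training = [
    ({"v1", "v2"}, {"cathedral", "wren"}),
    ({"v1", "v2"}, {"cathedral", "1710"}),
    ({"v1", "v3"}, {"cathedral"}),
    ({"v4"}, {"bridge"}),
    ({"v4", "v3"}, {"bridge"}),
]
rules = mine_rules(training)
new_tags = retag({"v1", "v5"}, rules)  # tags for an unseen image
```

In this toy corpus, rare words such as "wren" fall below the support threshold and are not propagated, while symbols that consistently co-occur with a word yield a rule. A full system would need a scalable miner (for example Apriori or FP-growth) rather than this exhaustive enumeration.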

Item Type: Article
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering > Centre for Communication Systems Research
Authors : Gilbert, Andrew and Bowden, Richard
Date : 2013
DOI : 10.1007/978-3-642-37444-9_35
Copyright Disclaimer : The original publication is available at
Depositing User : Symplectic Elements
Date Deposited : 14 Oct 2013 15:35
Last Modified : 16 Jan 2019 16:47

