University of Surrey


A saliency based framework for multi-modal registration.

Brown, Mark R. (2016) A saliency based framework for multi-modal registration. Doctoral thesis, University of Surrey.

thesis.pdf - Version of Record
Available under License Creative Commons Attribution Non-commercial Share Alike.


In recent years the Digital Film Production process has seen a huge increase in the amount of data captured, resulting in the need for automated tools within the pipeline. In particular, it typically involves the capture of multi-modal data such as 3D Light Detection And Ranging (LiDAR) scans, 2D images and videos, whose alignment and registration provide valuable information within the production process. There are significant challenges posed in this particular multi-modal registration problem that are not faced in the majority of feature-based registration pipelines. In particular, many existing feature detectors make modality-specific assumptions about the attributes a good, repeatable feature should possess, and as a result cannot be applied in a general, multi-modal manner. To combat this we take a saliency-based approach to feature detection that may be more meaningfully applied across modalities than other feature detectors. Furthermore, by extracting only the most salient features of a scene, significantly fewer features are obtained, resulting in a lower computational cost for the registration process.

The first contribution of this thesis is a generalisation of the Kadir-Brady salient point detector. The generalisation allows for both a more robust alternative for 2D images, and a 3D extension, where in particular it may operate on both the geometry and texture of the scene. As a result, it allows for more meaningful multi-modal feature detection, and higher repeatability results are observed when compared to existing 2D-3D point feature detectors.

The second contribution is the proposal of a novel salient line segment detector. By explicitly accounting for the surroundings of a line, the approach naturally avoids repetitive parts of a scene while detecting the strong, discriminative lines present. Its general, histogram-based framework allows for a natural extension to depth imagery and 3D, where lines are detected based jointly on both texture and geometry.

The final contribution is centred around the registration phase, where a globally optimal solution to 2D-3D registration from points or lines based on a Branch-and-Bound (BnB) approach is proposed. Novel search procedures are proposed to speed up the algorithm, taking advantage of the special nested BnB structure used. The optimality properties of the proposed approach allow 2D-3D registration to be achieved for significantly higher rates of outliers compared to existing approaches.

Item Type: Thesis (Doctoral)
Subjects : Computer Vision
Divisions : Theses
Authors : Brown, Mark
Date : 30 November 2016
Funders : Engineering and Physical Sciences Research Council
Grant Title : Engineering and Physical Sciences Research Council
Projects : EU ICT FP7 project IMPART
Contributors :
Depositing User : Mark Brown
Date Deposited : 15 Dec 2016 11:28
Last Modified : 11 Dec 2018 11:22
