E-Thesis

Delving into human visual attention for saliency detection of real-world images / AVISHEK SIRIS

Swansea University Author: AVISHEK SIRIS

DOI (Published version): 10.23889/SUthesis.60538

Published: Swansea 2022
Institution: Swansea University
Degree level: Doctoral
Degree name: Ph.D
Supervisor: Tam, Gary K.L.
URI: https://cronfa.swan.ac.uk/Record/cronfa60538
Abstract: Saliency detection explores the problem of identifying regions or objects that stand out from their surroundings. It is one of the fundamental problems in computer vision and is widely applied in other graphics, vision and robotics tasks. Relative saliency ranking is a recently introduced problem in which ranks are determined by the differences in saliency agreement between multiple observers. This approach can lead to multiple objects being given the same saliency rank. However, psychology studies and behavioural observations show that humans shift their attention from one location to another when viewing an image, because the human visual system has limited capacity for processing multiple visual inputs simultaneously. We consider the sequential shifting of attention across objects as a form of saliency ranking, and thus propose a new problem of saliency ranking based on attention shift. Although methods have been proposed for predicting saliency ranks, they do not model this human attention shift well. They are primarily based on ranking saliency values from binary predictions, which does not properly facilitate saliency rank reasoning between multiple individual objects. In this thesis, we explore deep learning techniques for learning to rank salient objects by inferring human attention shift. We first construct a large-scale salient object ranking dataset, in which the saliency rank of objects is defined by the order in which an observer attends to them based on attention shift. We then propose a deep learning model built from bottom-up and top-down attention mechanisms for performing saliency ranking. Our model is evaluated with both quantitative and qualitative experiments, in which our proposed approach achieves state-of-the-art performance.
Regarding traditional salient object detection, we observe two main issues that cause recent techniques to fail in complex real-world image scenes. First, most existing datasets consist of images with simple foregrounds and backgrounds and a limited number of objects, which hardly represent real-life scenarios. Second, current methods only learn contextual features of salient objects from binary saliency labels. This is insufficient for a model to learn the high-level semantics needed for saliency reasoning in complex scenes. We begin to address these problems by constructing a new large-scale dataset of complex scenes rich in context. We then propose a context-aware saliency network that learns to explicitly exploit the semantic scene context of an image. We perform extensive experiments to demonstrate that our proposed network outperforms state-of-the-art methods. The evaluation also shows the effectiveness of leveraging high-level scene semantics for saliency detection in complex scenarios, and that the approach transfers well to other existing datasets.
Item Description: ORCiD identifier: https://orcid.org/0000-0002-3064-2202
Keywords: Attention, Attention Shift, Saliency, Saliency Ranking, Salient Object Detection
College: Faculty of Science and Engineering