E-Thesis
Delving into human visual attention for saliency detection of real-world images / AVISHEK SIRIS
Swansea University Author: AVISHEK SIRIS
DOI (Published version): 10.23889/SUthesis.60538
Abstract
Saliency detection explores the problem of identifying regions or objects that stand out from their surroundings. It is one of the fundamental problems in computer vision, with applications widely used in other graphics, vision and robotics tasks. Relative saliency ranking is a new problem that ha...
Published: | Swansea, 2022 |
---|---|
Institution: | Swansea University |
Degree level: | Doctoral |
Degree name: | Ph.D |
Supervisor: | Tam, Gary K.L. |
URI: | https://cronfa.swan.ac.uk/Record/cronfa60538 |
first_indexed |
2022-07-19T16:16:07Z |
---|---|
last_indexed |
2023-01-13T19:20:44Z |
id |
cronfa60538 |
recordtype |
RisThesis |
fullrecord |
<?xml version="1.0" encoding="utf-8"?><rfc1807 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><bib-version>v2</bib-version><id>60538</id><entry>2022-07-19</entry><title>Delving into human visual attention for saliency detection of real-world images</title><swanseaauthors><author><sid>a7e009a3eb6ac7b910d8789bc283b60e</sid><firstname>AVISHEK</firstname><surname>SIRIS</surname><name>AVISHEK SIRIS</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2022-07-19</date><abstract>Saliency detection explores the problem of identifying regions or objects that stand out from their surroundings. It is one of the fundamental problems in computer vision, with applications widely used in other graphics, vision and robotics tasks. Relative saliency ranking is a new problem that has been introduced with the idea of determining ranking based on the differences in the saliency agreement between multiple observers. This approach can lead to multiple objects being given the same saliency rank. However, psychology studies and behavioural observations show that humans shift their attention from one location to another when viewing an image. This is because the human visual system has limited capacity for simultaneously processing multiple visual inputs. We consider the sequential shifting of attention across objects as a form of saliency ranking, and we therefore propose a new problem of saliency ranking based on attention shift. Although methods have been proposed for predicting saliency ranks, they are not able to model this human attention shift well. They are primarily based on ranking saliency values from binary prediction, which does not properly facilitate saliency rank reasoning between multiple individual objects. In this thesis, we aim to explore deep learning techniques for learning to rank salient objects by inferring human attention shift. 
We first construct a large-scale salient object ranking dataset. We define the saliency rank of objects by the order in which an observer attends to these objects based on attention shift. We then propose a deep learning model that is built from bottom-up and top-down attention mechanisms for performing saliency ranking. Our model is evaluated with both quantitative and qualitative experiments, in which our proposed approach achieves state-of-the-art performance. Regarding traditional salient object detection, we observe two main issues that lead to recent techniques failing in real-world complex image scenes. First, most existing datasets consist of images with simple foregrounds and backgrounds and a limited number of objects, which hardly represent real-life scenarios. Second, current methods only learn contextual features of salient objects with binary saliency labels. This is not sufficient for a model to learn high-level semantics for saliency reasoning in complex scenes. We begin to address these problems by constructing a new large-scale dataset with complex scenes rich in context. We then propose a context-aware saliency network that learns to explicitly exploit the semantic scene contexts of an image. We perform extensive experiments to demonstrate that our proposed network outperforms state-of-the-art methods. 
The evaluation also shows the effectiveness of leveraging high-level scene semantics for saliency detection in complex scenarios, while also transferring well to other existing datasets.</abstract><type>E-Thesis</type><journal/><volume/><journalNumber/><paginationStart/><paginationEnd/><publisher/><placeOfPublication>Swansea</placeOfPublication><isbnPrint/><isbnElectronic/><issnPrint/><issnElectronic/><keywords>Attention, Attention Shift, Saliency, Saliency Ranking, Salient Object Detection</keywords><publishedDay>15</publishedDay><publishedMonth>7</publishedMonth><publishedYear>2022</publishedYear><publishedDate>2022-07-15</publishedDate><doi>10.23889/SUthesis.60538</doi><url/><notes>ORCiD identifier: https://orcid.org/0000-0002-3064-2202</notes><college>COLLEGE NANME</college><CollegeCode>COLLEGE CODE</CollegeCode><institution>Swansea University</institution><supervisor>Tam, Gary K.L.</supervisor><degreelevel>Doctoral</degreelevel><degreename>Ph.D</degreename><degreesponsorsfunders>Swansea Science DTC Postgraduate Research Scholarship</degreesponsorsfunders><apcterm/><funders/><projectreference/><lastEdited>2024-04-20T16:46:24.5416346</lastEdited><Created>2022-07-19T17:12:00.9698595</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer Science</level></path><authors><author><firstname>AVISHEK</firstname><surname>SIRIS</surname><order>1</order></author></authors><documents><document><filename>60538__24651__c12d67d6383a4d228ff8273b5f625a07.pdf</filename><originalFilename>Siris_Avishek_PhD_Thesis_Final_Embargoed_Redacted_Signature.pdf</originalFilename><uploaded>2022-07-19T17:28:55.6297149</uploaded><type>Output</type><contentLength>52406977</contentLength><contentType>application/pdf</contentType><version>E-Thesis – open access</version><cronfaStatus>true</cronfaStatus><embargoDate>2023-07-19T00:00:00.0000000</embargoDate><documentNotes>Copyright: The author, Avishek Siris, 2022.</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents><OutputDurs/></rfc1807> |
title |
Delving into human visual attention for saliency detection of real-world images |
author_id_str_mv |
a7e009a3eb6ac7b910d8789bc283b60e |
author_id_fullname_str_mv |
a7e009a3eb6ac7b910d8789bc283b60e_***_AVISHEK SIRIS |
author |
AVISHEK SIRIS |
author2 |
AVISHEK SIRIS |
format |
E-Thesis |
publishDate |
2022 |
institution |
Swansea University |
doi_str_mv |
10.23889/SUthesis.60538 |
college_str |
Faculty of Science and Engineering |
hierarchytype |
|
hierarchy_top_id |
facultyofscienceandengineering |
hierarchy_top_title |
Faculty of Science and Engineering |
hierarchy_parent_id |
facultyofscienceandengineering |
hierarchy_parent_title |
Faculty of Science and Engineering |
department_str |
School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science |
document_store_str |
1 |
active_str |
0 |
published_date |
2022-07-15T16:46:20Z |