Journal article

From attributes to natural language: A survey and foresight on text-based person re-identification

Fanzhi Jiang, Scott Yang, Mark Jones, Liumei Zhang

Information Fusion, Start page: 102879

Swansea University Authors: Scott Yang, Mark Jones

Full text not available from this repository: check for access using links below.

Abstract

Text-based person re-identification (Re-ID) is a challenging topic in the field of complex multimodal analysis; its ultimate aim is to recognize specific pedestrians by scrutinizing attributes/natural language descriptions. Despite the wide range of applicable areas such as security surveillance, vid...

Published in: Information Fusion
ISSN: 1566-2535
Published: Elsevier BV 2025
Online Access: Check full text

URI: https://cronfa.swan.ac.uk/Record/cronfa68609
first_indexed 2025-01-09T20:33:57Z
last_indexed 2025-01-10T02:33:12Z
id cronfa68609
recordtype SURis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2025-01-09T21:19:51.4895306</datestamp><bib-version>v2</bib-version><id>68609</id><entry>2024-12-20</entry><title>From attributes to natural language: A survey and foresight on text-based person re-identification</title><swanseaauthors><author><sid>81dc663ca0e68c60908d35b1d2ec3a9b</sid><ORCID>0000-0002-6618-7483</ORCID><firstname>Scott</firstname><surname>Yang</surname><name>Scott Yang</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>2e1030b6e14fc9debd5d5ae7cc335562</sid><ORCID>0000-0001-8991-1190</ORCID><firstname>Mark</firstname><surname>Jones</surname><name>Mark Jones</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2024-12-20</date><deptcode>MACS</deptcode><abstract>Text-based person re-identification (Re-ID) is a challenging topic in the field of complex multimodal analysis; its ultimate aim is to recognize specific pedestrians by scrutinizing attributes/natural language descriptions. Despite the wide range of applicable areas such as security surveillance, video retrieval, person tracking, and social media analytics, there is a notable absence of comprehensive reviews dedicated to summarizing text-based person Re-ID from a technical perspective. To address this gap, we introduce a taxonomy spanning Evaluation, Strategy, Architecture, and Optimization dimensions, providing a comprehensive survey of the text-based person Re-ID task. We start by laying the groundwork for text-based person Re-ID, elucidating fundamental concepts related to attribute/natural language-based identification. Then a thorough examination of existing benchmark datasets and metrics is presented. Subsequently, we delve into prevalent feature extraction strategies employed in text-based person Re-ID research, followed by a concise summary of common network architectures within the domain. Prevalent loss functions utilized for model optimization and modality alignment in text-based person Re-ID are also scrutinized. To conclude, we offer a concise summary of our findings, pinpointing challenges in text-based person Re-ID. In response to these challenges, we outline potential avenues for future open-set text-based person Re-ID and present a baseline architecture for text-based pedestrian image generation guided re-identification (TBPGR).</abstract><type>Journal Article</type><journal>Information Fusion</journal><volume/><journalNumber/><paginationStart>102879</paginationStart><paginationEnd/><publisher>Elsevier BV</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint>1566-2535</issnPrint><issnElectronic/><keywords/><publishedDay>1</publishedDay><publishedMonth>1</publishedMonth><publishedYear>2025</publishedYear><publishedDate>2025-01-01</publishedDate><doi>10.1016/j.inffus.2024.102879</doi><url>https://doi.org/10.1016/j.inffus.2024.102879</url><notes/><college>COLLEGE NAME</college><department>Mathematics and Computer Science School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MACS</DepartmentCode><institution>Swansea University</institution><apcterm/><funders>This document is the result of the research project funded by The Engineering and Physical Sciences Research Council of UK Research and Innovation (UKRI)</funders><projectreference/><lastEdited>2025-01-09T21:19:51.4895306</lastEdited><Created>2024-12-20T09:47:59.1951735</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer Science</level></path><authors><author><firstname>Fanzhi</firstname><surname>Jiang</surname><orcid>0000-0001-7229-9732</orcid><order>1</order></author><author><firstname>Scott</firstname><surname>Yang</surname><orcid>0000-0002-6618-7483</orcid><order>2</order></author><author><firstname>Mark</firstname><surname>Jones</surname><orcid>0000-0001-8991-1190</orcid><order>3</order></author><author><firstname>Liumei</firstname><surname>Zhang</surname><order>4</order></author></authors><documents/><OutputDurs/></rfc1807>
title From attributes to natural language: A survey and foresight on text-based person re-identification
author_id_str_mv 81dc663ca0e68c60908d35b1d2ec3a9b
2e1030b6e14fc9debd5d5ae7cc335562
author_id_fullname_str_mv 81dc663ca0e68c60908d35b1d2ec3a9b_***_Scott Yang
2e1030b6e14fc9debd5d5ae7cc335562_***_Mark Jones
author Scott Yang
Mark Jones
author2 Fanzhi Jiang
Scott Yang
Mark Jones
Liumei Zhang
format Journal article
container_title Information Fusion
container_start_page 102879
publishDate 2025
institution Swansea University
issn 1566-2535
doi_str_mv 10.1016/j.inffus.2024.102879
publisher Elsevier BV
college_str Faculty of Science and Engineering
hierarchytype
hierarchy_top_id facultyofscienceandengineering
hierarchy_top_title Faculty of Science and Engineering
hierarchy_parent_id facultyofscienceandengineering
hierarchy_parent_title Faculty of Science and Engineering
department_str School of Mathematics and Computer Science - Computer Science, Faculty of Science and Engineering
url https://doi.org/10.1016/j.inffus.2024.102879
document_store_str 0
active_str 0
description Text-based person re-identification (Re-ID) is a challenging topic in the field of complex multimodal analysis; its ultimate aim is to recognize specific pedestrians by scrutinizing attributes/natural language descriptions. Despite the wide range of applicable areas such as security surveillance, video retrieval, person tracking, and social media analytics, there is a notable absence of comprehensive reviews dedicated to summarizing text-based person Re-ID from a technical perspective. To address this gap, we introduce a taxonomy spanning Evaluation, Strategy, Architecture, and Optimization dimensions, providing a comprehensive survey of the text-based person Re-ID task. We start by laying the groundwork for text-based person Re-ID, elucidating fundamental concepts related to attribute/natural language-based identification. Then a thorough examination of existing benchmark datasets and metrics is presented. Subsequently, we delve into prevalent feature extraction strategies employed in text-based person Re-ID research, followed by a concise summary of common network architectures within the domain. Prevalent loss functions utilized for model optimization and modality alignment in text-based person Re-ID are also scrutinized. To conclude, we offer a concise summary of our findings, pinpointing challenges in text-based person Re-ID. In response to these challenges, we outline potential avenues for future open-set text-based person Re-ID and present a baseline architecture for text-based pedestrian image generation guided re-identification (TBPGR).
published_date 2025-01-01T14:46:34Z