Journal article
From attributes to natural language: A survey and foresight on text-based person re-identification
Information Fusion, Volume: 118, Start page: 102879
Swansea University Authors:
FANZHI JIANG, Scott Yang, Mark Jones
PDF | Accepted Manuscript
Author accepted manuscript document released under the terms of a Creative Commons CC-BY licence using the Swansea University Research Publications Policy (rights retention).
DOI (Published version): 10.1016/j.inffus.2024.102879
Abstract
Text-based person re-identification (Re-ID) is a challenging topic in the field of complex multimodal analysis; its ultimate aim is to recognize specific pedestrians by scrutinizing attributes/natural language descriptions. Despite the wide range of applicable areas such as security surveillance, vid...
Published in: | Information Fusion |
---|---|
ISSN: | 1566-2535 1872-6305 |
Published: | Elsevier BV 2025 |
Online Access: | Check full text |
URI: | https://cronfa.swan.ac.uk/Record/cronfa68609 |
first_indexed | 2025-01-09T20:33:57Z |
---|---|
last_indexed | 2025-02-19T07:28:21Z |
id | cronfa68609 |
recordtype | SURis |
fullrecord |
<?xml version="1.0"?><rfc1807><datestamp>2025-02-18T10:38:29.9681605</datestamp><bib-version>v2</bib-version><id>68609</id><entry>2024-12-20</entry><title>From attributes to natural language: A survey and foresight on text-based person re-identification</title><swanseaauthors><author><sid>d3dcbe2b549acd06da61c3f2d52847d7</sid><firstname>FANZHI</firstname><surname>JIANG</surname><name>FANZHI JIANG</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>81dc663ca0e68c60908d35b1d2ec3a9b</sid><ORCID>0000-0002-6618-7483</ORCID><firstname>Scott</firstname><surname>Yang</surname><name>Scott Yang</name><active>true</active><ethesisStudent>false</ethesisStudent></author><author><sid>2e1030b6e14fc9debd5d5ae7cc335562</sid><ORCID>0000-0001-8991-1190</ORCID><firstname>Mark</firstname><surname>Jones</surname><name>Mark Jones</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2024-12-20</date><abstract>Text-based person re-identification (Re-ID) is a challenging topic in the field of complex multimodal analysis; its ultimate aim is to recognize specific pedestrians by scrutinizing attributes/natural language descriptions. Despite the wide range of applicable areas such as security surveillance, video retrieval, person tracking, and social media analytics, there is a notable absence of comprehensive reviews dedicated to summarizing the text-based person Re-ID from a technical perspective. To address this gap, we propose to introduce a taxonomy spanning Evaluation, Strategy, Architecture, and Optimization dimensions, providing a comprehensive survey of the text-based person Re-ID task. We start by laying the groundwork for text-based person Re-ID, elucidating fundamental concepts related to attribute/natural language-based identification. Then a thorough examination of existing benchmark datasets and metrics is presented. 
Subsequently, we further delve into prevalent feature extraction strategies employed in text-based person Re-ID research, followed by a concise summary of common network architectures within the domain. Prevalent loss functions utilized for model optimization and modality alignment in text-based person Re-ID are also scrutinized. To conclude, we offer a concise summary of our findings, pinpointing challenges in text-based person Re-ID. In response to these challenges, we outline potential avenues for future open-set text-based person Re-ID and present a baseline architecture for text-based pedestrian image generation guided re-identification (TBPGR).</abstract><type>Journal Article</type><journal>Information Fusion</journal><volume>118</volume><journalNumber/><paginationStart>102879</paginationStart><paginationEnd/><publisher>Elsevier BV</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint>1566-2535</issnPrint><issnElectronic>1872-6305</issnElectronic><keywords>Person re-identification; Text; Natural language; Attributes; Diffusion model</keywords><publishedDay>1</publishedDay><publishedMonth>6</publishedMonth><publishedYear>2025</publishedYear><publishedDate>2025-06-01</publishedDate><doi>10.1016/j.inffus.2024.102879</doi><url/><notes/><college>COLLEGE NANME</college><CollegeCode>COLLEGE CODE</CollegeCode><institution>Swansea University</institution><apcterm>Not Required</apcterm><funders>This document is the results of the research project funded by The Engineering and Physical Sciences Research Council of UK Research and Innovation (UKRI)</funders><projectreference/><lastEdited>2025-02-18T10:38:29.9681605</lastEdited><Created>2024-12-20T09:47:59.1951735</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer 
Science</level></path><authors><author><firstname>FANZHI</firstname><surname>JIANG</surname><order>1</order></author><author><firstname>Scott</firstname><surname>Yang</surname><orcid>0000-0002-6618-7483</orcid><order>2</order></author><author><firstname>Mark</firstname><surname>Jones</surname><orcid>0000-0001-8991-1190</orcid><order>3</order></author><author><firstname>Liumei</firstname><surname>Zhang</surname><orcid>0000-0002-1834-5424</orcid><order>4</order></author></authors><documents><document><filename>68609__33429__d006d633bb4e4144952e2fbcbd2302d4.pdf</filename><originalFilename>Text-based person re-identification_PrePrint Accepted Version.pdf</originalFilename><uploaded>2025-01-28T16:44:23.7028240</uploaded><type>Output</type><contentLength>2059077</contentLength><contentType>application/pdf</contentType><version>Accepted Manuscript</version><cronfaStatus>true</cronfaStatus><documentNotes>Author accepted manuscript document released under the terms of a Creative Commons CC-BY licence using the Swansea University Research Publications Policy (rights retention).</documentNotes><copyrightCorrect>true</copyrightCorrect><language>eng</language><licence>https://creativecommons.org/licenses/by/4.0/</licence></document></documents><OutputDurs/></rfc1807> |
title | From attributes to natural language: A survey and foresight on text-based person re-identification |
author_id_str_mv | d3dcbe2b549acd06da61c3f2d52847d7 81dc663ca0e68c60908d35b1d2ec3a9b 2e1030b6e14fc9debd5d5ae7cc335562 |
author_id_fullname_str_mv | d3dcbe2b549acd06da61c3f2d52847d7_***_FANZHI JIANG 81dc663ca0e68c60908d35b1d2ec3a9b_***_Scott Yang 2e1030b6e14fc9debd5d5ae7cc335562_***_Mark Jones |
author | FANZHI JIANG Scott Yang Mark Jones |
author2 | FANZHI JIANG Scott Yang Mark Jones Liumei Zhang |
format | Journal article |
container_title | Information Fusion |
container_volume | 118 |
container_start_page | 102879 |
publishDate | 2025 |
institution | Swansea University |
issn | 1566-2535 1872-6305 |
doi_str_mv | 10.1016/j.inffus.2024.102879 |
publisher | Elsevier BV |
college_str | Faculty of Science and Engineering |
hierarchytype | |
hierarchy_top_id | facultyofscienceandengineering |
hierarchy_top_title | Faculty of Science and Engineering |
hierarchy_parent_id | facultyofscienceandengineering |
hierarchy_parent_title | Faculty of Science and Engineering |
department_str | School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science |
document_store_str | 1 |
active_str | 0 |
description | Text-based person re-identification (Re-ID) is a challenging topic in the field of complex multimodal analysis; its ultimate aim is to recognize specific pedestrians by scrutinizing attributes/natural language descriptions. Despite the wide range of applicable areas such as security surveillance, video retrieval, person tracking, and social media analytics, there is a notable absence of comprehensive reviews dedicated to summarizing the text-based person Re-ID from a technical perspective. To address this gap, we propose to introduce a taxonomy spanning Evaluation, Strategy, Architecture, and Optimization dimensions, providing a comprehensive survey of the text-based person Re-ID task. We start by laying the groundwork for text-based person Re-ID, elucidating fundamental concepts related to attribute/natural language-based identification. Then a thorough examination of existing benchmark datasets and metrics is presented. Subsequently, we further delve into prevalent feature extraction strategies employed in text-based person Re-ID research, followed by a concise summary of common network architectures within the domain. Prevalent loss functions utilized for model optimization and modality alignment in text-based person Re-ID are also scrutinized. To conclude, we offer a concise summary of our findings, pinpointing challenges in text-based person Re-ID. In response to these challenges, we outline potential avenues for future open-set text-based person Re-ID and present a baseline architecture for text-based pedestrian image generation guided re-identification (TBPGR). |
published_date | 2025-06-01T08:16:28Z |