
Conference Paper/Proceeding/Abstract

One-Shot Decoupled Face Reenactment with Vision Transformer

Chen Hu, Xianghua Xie (ORCID: 0000-0002-2701-8660)

Pattern Recognition and Artificial Intelligence (ICPRAI 2022), Lecture Notes in Computer Science, vol. 13364, pp. 246-257

Swansea University Authors: Chen Hu, Xianghua Xie

Full text not available from this repository. The accepted manuscript (PDF) is under embargo until 29 May 2023; check for access using the links below.

Abstract

Recent face reenactment paradigms involve estimating an optical flow to warp the source image or its feature maps so that pixel values can be sampled to generate the reenacted image. We propose a one-shot framework in which the reenactment of the overall face and of individual landmarks is decoupled. We show that a shallow Vision Transformer can effectively estimate optical flow with few parameters and little training data. When reenacting different identities, our method remedies the inability of previous conditional-generator-based methods to preserve identity in reenacted images. To further address the identity-preservation problem in face reenactment, we model landmark coordinate transformation as a style transfer problem, yielding additional improvement in preserving the source image's identity in the reenacted image. Our method achieves lower head pose error on the CelebV dataset while obtaining competitive results in identity preservation and expression accuracy.
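
The warping step the abstract describes is the standard flow-based resampling used in this line of work: an estimated dense optical flow displaces a sampling grid, and pixel values are gathered from the source image at the displaced locations. Below is a minimal PyTorch sketch of that step, assuming a per-pixel offset flow field in pixel units; the function name, tensor shapes, and flow convention are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of flow-based warping: resample source pixels at locations
# displaced by an estimated optical flow. Shapes and conventions are
# assumptions for illustration, not the paper's exact implementation.
import torch
import torch.nn.functional as F

def warp_with_flow(source: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp a source image (or feature map) with a dense optical flow.

    source: (N, C, H, W) batch of images or feature maps.
    flow:   (N, 2, H, W) per-pixel offsets in pixel units, ordered (dx, dy).
    """
    n, _, h, w = source.shape
    # Base sampling grid of absolute pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=source.dtype, device=source.device),
        torch.arange(w, dtype=source.dtype, device=source.device),
        indexing="ij",
    )
    base = torch.stack((xs, ys), dim=0).unsqueeze(0)  # (1, 2, H, W)
    coords = base + flow                              # displaced coordinates
    # Normalise coordinates to [-1, 1], as grid_sample expects.
    coords_x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    grid = torch.stack((coords_x, coords_y), dim=-1)  # (N, H, W, 2)
    return F.grid_sample(source, grid, align_corners=True)
```

Because the warp copies pixel values directly from the source, the downstream generator only has to synthesise regions the flow cannot explain (e.g. disocclusions), which is consistent with the abstract's claim that a compact flow estimator such as a shallow Vision Transformer can suffice.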


Published in: Pattern Recognition and Artificial Intelligence (ICPRAI 2022)
Series: Lecture Notes in Computer Science, vol. 13364
ISBN: 9783031092817 (print); 9783031092824 (electronic)
ISSN: 0302-9743 (print); 1611-3349 (electronic)
Published: Cham: Springer International Publishing, 29 May 2022
DOI: 10.1007/978-3-031-09282-4_21

URI: https://cronfa.swan.ac.uk/Record/cronfa59668