Journal article 892 views 376 downloads

A Work Efficient Parallel Algorithm for Exact Euclidean Distance Transform

Manduhu Manduhu, Mark Jones

IEEE Transactions on Image Processing, Volume: 28, Issue: 11, Pages: 5322 - 5335

Swansea University Author: Mark Jones

Abstract

A fully-parallelized work-time optimal algorithm is presented for computing the exact Euclidean Distance Transform (EDT) of a 2D binary image with the size of n x n. Unlike existing PRAM and other algorithms, this algorithm is suitable for implementation on modern SIMD architectures such as GPUs. As...
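
For orientation, the quantity being computed can be pinned down with a brute-force reference. This is only a sketch of the definition, not the algorithm the article presents: it assumes nonzero pixels are the feature pixels (conventions differ between papers), returns squared distances to stay in integers, and the function name `brute_force_edt_squared` is hypothetical.

```python
from typing import List, Optional


def brute_force_edt_squared(image: List[List[int]]) -> List[List[Optional[int]]]:
    """Squared Euclidean distance from every pixel to its nearest feature pixel.

    Assumption: nonzero entries are feature pixels. If the image contains no
    feature pixel, every entry of the result is None.
    """
    # Collect the coordinates of all feature pixels once.
    features = [
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value
    ]
    result: List[List[Optional[int]]] = []
    for r, row in enumerate(image):
        out_row: List[Optional[int]] = []
        for c in range(len(row)):
            if features:
                # Exhaustive search over all feature pixels.
                out_row.append(
                    min((r - fr) ** 2 + (c - fc) ** 2 for fr, fc in features)
                )
            else:
                out_row.append(None)
        result.append(out_row)
    return result


image = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
dist2 = brute_force_edt_squared(image)
```

With N pixels and F feature pixels this reference does O(N * F) work, which is why it serves only as a correctness check against faster methods.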

Published in: IEEE Transactions on Image Processing
ISSN: 1057-7149 1941-0042
Published: 2019

URI: https://cronfa.swan.ac.uk/Record/cronfa50104
first_indexed 2019-05-09T20:00:51Z
last_indexed 2023-02-22T03:57:47Z
id cronfa50104
recordtype SURis
fullrecord <?xml version="1.0"?><rfc1807><datestamp>2023-02-21T16:12:43.3167501</datestamp><bib-version>v2</bib-version><id>50104</id><entry>2019-04-29</entry><title>A Work Efficient Parallel Algorithm for Exact Euclidean Distance Transform</title><swanseaauthors><author><sid>2e1030b6e14fc9debd5d5ae7cc335562</sid><ORCID>0000-0001-8991-1190</ORCID><firstname>Mark</firstname><surname>Jones</surname><name>Mark Jones</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2019-04-29</date><deptcode>SCS</deptcode><abstract>A fully-parallelized work-time optimal algorithm is presented for computing the exact Euclidean Distance Transform (EDT) of a 2D binary image with the size of n x n. Unlike existing PRAM and other algorithms, this algorithm is suitable for implementation on modern SIMD architectures such as GPUs. As a fundamental operation of 2D EDT, 1D EDT is efficiently parallelized first. Specifically, the GPU algorithm for the 1D EDT, which uses CUDA binary functions such as ballot(), ffs(), clz() and shfl(), runs in O(log_32n) time and performs O(n) work. Using the 1D EDT as a fundamental operation, the fully parallelized work-time optimal 2D EDT algorithm is designed. This algorithm consists of three steps. Step 1 of the algorithm runs in O(log_32n) time and performs O(N) (N=n^2) of total work on GPU. Step 2 performs O(N) of total work and has an expected time complexity of O(logn) on GPU. Step 3 runs in O(log_32n) time and performs O(N) of total work on GPU. As far as we know, this algorithm is the first fully-parallelized and realized work-time optimal algorithm for GPUs. 
Experimental results show that this algorithm outperforms prior state-of-the-art GPU algorithms.</abstract><type>Journal Article</type><journal>IEEE Transactions on Image Processing</journal><volume>28</volume><journalNumber>11</journalNumber><paginationStart>5322</paginationStart><paginationEnd>5335</paginationEnd><publisher/><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint>1057-7149</issnPrint><issnElectronic>1941-0042</issnElectronic><keywords/><publishedDay>20</publishedDay><publishedMonth>5</publishedMonth><publishedYear>2019</publishedYear><publishedDate>2019-05-20</publishedDate><doi>10.1109/TIP.2019.2916741</doi><url/><notes/><college>COLLEGE NANME</college><department>Computer Science</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>SCS</DepartmentCode><institution>Swansea University</institution><apcterm/><funders/><projectreference/><lastEdited>2023-02-21T16:12:43.3167501</lastEdited><Created>2019-04-29T10:01:47.7060198</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer Science</level></path><authors><author><firstname>Manduhu</firstname><surname>Manduhu</surname><order>1</order></author><author><firstname>Mark</firstname><surname>Jones</surname><orcid>0000-0001-8991-1190</orcid><order>2</order></author></authors><documents><document><filename>0050104-07052019142411.pdf</filename><originalFilename>2019_ParallelEDT.pdf</originalFilename><uploaded>2019-05-07T14:24:11.1300000</uploaded><type>Output</type><contentLength>3078717</contentLength><contentType>application/pdf</contentType><version>Accepted Manuscript</version><cronfaStatus>true</cronfaStatus><embargoDate>2019-06-20T00:00:00.0000000</embargoDate><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents><OutputDurs/></rfc1807>
title A Work Efficient Parallel Algorithm for Exact Euclidean Distance Transform
author_id_str_mv 2e1030b6e14fc9debd5d5ae7cc335562
author_id_fullname_str_mv 2e1030b6e14fc9debd5d5ae7cc335562_***_Mark Jones
author Mark Jones
author2 Manduhu Manduhu
Mark Jones
format Journal article
container_title IEEE Transactions on Image Processing
container_volume 28
container_issue 11
container_start_page 5322
publishDate 2019
institution Swansea University
issn 1057-7149
1941-0042
doi_str_mv 10.1109/TIP.2019.2916741
college_str Faculty of Science and Engineering
hierarchytype
hierarchy_top_id facultyofscienceandengineering
hierarchy_top_title Faculty of Science and Engineering
hierarchy_parent_id facultyofscienceandengineering
hierarchy_parent_title Faculty of Science and Engineering
department_str School of Mathematics and Computer Science - Computer Science, Faculty of Science and Engineering
document_store_str 1
active_str 0
description A fully parallelized, work-time optimal algorithm is presented for computing the exact Euclidean Distance Transform (EDT) of a 2D binary image of size n x n. Unlike existing PRAM and other algorithms, this algorithm is suitable for implementation on modern SIMD architectures such as GPUs. As a fundamental operation of the 2D EDT, the 1D EDT is efficiently parallelized first. Specifically, the GPU algorithm for the 1D EDT, which uses CUDA binary functions such as ballot(), ffs(), clz() and shfl(), runs in O(log_32 n) time and performs O(n) work. Using the 1D EDT as a fundamental operation, the fully parallelized, work-time optimal 2D EDT algorithm is designed. This algorithm consists of three steps. Step 1 runs in O(log_32 n) time and performs O(N) total work on the GPU, where N = n^2. Step 2 performs O(N) total work and has an expected time complexity of O(log n) on the GPU. Step 3 runs in O(log_32 n) time and performs O(N) total work on the GPU. As far as we know, this algorithm is the first fully parallelized and realized work-time optimal algorithm for GPUs. Experimental results show that this algorithm outperforms prior state-of-the-art GPU algorithms.
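
The bit-level idea behind the 1D EDT can be illustrated on a single 32-pixel word. The sketch below is a sequential Python emulation and not the authors' kernel: it assumes set bits mark feature pixels, that the word is packed the way a warp vote such as ballot() would pack one bit per lane, and that ffs()- and clz()-style operations locate the nearest set bit on each side. The names `ballot` and `nearest_feature_distances` are hypothetical.

```python
from typing import List, Optional

WORD_BITS = 32  # one bit per lane; 32 is the CUDA warp size


def ballot(pixels: List[int]) -> int:
    """Pack up to 32 pixels into a word: bit i is set when pixels[i] is nonzero."""
    return sum(1 << i for i, p in enumerate(pixels[:WORD_BITS]) if p)


def nearest_feature_distances(mask: int) -> List[Optional[int]]:
    """Distance from each of the 32 positions to the nearest set bit in mask.

    Returns None for every position when mask has no set bit.
    """
    distances: List[Optional[int]] = []
    for lane in range(WORD_BITS):
        # Bits at positions <= lane; the highest one is the nearest on the left
        # (a clz-style query on real hardware).
        left = mask & ((1 << (lane + 1)) - 1)
        # Bits at positions >= lane, shifted so that lane becomes bit 0; the
        # lowest one is the nearest on the right (an ffs-style query).
        right = mask >> lane
        candidates = []
        if left:
            candidates.append(lane - (left.bit_length() - 1))
        if right:
            candidates.append((right & -right).bit_length() - 1)
        distances.append(min(candidates) if candidates else None)
    return distances


pixels = [0] * WORD_BITS
pixels[3] = 1
pixels[10] = 1
d = nearest_feature_distances(ballot(pixels))
```

Each position needs only a constant number of bit operations on the shared word, so 32 pixels are resolved in one step. That is presumably where the base-32 logarithm in the stated time bounds comes from, although the record itself does not spell this out.
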
published_date 2019-05-20T04:01:25Z
_version_ 1763753156451237888
score 11.013148