Journal article 917 views 281 downloads
Efficient Memory Arbitration in High-Level Synthesis From Multi-Threaded Code
Jianyi Cheng
,
Shane Fleming,
Yu Ting Chen,
Jason Anderson,
John Wickerson
,
George A. Constantinides
IEEE Transactions on Computers, Volume: 71, Issue: 4, Pages: 933 - 946
Swansea University Author: Shane Fleming
-
PDF | Accepted Manuscript
Download (1.31MB)
DOI (Published version): 10.1109/tc.2021.3066466
Abstract
High-level synthesis (HLS) is an increasingly popular method for generating hardware from a description written in a software language like C/C++. Traditionally, HLS tools have operated on sequential code, however, in recent years there has been a drive to synthesise multi-threaded code. In this con...
Published in: | IEEE Transactions on Computers |
---|---|
ISSN: | 0018-9340 1557-9956 |
Published: |
Institute of Electrical and Electronics Engineers (IEEE)
2022
|
Online Access: |
Check full text
|
URI: | https://cronfa.swan.ac.uk/Record/cronfa56451 |
first_indexed |
2021-03-16T09:53:44Z |
---|---|
last_indexed |
2025-01-16T14:08:32Z |
id |
cronfa56451 |
recordtype |
SURis |
fullrecord |
<?xml version="1.0"?><rfc1807><datestamp>2025-01-16T10:42:31.0660042</datestamp><bib-version>v2</bib-version><id>56451</id><entry>2021-03-16</entry><title>Efficient Memory Arbitration in High-Level Synthesis From Multi-Threaded Code</title><swanseaauthors><author><sid>fe23ad3ebacc194b4f4c480fdde55b95</sid><firstname>Shane</firstname><surname>Fleming</surname><name>Shane Fleming</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2021-03-16</date><deptcode>MACS</deptcode><abstract>High-level synthesis (HLS) is an increasingly popular method for generating hardware from a description written in a software language like C/C++. Traditionally, HLS tools have operated on sequential code, however, in recent years there has been a drive to synthesise multi-threaded code. In this context, a major challenge facing HLS tools is how to automatically partition memory among parallel threads to fully exploit the bandwidth available on an FPGA device and minimise memory contention. Existingpartitioning approaches require inefficient arbitration circuitry to serialise accesses to each bank because they make conservative assumptions about which threads might access which memory banks. In this article, we design a static analysis that can prove certain memory banks are only accessed by certain threads, and use this analysis to simplify or even remove the arbiters while preserving correctness. We show how this analysis can be implemented using the Microsoft Boogie verifier on top of satisfiability modulo theories (SMT) solver, and propose a tool named EASY using automatic formal verification. Our work supports arbitrary input code with any irregular memory access patterns and indirect array addressing forms. We implement our approach in LLVM and integrate it into the LegUp HLS tool. For a set of typical application benchmarks our results have shown that EASY can achieve 0.13×(avg. 0.43×) of area and 1.64×(avg. 1.28×) of performance compared to the baseline, with little additional compilation time relative to the long time in hardware synthesis.</abstract><type>Journal Article</type><journal>IEEE Transactions on Computers</journal><volume>71</volume><journalNumber>4</journalNumber><paginationStart>933</paginationStart><paginationEnd>946</paginationEnd><publisher>Institute of Electrical and Electronics Engineers (IEEE)</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint>0018-9340</issnPrint><issnElectronic>1557-9956</issnElectronic><keywords/><publishedDay>1</publishedDay><publishedMonth>4</publishedMonth><publishedYear>2022</publishedYear><publishedDate>2022-04-01</publishedDate><doi>10.1109/tc.2021.3066466</doi><url/><notes/><college>COLLEGE NANME</college><department>Mathematics and Computer Science School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MACS</DepartmentCode><institution>Swansea University</institution><apcterm>Not Required</apcterm><funders/><projectreference/><lastEdited>2025-01-16T10:42:31.0660042</lastEdited><Created>2021-03-16T09:43:03.4419052</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer Science</level></path><authors><author><firstname>Jianyi</firstname><surname>Cheng</surname><orcid>0000-0003-2791-2555</orcid><order>1</order></author><author><firstname>Shane</firstname><surname>Fleming</surname><order>2</order></author><author><firstname>Yu Ting</firstname><surname>Chen</surname><order>3</order></author><author><firstname>Jason</firstname><surname>Anderson</surname><order>4</order></author><author><firstname>John</firstname><surname>Wickerson</surname><orcid>0000-0001-6735-5533</orcid><order>5</order></author><author><firstname>George A.</firstname><surname>Constantinides</surname><orcid>0000-0002-0201-310x</orcid><order>6</order></author></authors><documents><document><filename>56451__19590__b98976cce3e742f28f31716234b1d45c.pdf</filename><originalFilename>56451.pdf</originalFilename><uploaded>2021-03-29T15:30:17.6267430</uploaded><type>Output</type><contentLength>1370792</contentLength><contentType>application/pdf</contentType><version>Accepted Manuscript</version><cronfaStatus>true</cronfaStatus><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents><OutputDurs/></rfc1807> |
spelling |
2025-01-16T10:42:31.0660042 v2 56451 2021-03-16 Efficient Memory Arbitration in High-Level Synthesis From Multi-Threaded Code fe23ad3ebacc194b4f4c480fdde55b95 Shane Fleming Shane Fleming true false 2021-03-16 MACS High-level synthesis (HLS) is an increasingly popular method for generating hardware from a description written in a software language like C/C++. Traditionally, HLS tools have operated on sequential code, however, in recent years there has been a drive to synthesise multi-threaded code. In this context, a major challenge facing HLS tools is how to automatically partition memory among parallel threads to fully exploit the bandwidth available on an FPGA device and minimise memory contention. Existingpartitioning approaches require inefficient arbitration circuitry to serialise accesses to each bank because they make conservative assumptions about which threads might access which memory banks. In this article, we design a static analysis that can prove certain memory banks are only accessed by certain threads, and use this analysis to simplify or even remove the arbiters while preserving correctness. We show how this analysis can be implemented using the Microsoft Boogie verifier on top of satisfiability modulo theories (SMT) solver, and propose a tool named EASY using automatic formal verification. Our work supports arbitrary input code with any irregular memory access patterns and indirect array addressing forms. We implement our approach in LLVM and integrate it into the LegUp HLS tool. For a set of typical application benchmarks our results have shown that EASY can achieve 0.13×(avg. 0.43×) of area and 1.64×(avg. 1.28×) of performance compared to the baseline, with little additional compilation time relative to the long time in hardware synthesis. Journal Article IEEE Transactions on Computers 71 4 933 946 Institute of Electrical and Electronics Engineers (IEEE) 0018-9340 1557-9956 1 4 2022 2022-04-01 10.1109/tc.2021.3066466 COLLEGE NANME Mathematics and Computer Science School COLLEGE CODE MACS Swansea University Not Required 2025-01-16T10:42:31.0660042 2021-03-16T09:43:03.4419052 Faculty of Science and Engineering School of Mathematics and Computer Science - Computer Science Jianyi Cheng 0000-0003-2791-2555 1 Shane Fleming 2 Yu Ting Chen 3 Jason Anderson 4 John Wickerson 0000-0001-6735-5533 5 George A. Constantinides 0000-0002-0201-310x 6 56451__19590__b98976cce3e742f28f31716234b1d45c.pdf 56451.pdf 2021-03-29T15:30:17.6267430 Output 1370792 application/pdf Accepted Manuscript true true eng |
title |
Efficient Memory Arbitration in High-Level Synthesis From Multi-Threaded Code |
spellingShingle |
Efficient Memory Arbitration in High-Level Synthesis From Multi-Threaded Code Shane Fleming |
title_short |
Efficient Memory Arbitration in High-Level Synthesis From Multi-Threaded Code |
title_full |
Efficient Memory Arbitration in High-Level Synthesis From Multi-Threaded Code |
title_fullStr |
Efficient Memory Arbitration in High-Level Synthesis From Multi-Threaded Code |
title_full_unstemmed |
Efficient Memory Arbitration in High-Level Synthesis From Multi-Threaded Code |
title_sort |
Efficient Memory Arbitration in High-Level Synthesis From Multi-Threaded Code |
author_id_str_mv |
fe23ad3ebacc194b4f4c480fdde55b95 |
author_id_fullname_str_mv |
fe23ad3ebacc194b4f4c480fdde55b95_***_Shane Fleming |
author |
Shane Fleming |
author2 |
Jianyi Cheng Shane Fleming Yu Ting Chen Jason Anderson John Wickerson George A. Constantinides |
format |
Journal article |
container_title |
IEEE Transactions on Computers |
container_volume |
71 |
container_issue |
4 |
container_start_page |
933 |
publishDate |
2022 |
institution |
Swansea University |
issn |
0018-9340 1557-9956 |
doi_str_mv |
10.1109/tc.2021.3066466 |
publisher |
Institute of Electrical and Electronics Engineers (IEEE) |
college_str |
Faculty of Science and Engineering |
hierarchytype |
|
hierarchy_top_id |
facultyofscienceandengineering |
hierarchy_top_title |
Faculty of Science and Engineering |
hierarchy_parent_id |
facultyofscienceandengineering |
hierarchy_parent_title |
Faculty of Science and Engineering |
department_str |
School of Mathematics and Computer Science - Computer Science{{{_:::_}}}Faculty of Science and Engineering{{{_:::_}}}School of Mathematics and Computer Science - Computer Science |
document_store_str |
1 |
active_str |
0 |
description |
High-level synthesis (HLS) is an increasingly popular method for generating hardware from a description written in a software language like C/C++. Traditionally, HLS tools have operated on sequential code, however, in recent years there has been a drive to synthesise multi-threaded code. In this context, a major challenge facing HLS tools is how to automatically partition memory among parallel threads to fully exploit the bandwidth available on an FPGA device and minimise memory contention. Existingpartitioning approaches require inefficient arbitration circuitry to serialise accesses to each bank because they make conservative assumptions about which threads might access which memory banks. In this article, we design a static analysis that can prove certain memory banks are only accessed by certain threads, and use this analysis to simplify or even remove the arbiters while preserving correctness. We show how this analysis can be implemented using the Microsoft Boogie verifier on top of satisfiability modulo theories (SMT) solver, and propose a tool named EASY using automatic formal verification. Our work supports arbitrary input code with any irregular memory access patterns and indirect array addressing forms. We implement our approach in LLVM and integrate it into the LegUp HLS tool. For a set of typical application benchmarks our results have shown that EASY can achieve 0.13×(avg. 0.43×) of area and 1.64×(avg. 1.28×) of performance compared to the baseline, with little additional compilation time relative to the long time in hardware synthesis. |
published_date |
2022-04-01T07:46:57Z |
_version_ |
1827460787690536960 |
score |
11.055049 |