Journal article
Traffic signal control using reinforcement learning based on the teacher-student framework
Expert Systems with Applications, Volume: 228, Start page: 120458
Swansea University Author: Scott Yang
PDF | Accepted Manuscript
DOI (Published version): 10.1016/j.eswa.2023.120458
Abstract
Reinforcement Learning (RL) is an effective method for adaptive traffic signal control. As one type of RL, the teacher-student framework has been found helpful in improving model performance in different application fields (such as robot control, games, and hybrid intelligence), but it is rarely ap...
| Published in: | Expert Systems with Applications |
|---|---|
| ISSN: | 0957-4174 |
| Published: | Elsevier BV, 2023 |
| URI: | https://cronfa.swan.ac.uk/Record/cronfa63462 |
first_indexed |
2023-05-16T09:20:48Z |
---|---|
last_indexed |
2024-11-15T18:01:38Z |
id |
cronfa63462 |
recordtype |
SURis |
fullrecord |
<?xml version="1.0"?><rfc1807><datestamp>2024-07-29T14:25:27.7114512</datestamp><bib-version>v2</bib-version><id>63462</id><entry>2023-05-16</entry><title>Traffic signal control using reinforcement learning based on the teacher-student framework</title><swanseaauthors><author><sid>81dc663ca0e68c60908d35b1d2ec3a9b</sid><ORCID>0000-0002-6618-7483</ORCID><firstname>Scott</firstname><surname>Yang</surname><name>Scott Yang</name><active>true</active><ethesisStudent>false</ethesisStudent></author></swanseaauthors><date>2023-05-16</date><deptcode>MACS</deptcode><abstract>Reinforcement Learning (RL) is an effective method for adaptive traffic signals control. As one type of RL, the teacher-student framework has been found helpful in improving the model performance for different application fields (such as robot control, game, hybrid intelligence), but it is rarely applied for traffic control due to that the hyper-parameters and the number of state-action pairs experienced are difficult to determine. In this work, the teacher-student framework is used for traffic signal control, where only a single reward function is designed to guide the student agent and by using this method the number of hyper-parameters and the model complexity are reduced. Specifically, the teacher agent uses an importance function to evaluate and guide the student, where the importance function combines with environment reward to form a synthetic reward for the student agent. 
Experimental results under different traffic environments show that the proposed method achieves the expected performance enhancement and is better than most of the state-of-the-art RL-based traffic signal control methods.</abstract><type>Journal Article</type><journal>Expert Systems with Applications</journal><volume>228</volume><journalNumber/><paginationStart>120458</paginationStart><paginationEnd/><publisher>Elsevier BV</publisher><placeOfPublication/><isbnPrint/><isbnElectronic/><issnPrint>0957-4174</issnPrint><issnElectronic/><keywords/><publishedDay>15</publishedDay><publishedMonth>10</publishedMonth><publishedYear>2023</publishedYear><publishedDate>2023-10-15</publishedDate><doi>10.1016/j.eswa.2023.120458</doi><url/><notes/><college>COLLEGE NANME</college><department>Mathematics and Computer Science School</department><CollegeCode>COLLEGE CODE</CollegeCode><DepartmentCode>MACS</DepartmentCode><institution>Swansea University</institution><apcterm/><funders>This research is supported by the National Natural Science Foundation of China under Grant 61976063, the Guangxi Natural Science Foundation under Grant 2022GXNSFFA035028, research fund of Guangxi Normal University under Grant 2021JC006, the AI+Education research project of Guangxi Humanities Society Science Development Research Center under Grant ZXZJ202205.</funders><projectreference/><lastEdited>2024-07-29T14:25:27.7114512</lastEdited><Created>2023-05-16T10:18:56.1178393</Created><path><level id="1">Faculty of Science and Engineering</level><level id="2">School of Mathematics and Computer Science - Computer 
Science</level></path><authors><author><firstname>Junxiu</firstname><surname>Liu</surname><order>1</order></author><author><firstname>Sheng</firstname><surname>Qin</surname><orcid>0000-0001-7348-901x</orcid><order>2</order></author><author><firstname>Min</firstname><surname>Su</surname><order>3</order></author><author><firstname>Yuling</firstname><surname>Luo</surname><orcid>0000-0002-0117-4614</orcid><order>4</order></author><author><firstname>Shunsheng</firstname><surname>Zhang</surname><order>5</order></author><author><firstname>Yanhu</firstname><surname>Wang</surname><order>6</order></author><author><firstname>Scott</firstname><surname>Yang</surname><orcid>0000-0002-6618-7483</orcid><order>7</order></author></authors><documents><document><filename>63462__27484__46f45335580f42078d1293f83bdd79da.pdf</filename><originalFilename>63462.pdf</originalFilename><uploaded>2023-05-16T12:00:15.9943888</uploaded><type>Output</type><contentLength>685373</contentLength><contentType>application/pdf</contentType><version>Accepted Manuscript</version><cronfaStatus>true</cronfaStatus><embargoDate>2024-05-12T00:00:00.0000000</embargoDate><copyrightCorrect>true</copyrightCorrect><language>eng</language></document></documents><OutputDurs/></rfc1807> |
title |
Traffic signal control using reinforcement learning based on the teacher-student framework |
author_id_str_mv |
81dc663ca0e68c60908d35b1d2ec3a9b |
author_id_fullname_str_mv |
81dc663ca0e68c60908d35b1d2ec3a9b_***_Scott Yang |
author |
Scott Yang |
author2 |
Junxiu Liu Sheng Qin Min Su Yuling Luo Shunsheng Zhang Yanhu Wang Scott Yang |
format |
Journal article |
container_title |
Expert Systems with Applications |
container_volume |
228 |
container_start_page |
120458 |
publishDate |
2023 |
institution |
Swansea University |
issn |
0957-4174 |
doi_str_mv |
10.1016/j.eswa.2023.120458 |
publisher |
Elsevier BV |
college_str |
Faculty of Science and Engineering |
hierarchytype |
|
hierarchy_top_id |
facultyofscienceandengineering |
hierarchy_top_title |
Faculty of Science and Engineering |
hierarchy_parent_id |
facultyofscienceandengineering |
hierarchy_parent_title |
Faculty of Science and Engineering |
department_str |
School of Mathematics and Computer Science - Computer Science, Faculty of Science and Engineering |
document_store_str |
1 |
active_str |
0 |
description |
Reinforcement Learning (RL) is an effective method for adaptive traffic signal control. As one type of RL, the teacher-student framework has been found helpful in improving model performance in different application fields (such as robot control, games, and hybrid intelligence), but it is rarely applied to traffic control because the hyper-parameters and the number of state-action pairs experienced are difficult to determine. In this work, the teacher-student framework is used for traffic signal control, where only a single reward function is designed to guide the student agent; this reduces the number of hyper-parameters and the model complexity. Specifically, the teacher agent uses an importance function to evaluate and guide the student, and the importance function is combined with the environment reward to form a synthetic reward for the student agent. Experimental results under different traffic environments show that the proposed method achieves the expected performance enhancement and outperforms most state-of-the-art RL-based traffic signal control methods. |
published_date |
2023-10-15T20:22:16Z |