
Conference Paper/Proceeding/Abstract

Random Matrix Theory for Stochastic Gradient Descent

Chanju Park, Matteo Favoni, Biagio Lucini, Gert Aarts

Proceedings of The 41st International Symposium on Lattice Field Theory — PoS(LATTICE2024), Volume: 466, Start page: 031

Swansea University Authors: Matteo Favoni, Gert Aarts

  • 70970.VoR.pdf

    PDF | Version of Record

    ©Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).

    Download (954.7KB)


DOI (Published version): 10.22323/1.466.0031

Abstract

Investigating the dynamics of learning in machine learning algorithms is of paramount importance for understanding how and why an approach may be successful. The tools of physics and statistics provide a robust setting for such investigations. Here, we apply concepts from random matrix theory to describe stochastic weight matrix dynamics, using the framework of Dyson Brownian motion. We derive the linear scaling rule between the learning rate (step size) and the batch size, and identify universal and non-universal aspects of weight matrix dynamics. We test our findings in the (near-)solvable case of the Gaussian Restricted Boltzmann Machine and in a linear one-hidden-layer neural network.
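The linear scaling rule mentioned in the abstract can be illustrated with a minimal, self-contained sketch. This is not the paper's random-matrix derivation: it uses only the standard leading-order (Ornstein–Uhlenbeck) approximation of SGD on a quadratic loss, and all function names and parameter choices below are our own.

```python
def sgd_noise_variance(eta, batch_size, grad_var=1.0):
    """Leading-order variance injected per SGD step.

    If per-sample gradient noise has variance grad_var, the minibatch
    gradient has variance grad_var / batch_size, so one update
    w -> w - eta * g adds variance eta**2 * grad_var / batch_size.
    """
    return eta**2 * grad_var / batch_size

def stationary_variance(eta, batch_size, curvature=1.0, grad_var=1.0):
    """Leading-order stationary variance of SGD on the quadratic loss
    L(w) = curvature * w**2 / 2 (Ornstein-Uhlenbeck approximation):

        Var_inf ~ eta * grad_var / (2 * curvature * batch_size)

    which depends on eta and batch_size only through the ratio
    eta / batch_size -- the linear scaling rule.
    """
    return eta * grad_var / (2 * curvature * batch_size)

# Doubling both the learning rate and the batch size leaves the
# stationary fluctuations unchanged at leading order.
v1 = stationary_variance(eta=0.01, batch_size=16)
v2 = stationary_variance(eta=0.02, batch_size=32)
print(v1, v2)  # identical at this order
```

Here keeping eta / batch_size fixed preserves the stationary noise level, which is the content of the linear scaling rule the paper derives in the more general stochastic weight-matrix setting.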


Published in: Proceedings of The 41st International Symposium on Lattice Field Theory — PoS(LATTICE2024)
ISSN: 1824-8039
Published: Trieste, Italy: Sissa Medialab, 18 December 2025
Online Access: Check full text

URI: https://cronfa.swan.ac.uk/Record/cronfa70970
Funders: GA, MF and BL are supported by STFC Consolidated Grant ST/X000648/1. BL is further supported by the UKRI EPSRC ExCALIBUR ExaTEPP project EP/X017168/1. CP is supported by the UKRI AIMLAC CDT EP/S023992/1.