Speaker
Matteo Favoni
(Swansea University)
Description
Random matrix theory was first developed in the context of nuclear physics to investigate the spectral properties of heavy atomic nuclei. This theory is well suited to applications in machine learning, specifically to studying the properties of the weight matrices of learning algorithms.
In this presentation, we report the effect of varying the learning rate and the batch size used in the stochastic gradient descent algorithm, explore the role of hidden layers in teacher-student models, show that the weight-matrix eigenvalues characterizing well-trained models are distributed according to Wigner's surmise and Wigner's semicircle law, and discuss preliminary results for neural networks.
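As a minimal illustration of the Wigner semicircle law referenced above (not the authors' own analysis code), the sketch below draws a large random symmetric matrix from the Gaussian Orthogonal Ensemble with NumPy, rescales its eigenvalues, and checks the empirical spectral distribution against the semicircle density. The matrix size `n` and the interval used for the comparison are arbitrary choices for the demonstration.

```python
import numpy as np

def goe_eigenvalues(n, rng):
    """Eigenvalues of an n x n GOE matrix: symmetrize a Gaussian matrix
    so off-diagonal entries have unit variance."""
    a = rng.standard_normal((n, n))
    h = (a + a.T) / np.sqrt(2.0)
    return np.linalg.eigvalsh(h)

rng = np.random.default_rng(0)
n = 1000
# Rescale by sqrt(n) so the limiting spectrum is supported on [-2, 2]
lam = goe_eigenvalues(n, rng) / np.sqrt(n)

# Wigner semicircle density rho(x) = sqrt(4 - x^2) / (2*pi) on [-2, 2]
x = np.linspace(-2.0, 2.0, 201)
rho = np.sqrt(4.0 - x**2) / (2.0 * np.pi)

# Compare the fraction of eigenvalues in [-1, 1] with the semicircle prediction
mask = np.abs(x) <= 1.0
emp = np.mean(np.abs(lam) <= 1.0)
pred = np.trapz(rho[mask], x[mask])
print(f"empirical fraction {emp:.3f} vs semicircle prediction {pred:.3f}")
```

For well-trained models, the talk reports that weight-matrix eigenvalues follow exactly these random-matrix distributions, so a comparison of this kind (applied to the model's weight matrices rather than a synthetic GOE sample) is the natural diagnostic.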
Primary authors
Biagio Lucini
(Swansea University)
Chanju Park
(Swansea University)
Gert Aarts
(Swansea University)
Matteo Favoni
(Swansea University)