
PhD Defence Gabriel Clara | Regularization Through Noise - A Study of Algorithmic Randomness in Gradient Descent Training


The PhD defence of Gabriel Clara will take place in the Waaier building of the University of Twente and can be followed via a live stream.

Gabriel Clara is a PhD student in the Department of Mathematics of Operations Research. His promotors are prof.dr. A.J. Schmidt-Hieber and dr.rer.nat. S. Langer from the Faculty of Science & Technology.

Machine learning is at the heart of the ongoing artificial intelligence (AI) revolution, which has seen widespread adoption of AI tools in academia, industry, and society at large. The rapid rise of AI, both as a buzzword and as a product in its own right, has spawned an entire industry of researchers and engineers working towards ever better AI models. Yet while the empirical success of AI is beyond doubt, the mathematical foundations of the underlying machine learning methods remain poorly understood.

Broadly speaking, machine learning models identify and express relationships between patterns found in data. These models are usually trained on a stream of data observations by optimizing some measure of how well the model approximates the patterns present in the observed data. An effective mathematical theory of machine learning should then answer two related questions: (1) How do the models approximate patterns in data? (2) When do the patterns captured by the model in the observed data generalize to previously unseen data? As a mathematical field, machine learning theory lies at the intersection of high-dimensional statistics, approximation theory, and optimization.
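To make the "training by optimization" described above concrete, the minimal sketch below fits a model with plain gradient descent on a squared-error loss. The linear model, synthetic data, and step size are illustrative assumptions made for this example, not details taken from the dissertation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "observed data": a linear pattern plus small label noise
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)

# Model: y ≈ X w; the loss is the mean squared error, a measure of fit
w = np.zeros(3)
lr = 0.1  # step size (learning rate)
for _ in range(200):
    grad = X.T @ (X @ w - y) / len(y)  # gradient of the mean squared error
    w -= lr * grad                     # gradient descent step

print(w)  # ends up close to w_true: the model has captured the pattern
```

Here the two theoretical questions take a tangible form: approximation asks how well `X @ w` can express the pattern in the data, while generalization asks whether the learned `w` also predicts well on fresh draws of `X`.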

This dissertation investigates algorithmic regularization techniques employed during the training of machine learning models, focusing on the effect of randomness present in the training process. This randomness may come from stochastic approximations used to speed up training, such as mini-batch sampling, or may be injected deliberately on top of pre-existing randomness to achieve a specific regularizing effect. The regularization induced by such choices offers a possible explanation for the effectiveness of large models on complex tasks, tying the obtained results to the wider goal of understanding the mathematical foundations of machine learning.
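The two sources of algorithmic randomness mentioned above can be sketched in a few lines: mini-batch sampling replaces the full gradient with a noisy estimate, and additional Gaussian noise can be injected on top. This is a hedged illustration only; the linear model and all constants are assumptions made for the example, not the dissertation's setting:

```python
import numpy as np

rng = np.random.default_rng(1)

# Same kind of synthetic linear-regression data as before
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=200)

w = np.zeros(3)
lr = 0.05
sigma = 0.01  # scale of the deliberately injected gradient noise

for _ in range(2000):
    # Randomness 1: a mini-batch gradient is a stochastic approximation
    # of the full gradient, computed on 20 of the 200 observations
    batch = rng.choice(len(y), size=20, replace=False)
    grad = X[batch].T @ (X[batch] @ w - y[batch]) / len(batch)
    # Randomness 2: Gaussian noise injected on top of the mini-batch noise
    grad += sigma * rng.normal(size=3)
    w -= lr * grad

print(w)  # fluctuates in a small neighbourhood of w_true
```

The iterates no longer converge to a single point but hover around the minimizer; characterizing how this hovering biases the solution, and when that bias acts as a regularizer, is the kind of question the dissertation addresses.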