Speaker: Maximilian Engel (Freie Universität Berlin)
Time: 2026-04-29 15:00–16:00
Venue: online (Tencent Meeting ID: 489-202-157)
Abstract:
I will summarize recent findings on how notions and methods from random dynamical systems and ergodic theory can be used to investigate two important problems in deep neural networks. Firstly, in joint work with Dennis Chemnitz (now ETH), we characterize global minima of overparameterized training tasks as dynamically stable or unstable for (stochastic) gradient descent, (S)GD for short, via their respective characteristic Lyapunov exponents. We rigorously prove that the sign of this Lyapunov exponent determines whether (S)GD can accumulate at the respective global minimum, relating to the question of generalization of such parameter solutions. Secondly, in joint work with Anna Shalova (UvA), we introduce the Random Quadratic Form model, motivated by the study of the role of linear layers in transformers, and prove synchronization by common noise for such simplified models. In particular, we provide an alternative explanation, independent of self-attention, of the clustering behaviour in deep transformers.
Bio:
Maximilian Engel received his PhD in 2018 under the supervision of Jeroen Lamb and Martin Rasmussen at Imperial College London, focusing on Random Dynamical Systems and Stochastic Bifurcation Theory. After a postdoctoral fellowship on Multiscale Dynamical Systems with Christian Kuehn at TU Munich from 2018 to 2020, he became a junior research group leader at FU Berlin in 2020. His research group has focused on Random and Multiscale Dynamics, extending also to applications in Atmospheric Dynamics, Interacting Systems and Deep Learning. In 2024 he additionally joined the University of Amsterdam as an assistant professor in Analysis and Probability, where he now holds a tenured position with a new group studying Transient Random Dynamics under an NWO Vidi grant.
Join Tencent Meeting
https://meeting.tencent.com/dm/b0P66OMHfij1
Meeting ID: 489-202-157
