… and yet they do. 🤖✨
Deep learning relies on highly complex loss functions. Training algorithms are supposed to optimize them, but we know they can’t find the global optimum. So why are the results still so impressive?
Our new paper offers a mathematical explanation. 📘🧠 We rigorously prove that deep-learning algorithms don’t actually need to find the true optimum: being close to a local optimum is already enough. In nerdier terms: we show that every reasonable stationary point of certain neural networks — and every point nearby — generalizes essentially as well as the global optimum. 🔍📈
The paper has been accepted at TMLR! 🎉 Find it here.
Huge congratulations to Mahsa and Fang—two rising stars in machine learning. 🌟🌟