Deep-learning algorithms don’t work…

… and yet they do. 🤖✨

Deep learning relies on highly complex loss functions. Training algorithms are supposed to optimize them, but we know they can’t find the true optimum. So why are the results still so impressive?

Our new paper offers a mathematical explanation. 📘🧠 We rigorously prove that deep-learning algorithms don’t actually need to find the true optimum. Being close to a local optimum is already enough. In nerdier terms: We show that every reasonable stationary point of certain neural networks, and all points nearby, generalize essentially as well as the global optimum. 🔍📈
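In schematic form, and with purely illustrative notation of our own (the precise statement, conditions, and error terms are in the paper), the result says that an approximate stationary point $\widehat{\theta}$ of the empirical risk $\widehat{R}_n$ already has a population risk $R$ close to that of a global optimum:

$$
\bigl\|\nabla \widehat{R}_n(\widehat{\theta})\bigr\| \text{ small}
\quad\Longrightarrow\quad
R(\widehat{\theta}) \;\le\; \min_{\theta} R(\theta) \;+\; \varepsilon_n ,
$$

where the slack $\varepsilon_n$ vanishes as the number of training samples $n$ grows.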

The paper has been accepted at TMLR! 🎉 Find it here.

Huge congratulations to Mahsa and Fang, two rising stars in machine learning. 🌟🌟

A New Type of Sparsity for More Efficient Matrix-Matrix Multiplications

We all love sparsity: it makes computations faster, guarantees tighter, and interpretations easier. In our paper, which will appear in TMLR, we introduce a new type of sparsity that we term “cardinality sparsity”. We show that cardinality sparsity has all the usual perks and, more importantly, that it is a very powerful concept for matrix-matrix multiplications: it can speed up such computations and reduce memory usage dramatically. Well done, Ali! 👏👏👏
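To give a flavor of how this can pay off, here is a small, self-contained sketch of ours (not the algorithm from the paper), which reads cardinality sparsity as “each column of B contains only a few distinct values”: the columns of A that share a coefficient are summed first, and each partial sum is multiplied only once.

```python
import numpy as np


def cardinality_sparse_matmul(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Compute A @ B, assuming each column of B has only a few distinct values.

    For every output column, the columns of A that share the same coefficient
    are summed first (cheap additions), and each partial sum is then scaled
    once per distinct value, so the number of scalar multiplications per
    output column scales with the number of distinct values rather than with
    the inner dimension.
    """
    n, m = A.shape
    assert B.shape[0] == m, "inner dimensions must match"
    C = np.zeros((n, B.shape[1]), dtype=np.result_type(A, B))
    for j in range(B.shape[1]):
        col = B[:, j]
        for v in np.unique(col):       # few distinct values by assumption
            if v != 0:                 # zero coefficients contribute nothing
                C[:, j] += v * A[:, col == v].sum(axis=1)
    return C


# Toy check against ordinary matrix multiplication.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))
B = rng.integers(0, 3, size=(6, 5)).astype(float)  # columns only take values in {0, 1, 2}
assert np.allclose(cardinality_sparse_matmul(A, B), A @ B)
```

The snippet is only meant to illustrate why repeated values, not just zeros, can be exploited.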

AI Is Hungry for Data

How many samples are needed to train a deep neural network? Our recent paper explores this question and comes to a clear conclusion: it takes many, many samples. The paper will appear in ICLR 2025. Congratulations to Pegah and Mahsa, who did a wonderful job on this! 🍺🍺🍺