
Statistics Seminar: Arindam Banerjee

Geometry and Generalization in Deep Learning
November 12, 2020 - 10:30am


Speaker: Arindam Banerjee

Title: Geometry and Generalization in Deep Learning

Abstract: The past decade has seen unprecedented empirical success of machine learning based on over-parameterized deep learning models. In this talk, we will first empirically study the geometry of gradients and Hessians of such models and make two key observations: (a) during training, only a small number of gradient components have non-trivial values, implying that learning effectively happens in a subspace; and (b) after convergence, only a small number of Hessian diagonal elements have non-trivial values, implying that the learned model is insensitive ('flat') with respect to most of the parameters.

Using this geometry, we then investigate the following question: given a trained deep network, e.g., a ReLU-net or ResNet, how can we characterize its generalization performance? We first show that high-probability margin bounds can be established for deterministic smooth predictors by derandomizing the classical PAC-Bayes bound for Bayesian predictors. Further, we show that similar bounds can be established for non-smooth predictors, e.g., ReLU-nets, by a careful reduction to the smooth case. We will also present experimental results illustrating the effectiveness of these bounds under changing training set sizes and problem difficulty, as characterized by the number of random labels.


Bio: Arindam Banerjee is a Professor in the Department of Computer Science & Engineering and a Resident Fellow at the Institute on the Environment at the University of Minnesota, Twin Cities. His research interests are in machine learning and data mining, especially problems involving geometry and randomness. His current research focuses on computational and statistical aspects of deep learning and on sequential decision-making problems. His work also addresses applications to complex real-world problems in areas including climate science, ecology, recommendation systems, and finance. He has won several awards, including the NSF CAREER award (2010), the IBM Faculty Award (2013), and six best paper awards in top-tier venues.

A virtual social will take place from 11:30 to 12:00, following this seminar. To join, visit z.umn.edu/STATSeminarSocial
NOTE: You will need to use a Chrome or Firefox browser to access this virtual platform. The link may not open if you sign in from a mobile device or tablet. The virtual space will not open until approximately 11:30.