
An Information-Theoretic Framework for Deep Learning

Hong Jun Jeon; Benjamin Van Roy (NeurIPS 2022)

Authors

Hong Jun Jeon
Department of Computer Science,
Stanford University

Benjamin Van Roy
Department of Electrical Engineering,
Department of Management Science and Engineering,
Stanford University

Abstract

Each year, deep learning demonstrates new and improved empirical results with deeper and wider neural networks. Meanwhile, with existing theoretical frameworks, it is difficult to analyze networks deeper than two layers without resorting to counting parameters or encountering sample complexity bounds that are exponential in depth. Perhaps it would be fruitful to analyze modern machine learning through a different lens. In this paper, we propose a novel information-theoretic framework with its own notions of regret and sample complexity for analyzing the data requirements of machine learning. We use this framework to study the sample complexity of learning from data generated by deep ReLU neural networks and deep networks that are infinitely wide but have a bounded sum of weights. We establish that the sample complexity of learning under these data-generating processes is at most linear and quadratic, respectively, in network depth.
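
To give a flavor of this style of analysis (a hedged sketch built from standard information-theoretic identities, not necessarily the paper's exact definitions): suppose data $Y_1, \ldots, Y_T$ are generated by an environment $\theta$ drawn from a known prior. The chain rule of mutual information gives

$$
\sum_{t=1}^{T} \mathbb{E}\!\left[\ln \frac{\mathbb{P}(Y_t \mid \theta,\, Y_{1:t-1})}{\mathbb{P}(Y_t \mid Y_{1:t-1})}\right]
\;=\; \sum_{t=1}^{T} \mathbb{I}(\theta;\, Y_t \mid Y_{1:t-1})
\;=\; \mathbb{I}(\theta;\, Y_{1:T}).
$$

The left-hand side is the cumulative log-loss regret of the optimal (posterior-predictive) learner relative to a predictor that knows $\theta$, so average per-sample regret is at most $\mathbb{I}(\theta;\, Y_{1:T}) / T$. Bounding the mutual information between a deep network's weights and the data it generates is the kind of step that would yield depth-dependent sample complexity statements like those in the abstract.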

Reference

NeurIPS Link

Submission

Accepted at NeurIPS 2022

