Hong Jun Jeon: Research

Developing Coherent Frameworks for Machine Learning

My research focuses on developing coherent frameworks for reasoning about the puzzling phenomena of modern machine learning. In recent years, I have explored how the tools of information theory and Bayesian statistics can be leveraged to arrive at plausible explanations (rooted in rigorous mathematics) for the behavior of deep learning models. The hope is that this work helps demystify the "alchemy" of machine learning and establishes truths from which we can coherently reason. The long-term vision is to leverage these novel insights to develop new algorithmic ideas across a wide range of machine learning problems.

Recommended reading

The papers listed below introduce facets of my research.

Information-theoretic foundations

Information theory offers elegant tools for the analysis of machine learning. These tools accommodate great generality: they handle models that are parametric or nonparametric, variables that are discrete or continuous, noisy or noiseless, as well as problems of sequential decision-making and learning. The following monograph provides a comprehensive overview of information theory as a means of characterizing the fundamental limits of machine learning.
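To convey the flavor of these results, here is one representative characterization, sketched in notation of my own choosing rather than the monograph's formal setup: under log-loss, the per-sample estimation error of the Bayes-optimal predictor is governed by the mutual information between the latent parameters and the observed data.

\[
\underbrace{\mathbb{E}\!\left[\mathcal{L}_T\right] - \bar{\mathcal{L}}}_{\text{estimation error}} \;=\; \frac{\mathbb{I}(\theta; H_T)}{T},
\]

where $H_T = ((X_1, Y_1), \ldots, (X_T, Y_T))$ denotes the training data generated by latent parameters $\theta$, $\mathbb{E}[\mathcal{L}_T]$ is the average log-loss of the Bayesian posterior-predictive distribution over those $T$ pairs, and $\bar{\mathcal{L}}$ is the irreducible error attained with knowledge of $\theta$. Because mutual information is well defined regardless of whether $\theta$ is discrete or continuous, parametric or nonparametric, characterizations of this form apply across the breadth of settings described above.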

Information theory can also inform the design and analysis of continual learning algorithms:

Information theory for LLMs

The astonishing capabilities of large language models (LLMs) have been accompanied by a host of puzzling phenomena. The following papers analyze two of these phenomena (in-context learning and neural scaling laws) and offer elegant results rooted in information theory:

Learning from human feedback

Throughout my research career, I have been interested in the problem of learning from human feedback, which has become an increasingly important step in deploying machine learning systems that are aligned with human values. Learning from human interactions is challenging and involves a plethora of modalities (e.g., demonstrations, preferences, natural language). This paper offers a unifying Bayesian framework for learning from all such modalities.

Learning from human feedback is also expensive and time-consuming. As a result, it is important to develop algorithms that extract as much information as possible from each human interaction. This paper offers a novel algorithmic approach that implicitly learns the skills of, and correlations among, human crowd workers:

Most recently, I have considered how an agent ought to operate when it is simultaneously uncertain about the environment and about the reward function against which it will be evaluated. Such is the case in the alignment problem, and the following paper provides some interesting insights into the challenges that arise as the agent balances short-term reward, exploration of the environment, and exploration of human preferences: