Motivation
Popular practical approaches for hyperparameter optimization, including Bayesian optimization and bandit-based methods, come with limited theoretical guarantees. Bayesian optimization, often built on Gaussian processes, is effective for expensive evaluations but struggles in high-dimensional spaces, and its performance is highly sensitive to the choice of priors and internal parameters. Bandit-based approaches make additional assumptions intended to capture aspects specific to hyperparameter tuning, such as fixed limiting values of arm rewards (Hyperband) or rewards that increase with the number of pulls but with diminishing returns (rising bandits). A major blind spot in effectively using these techniques is the lack of insight into how algorithmic performance actually varies with the hyperparameters.
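To make the first of these approaches concrete, the following is a minimal sketch of a Bayesian optimization loop, using a scikit-learn Gaussian process surrogate with a Matern kernel and an expected-improvement acquisition function; the one-dimensional objective, bounds, and parameter choices are illustrative placeholders, not anything prescribed by the tutorial.

```python
# Minimal Bayesian-optimization sketch: a Gaussian process surrogate with an
# expected-improvement acquisition, illustrating why kernel and acquisition
# choices matter in practice. Objective and bounds below are placeholders.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(candidates, gp, best_y):
    """Expected improvement (for minimization) at each candidate point."""
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (best_y - mu) / sigma
    return (best_y - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def bayes_opt(objective, bounds, n_init=5, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, size=(n_init, 1))        # initial random designs
    y = np.array([objective(x[0]) for x in X])
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(n_iter):
        gp.fit(X, y)                                 # refit the surrogate
        cand = rng.uniform(lo, hi, size=(256, 1))    # random candidate pool
        ei = expected_improvement(cand, gp, y.min())
        x_next = cand[np.argmax(ei)]                 # maximize the acquisition
        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next[0]))
    return X[np.argmin(y)], y.min()

# Example: tune a hypothetical regularization strength on a log scale.
best_x, best_y = bayes_opt(lambda log_lam: (log_lam + 2.0) ** 2, bounds=(-6, 2))
```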
A recent line of theoretically grounded work treats hyperparameter optimization and algorithm selection as a learning problem in its own right. Over the past decade, research from the learning theory community has shown how to provably tune several fundamental algorithms, including decision trees, linear regression and, very recently, even deep learning. These techniques apply naturally to both hyperparameter tuning and algorithm selection. Promising future directions include integrating these structure-aware, principled approaches with the techniques currently used in practice, improving optimization in high-dimensional and discrete spaces, and improving scalability in distributed settings.
Program Outline
- Hyperparameter optimization, its significance, and an overview of historically popular approaches.
- Introduction to major techniques used in practice, the guarantees that are known for them, and their major limitations.
  - Bayesian optimization: Gaussian process regression, design of kernels and acquisition functions.
  - Bandit-based methods (e.g. Hyperband, rising bandits); a minimal successive-halving sketch follows this outline.
- Theoretically principled techniques for hyperparameter tuning and algorithm selection using data-driven algorithm design.
  - A general and useful analytical tool.
  - Case study: decision tree algorithms and tuning hyperparameters in tree-based models.
  - Case study: tuning activation function hyperparameters and learning rates in deep neural networks.
- Summary, promising future research directions, and questions/discussion.
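As a companion to the bandit-based portion of the outline, here is a minimal sketch of successive halving, the core subroutine of Hyperband: many configurations are evaluated on a small budget, the best fraction survives, and the budget grows. Hyperband itself runs several such brackets trading off the number of configurations against the starting budget; the sampler and evaluation function below are illustrative placeholders.

```python
# Minimal successive-halving sketch (the inner loop of Hyperband).
# sample_config and evaluate are illustrative placeholders supplied by the caller.
import math
import random

def successive_halving(sample_config, evaluate, n=27, min_budget=1, eta=3):
    """Return the best surviving configuration and its final score (lower is better)."""
    configs = [sample_config() for _ in range(n)]
    budget = min_budget
    while len(configs) > 1:
        scores = [(evaluate(c, budget), c) for c in configs]   # partial-budget evaluations
        scores.sort(key=lambda s: s[0])
        configs = [c for _, c in scores[: max(1, len(configs) // eta)]]  # keep the top 1/eta
        budget *= eta                                          # grow the budget for survivors
    return configs[0], evaluate(configs[0], budget)

# Illustrative usage with a toy learning-rate search: the stand-in "validation loss"
# is minimized near lr = 1e-3 and improves with more budget.
best_lr, score = successive_halving(
    sample_config=lambda: 10 ** random.uniform(-5, -1),
    evaluate=lambda lr, b: abs(math.log10(lr) + 3) + 1.0 / b,
)
```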
Speaker

TTIC (Toyota Technological Institute at Chicago)
Dravyansh (Dravy) Sharma is an IDEAL postdoctoral researcher, hosted by Avrim Blum at TTIC and Aravindan Vijayaraghavan at Northwestern University. He obtained his PhD at Carnegie Mellon University, advised by Nina Balcan. His research interests include machine learning theory and algorithms, with a focus on provable hyperparameter tuning; he has 10+ research papers on hyperparameter tuning and algorithm selection published at top ML and AI venues (5 at NeurIPS, including an Oral). His work develops principled techniques for tuning and selecting fundamental machine learning algorithms, including decision trees, linear regression, graph-based learning and, most recently, deep networks. He has published at top ML venues including NeurIPS, ICML, COLT, JMLR, AISTATS, UAI and AAAI, has multiple papers selected for Oral presentations, won the Outstanding Student Paper Award at UAI 2024, and has interned with Google Research and Microsoft Research. He has given a tutorial on this topic at UAI 2025 and has an accepted tutorial at NeurIPS 2025.