Dissertation

Efficient tuning of automated machine learning pipelines

Automated Machine Learning (AutoML) is widely used to build a practical Machine Learning (ML) model for an arbitrary real-world problem automatically, reducing the effort practitioners spend in the ML development cycle. Optimization is a key part of a typical AutoML framework, and several optimization approaches have been developed to improve AutoML performance and its ability to find high-performing ML models across a wide range of real-world problems.

Author: D.A. Nguyen
Date: 09 October 2024
Links: Thesis in Leiden Repository

Many AutoML studies treat the problem as a Hyperparameter Optimization (HPO) problem, which may limit the effectiveness of the underlying optimizer on the actual AutoML problem. Instead of taking the HPO-based approach, the problem can be treated as an ML pipeline optimization problem with a hierarchically structured search space. Two optimization algorithms have been developed based on this paradigm to improve the performance of Bayesian Optimization (BO) in solving the AutoML optimization problem. A wide range of experiments indicates that these approaches significantly improve BO's performance. In summary, this thesis focuses on enhancing AutoML performance by solving the ML pipeline optimization problem.
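To make the idea of a hierarchically structured search space concrete, the following is a minimal illustrative sketch, not the algorithms developed in the thesis. It assumes a hypothetical two-step pipeline (a scaler followed by a classifier) in which each operator owns its own hyperparameters, so a hyperparameter is only active once its operator has been selected. The names `SEARCH_SPACE` and `sample_pipeline`, the operators, and the value ranges are all assumptions made for illustration, and plain random sampling stands in for an actual optimizer such as BO.

```python
import random

# Hypothetical two-level search space (illustrative values only):
# each pipeline step offers several operators, and each operator
# owns its own hyperparameters with a (low, high) range.
SEARCH_SPACE = {
    "scaler": {
        "none": {},
        "standard": {},
    },
    "classifier": {
        "svm": {"C": (0.01, 100.0), "gamma": (1e-4, 1.0)},
        "random_forest": {"n_estimators": (10, 500), "max_depth": (2, 20)},
    },
}


def sample_pipeline(space, rng):
    """Draw one pipeline configuration uniformly at random.

    Only the hyperparameters of the operator chosen for each step are
    sampled; hyperparameters of unselected operators stay inactive.
    """
    config = {}
    for step, operators in space.items():
        # Upper level: pick one operator for this pipeline step.
        name = rng.choice(sorted(operators))
        # Lower level: sample only that operator's hyperparameters.
        params = {}
        for param, (low, high) in operators[name].items():
            if isinstance(low, int) and isinstance(high, int):
                params[param] = rng.randint(low, high)
            else:
                params[param] = rng.uniform(low, high)
        config[step] = {"operator": name, "params": params}
    return config


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_pipeline(SEARCH_SPACE, rng))
```

A flat HPO encoding of the same space would expose every hyperparameter of every operator at once, even though most are irrelevant to any given pipeline. The nested structure above keeps that conditionality explicit, which is the property a pipeline-aware optimizer can exploit.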