Uncertainty Calibration in Deep Learning: Methods, Emerging Challenges, and LLM Frontiers
Abstract
Despite remarkable breakthroughs in deep neural networks (DNNs), deploying deep models in high-stakes, safety-critical applications remains a significant challenge. For trustworthy machine learning systems, the ability to provide reliable, well-calibrated uncertainty estimates is fundamental. However, DNNs are notoriously overconfident, often yielding miscalibrated probabilities that hinder their integration into real-world decision-making tasks. This survey provides a comprehensive review of recent advances in uncertainty calibration for deep learning. We begin by introducing a range of advanced methods recently proposed for calibrating uncertainty in DNNs, organizing them into four primary paradigms: train-time regularization, post-hoc adjustment, Bayesian and ensemble neural networks, and hybrid approaches. We then turn to emerging research challenges that have attracted significant attention in recent years, focusing on calibration in out-of-distribution (OOD) scenarios, uncertainty quantification for generative models, calibration in multi-modal learning, and human-AI collaboration settings. Finally, we structure our exploration of uncertainty calibration in large language models (LLMs) around three fundamental research questions: how LLMs express uncertainty, how to evaluate their confidence, and how to calibrate them effectively. By reviewing these diverse perspectives, this survey aims to serve as a holistic roadmap for researchers and practitioners seeking to bridge the gap between predictive performance and model trustworthiness.