Uncertainty Calibration in Deep Learning: Methods, Emerging Challenges, and LLM Frontiers
Abstract
Despite remarkable breakthroughs in deep neural networks (DNNs), deploying deep models in high-stakes, safety-critical applications remains a significant challenge. For trustworthy machine learning systems, the ability to provide reliable, well-calibrated uncertainty estimates is fundamental. However, DNNs are notoriously overconfident, often yielding miscalibrated probabilities that hinder their integration into real-world decision-making. This survey provides a comprehensive review of recent advances in uncertainty calibration for deep learning. It begins by introducing advanced methods recently proposed for calibrating uncertainty in DNNs, which we organize into four primary paradigms: train-time regularization, post-hoc adjustments, Bayesian and ensemble neural networks, and hybrid approaches. We then turn to emerging research challenges that have attracted significant attention in recent years, focusing on calibration in out-of-distribution (OOD) scenarios, uncertainty quantification for generative models, calibration in multimodal learning, and human-AI collaboration settings. Finally, we structure our exploration of uncertainty calibration in large language models (LLMs) around three fundamental research questions: how LLMs express uncertainty, how to evaluate their confidence, and how to calibrate LLMs effectively. By reviewing these diverse perspectives, this paper aims to serve as a holistic roadmap for researchers and practitioners seeking to bridge the gap between predictive performance and model trustworthiness.