The supplementary material contains proofs of all results as well as additional details on the experiments and theory. The code required to reproduce our experiments is available at https://github.com/aangelopoulos/ltt/tree/revision and as a file accompanying this manuscript.

We introduce a framework for calibrating machine learning models so that they satisfy finite-sample statistical guarantees. Our calibration algorithms work with any model and any (unknown) data-generating distribution, and they do not require retraining. Among other applications, the algorithms address false discovery rate control in multilabel classification, intersection-over-union control in instance segmentation, and simultaneous control of the type-1 error of outlier detection and the coverage of confidence sets in classification or regression. Our main insight is to reframe risk control as a multiple hypothesis testing problem, which enables different mathematical tools and arguments. We demonstrate our algorithms with detailed worked examples in computer vision and on tabular medical data. The computer vision experiments illustrate the utility of our approach for calibrating widely deployed, state-of-the-art predictive architectures, such as the detectron2 object detection system.
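
To make the hypothesis-testing reframing concrete, the sketch below shows one way a calibration of this style could look for a single risk constraint: each candidate threshold on a grid is associated with the null hypothesis that its risk exceeds the target level, a Hoeffding-style p-value is computed from the empirical risk on held-out calibration data, and a Bonferroni correction over the grid controls the family-wise error rate. This is a minimal illustration under the assumption of a loss bounded in [0, 1]; the function names (`learn_then_test_calibrate`, `hoeffding_pvalue`) and the loss `loss_fn` are illustrative assumptions, not the API of the released code.

```python
import numpy as np

def hoeffding_pvalue(risk_hat, n, alpha):
    """P-value for the null hypothesis R(lambda) > alpha, based on
    Hoeffding's inequality for a loss bounded in [0, 1] averaged
    over n calibration points."""
    return float(np.exp(-2.0 * n * max(alpha - risk_hat, 0.0) ** 2))

def learn_then_test_calibrate(lambdas, loss_fn, calib_data, alpha=0.1, delta=0.1):
    """Return the candidate parameters whose null hypotheses are rejected
    after a Bonferroni correction, so that with probability at least
    1 - delta every returned lambda has risk at most alpha."""
    n = len(calib_data)
    valid = []
    for lam in lambdas:
        # Empirical risk of the model operated at parameter lambda.
        risk_hat = float(np.mean([loss_fn(z, lam) for z in calib_data]))
        # Bonferroni: compare each p-value to delta divided by the grid size.
        if hoeffding_pvalue(risk_hat, n, alpha) <= delta / len(lambdas):
            valid.append(lam)
    return valid

# Hypothetical usage: a 0/1 loss that is 1 when the prediction set at
# threshold lam misses the true label, 0 otherwise.
# valid_lambdas = learn_then_test_calibrate(np.linspace(0, 1, 100), miss_loss, calib)
```

Any parameter in the returned set can then be used at test time; tighter p-values or sequential testing procedures could be substituted for the Hoeffding bound and Bonferroni correction in this sketch.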