TY - GEN
T1 - Influence of Classification Task and Distribution Shift Type on OOD Detection in Fetal Ultrasound
AU - Wong, Chun Kit
AU - Christensen, Anders N.
AU - Bercea, Cosmin I.
AU - Schnabel, Julia A.
AU - Tolsgaard, Martin G.
AU - Feragen, Aasa
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
N2 - Reliable out-of-distribution (OOD) detection is important for safe deployment of deep learning models in fetal ultrasound amidst heterogeneous image characteristics and clinical settings. OOD detection relies on estimating a classification model’s uncertainty, which should increase for OOD samples. While existing research has largely focused on uncertainty quantification methods, this work investigates the impact of the classification task itself. Through experiments with eight uncertainty quantification methods across four classification tasks on the same image dataset, we demonstrate that OOD detection performance significantly varies with the task, and that the best task depends on the defined ID-OOD criteria; specifically, whether the OOD sample is due to: i) an image characteristic shift or ii) an anatomical feature shift. Furthermore, we reveal that superior OOD detection does not guarantee optimal abstained prediction, underscoring the necessity to align task selection and uncertainty strategies with the specific downstream application in medical image analysis. Code: https://github.com/wong-ck/ood-fetal-us.
KW - OOD
KW - fetal ultrasound
KW - uncertainty quantification
UR - https://www.scopus.com/pages/publications/105017859502
U2 - 10.1007/978-3-032-04981-0_28
DO - 10.1007/978-3-032-04981-0_28
M3 - Conference contribution
AN - SCOPUS:105017859502
SN - 9783032049803
T3 - Lecture Notes in Computer Science
SP - 293
EP - 303
BT - Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - 28th International Conference, Proceedings
A2 - Gee, James C.
A2 - Hong, Jaesung
A2 - Sudre, Carole H.
A2 - Golland, Polina
A2 - Alexander, Daniel C.
A2 - Iglesias, Juan Eugenio
A2 - Venkataraman, Archana
A2 - Kim, Jong Hyo
PB - Springer Science and Business Media Deutschland GmbH
T2 - 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
Y2 - 23 September 2025 through 27 September 2025
ER -