TY - GEN
T1 - DinoBloom
T2 - 27th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2024
AU - Koch, Valentin
AU - Wagner, Sophia J.
AU - Kazeminia, Salome
AU - Sancar, Ece
AU - Hehr, Matthias
AU - Schnabel, Julia A.
AU - Peng, Tingying
AU - Marr, Carsten
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
N2 - In hematology, computational models offer significant potential to improve diagnostic accuracy, streamline workflows, and reduce the tedious work of analyzing single cells in peripheral blood or bone marrow smears. However, clinical adoption of computational models has been hampered by the lack of generalization due to large batch effects, small dataset sizes, and poor performance in transfer learning from natural images. To address these challenges, we introduce DinoBloom, the first foundation model for single cell images in hematology, utilizing a tailored DINOv2 pipeline. Our model is built upon an extensive collection of 13 diverse, publicly available datasets of peripheral blood and bone marrow smears, the most substantial open-source cohort in hematology so far, comprising over 380,000 white blood cell images. To assess its generalization capability, we evaluate it on an external dataset with a challenging domain shift. We show that our model outperforms existing medical and non-medical vision models in (i) linear probing and k-nearest neighbor evaluations for cell-type classification on blood and bone marrow smears and (ii) weakly supervised multiple instance learning for acute myeloid leukemia subtyping by a large margin. A family of four DinoBloom models (small, base, large, and giant) can be adapted for a wide range of downstream applications, be a strong baseline for classification problems, and facilitate the assessment of batch effects in new datasets. All models are available at github.com/marrlab/DinoBloom.
AB - In hematology, computational models offer significant potential to improve diagnostic accuracy, streamline workflows, and reduce the tedious work of analyzing single cells in peripheral blood or bone marrow smears. However, clinical adoption of computational models has been hampered by the lack of generalization due to large batch effects, small dataset sizes, and poor performance in transfer learning from natural images. To address these challenges, we introduce DinoBloom, the first foundation model for single cell images in hematology, utilizing a tailored DINOv2 pipeline. Our model is built upon an extensive collection of 13 diverse, publicly available datasets of peripheral blood and bone marrow smears, the most substantial open-source cohort in hematology so far, comprising over 380,000 white blood cell images. To assess its generalization capability, we evaluate it on an external dataset with a challenging domain shift. We show that our model outperforms existing medical and non-medical vision models in (i) linear probing and k-nearest neighbor evaluations for cell-type classification on blood and bone marrow smears and (ii) weakly supervised multiple instance learning for acute myeloid leukemia subtyping by a large margin. A family of four DinoBloom models (small, base, large, and giant) can be adapted for a wide range of downstream applications, be a strong baseline for classification problems, and facilitate the assessment of batch effects in new datasets. All models are available at github.com/marrlab/DinoBloom.
KW - Foundation model
KW - Hematology
KW - Self-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85208199998&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-72390-2_49
DO - 10.1007/978-3-031-72390-2_49
M3 - Conference contribution
AN - SCOPUS:85208199998
SN - 9783031723896
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 520
EP - 530
BT - Medical Image Computing and Computer Assisted Intervention – MICCAI 2024 - 27th International Conference, Proceedings
A2 - Linguraru, Marius George
A2 - Dou, Qi
A2 - Feragen, Aasa
A2 - Giannarou, Stamatia
A2 - Glocker, Ben
A2 - Lekadir, Karim
A2 - Schnabel, Julia A.
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 6 October 2024 through 10 October 2024
ER -