TY - JOUR
T1 - Metadata-enhanced contrastive learning from retinal optical coherence tomography images
AU - PINNACLE Consortium
AU - Holland, Robbie
AU - Leingang, Oliver
AU - Bogunović, Hrvoje
AU - Riedl, Sophie
AU - Fritsche, Lars
AU - Prevost, Toby
AU - Scholl, Hendrik P.N.
AU - Schmidt-Erfurth, Ursula
AU - Sivaprasad, Sobha
AU - Lotery, Andrew J.
AU - Rueckert, Daniel
AU - Menten, Martin J.
N1 - Publisher Copyright:
© 2024 The Author(s)
PY - 2024/10
Y1 - 2024/10
AB - Deep learning has the potential to automate the screening, monitoring and grading of disease in medical images. Pretraining with contrastive learning enables models to extract robust and generalisable features from natural image datasets, facilitating label-efficient downstream image analysis. However, the direct application of conventional contrastive methods to medical datasets introduces two domain-specific issues. Firstly, several image transformations that have been shown to be crucial for effective contrastive learning do not translate from the natural image to the medical image domain. Secondly, the assumption made by conventional methods that any two images are dissimilar is systematically misleading in medical datasets depicting the same anatomy and disease. This is exacerbated in longitudinal image datasets that repeatedly image the same patient cohort to monitor their disease progression over time. In this paper, we tackle these issues by extending conventional contrastive frameworks with a novel metadata-enhanced strategy. Our approach employs widely available patient metadata to approximate the true set of inter-image contrastive relationships. To this end, we use records of patient identity, eye position (i.e. left or right) and time-series information. In experiments on two large longitudinal datasets containing 170,427 retinal optical coherence tomography (OCT) images of 7,912 patients with age-related macular degeneration (AMD), we evaluate the utility of using metadata to incorporate the temporal dynamics of disease progression into pretraining. Our metadata-enhanced approach outperforms both standard contrastive methods and a retinal image foundation model in five out of six image-level downstream tasks related to AMD. We find benefits in both low-data and high-data regimes across tasks ranging from AMD stage and type classification to the prediction of visual acuity. Due to its modularity, our method can be quickly and cost-effectively tested to establish the potential benefits of including available metadata in contrastive pretraining.
KW - Contrastive learning
KW - Longitudinal data
KW - Medical metadata
KW - Retinal OCT
KW - Self-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85201470320&partnerID=8YFLogxK
DO - 10.1016/j.media.2024.103296
M3 - Article
AN - SCOPUS:85201470320
SN - 1361-8415
VL - 97
JO - Medical Image Analysis
JF - Medical Image Analysis
M1 - 103296
ER -