TY - JOUR
T1 - Performance evaluation of automated white matter hyperintensity segmentation algorithms in a multicenter cohort on cognitive impairment and dementia
AU - DELCODE study group
AU - Gaubert, Malo
AU - Dell’Orco, Andrea
AU - Lange, Catharina
AU - Garnier-Crussard, Antoine
AU - Zimmermann, Isabella
AU - Dyrba, Martin
AU - Duering, Marco
AU - Ziegler, Gabriel
AU - Peters, Oliver
AU - Preis, Lukas
AU - Priller, Josef
AU - Spruth, Eike Jakob
AU - Schneider, Anja
AU - Fliessbach, Klaus
AU - Wiltfang, Jens
AU - Schott, Björn H.
AU - Maier, Franziska
AU - Glanz, Wenzel
AU - Buerger, Katharina
AU - Janowitz, Daniel
AU - Perneczky, Robert
AU - Rauchmann, Boris Stephan
AU - Teipel, Stefan
AU - Kilimann, Ingo
AU - Laske, Christoph
AU - Munk, Matthias H.
AU - Spottke, Annika
AU - Roy, Nina
AU - Dobisch, Laura
AU - Ewers, Michael
AU - Dechent, Peter
AU - Haynes, John Dylan
AU - Scheffler, Klaus
AU - Düzel, Emrah
AU - Jessen, Frank
AU - Wirth, Miranka
N1 - Publisher Copyright: Copyright © 2023 Gaubert, Dell’Orco, Lange, Garnier-Crussard, Zimmermann, Dyrba, Duering, Ziegler, Peters, Preis, Priller, Spruth, Schneider, Fliessbach, Wiltfang, Schott, Maier, Glanz, Buerger, Janowitz, Perneczky, Rauchmann, Teipel, Kilimann, Laske, Munk, Spottke, Roy, Dobisch, Ewers, Dechent, Haynes, Scheffler, Düzel, Jessen and Wirth.
PY - 2023/1/12
Y1 - 2023/1/12
N2 - Background: White matter hyperintensities (WMH), a biomarker of small vessel disease, are often found in Alzheimer’s disease (AD), and their advanced detection and quantification can be beneficial for research and clinical applications. To investigate WMH in large-scale multicenter studies on cognitive impairment and AD, appropriate automated WMH segmentation algorithms are required. This study aimed to compare the performance of segmentation tools and provide information on their application in multicenter research. Methods: We used a pseudo-randomly selected dataset (n = 50) from the DZNE-multicenter observational Longitudinal Cognitive Impairment and Dementia Study (DELCODE) that included 3D fluid-attenuated inversion recovery (FLAIR) images from participants across the cognitive continuum. The performance of top-rated algorithms for automated WMH segmentation [Brain Intensity Abnormality Classification Algorithm (BIANCA), lesion segmentation toolbox (LST) lesion growth algorithm (LGA), LST lesion prediction algorithm (LPA), pgs, and sysu_media] was compared to manual reference segmentation (RS). Results: Across tools, segmentation performance was moderate for global WMH volume and number of detected lesions. After retraining on a DELCODE subset, the deep learning algorithm sysu_media showed the highest performance, with an average Dice’s coefficient of 0.702 (±0.109 SD) for volume and a mean F1-score of 0.642 (±0.109 SD) for the number of lesions. The intra-class correlation was excellent (>0.9) for all algorithms except BIANCA (0.835). Performance improved with high WMH burden and varied across brain regions. Conclusion: The deep learning algorithm, when retrained, performed well in the multicenter context. Nevertheless, its performance was close to that of traditional methods. We provide methodological recommendations for future studies using automated WMH segmentation to quantify and assess WMH along the continuum of cognitive impairment and AD dementia.
KW - Alzheimer’s disease
KW - FLAIR
KW - aging
KW - deep learning
KW - evaluation
KW - white matter hyperintensities segmentation
UR - http://www.scopus.com/inward/record.url?scp=85147115358&partnerID=8YFLogxK
U2 - 10.3389/fpsyt.2022.1010273
DO - 10.3389/fpsyt.2022.1010273
M3 - Article
AN - SCOPUS:85147115358
SN - 1664-0640
VL - 13
JO - Frontiers in Psychiatry
JF - Frontiers in Psychiatry
M1 - 1010273
ER -