TY - JOUR
T1 - Multitask deep learning for segmentation and classification of primary bone tumors on radiographs
AU - von Schacky, Claudio E.
AU - Wilhelm, Nikolas J.
AU - Schäfer, Valerie S.
AU - Leonhardt, Yannik
AU - Gassert, Felix G.
AU - Foreman, Sarah C.
AU - Gassert, Florian T.
AU - Jung, Matthias
AU - Jungmann, Pia M.
AU - Russe, Maximilian F.
AU - Mogler, Carolin
AU - Knebel, Carolin
AU - von Eisenhart-Rothe, Rüdiger
AU - Makowski, Marcus R.
AU - Woertler, Klaus
AU - Burgkart, Rainer
AU - Gersing, Alexandra S.
N1 - Publisher Copyright: © 2021 Radiological Society of North America Inc. All rights reserved.
PY - 2021/11
Y1 - 2021/11
N2 - Background: An artificial intelligence model that assesses primary bone tumors on radiographs may assist in the diagnostic workflow. Purpose: To develop a multitask deep learning (DL) model for simultaneous bounding box placement, segmentation, and classification of primary bone tumors on radiographs. Materials and Methods: This retrospective study analyzed bone tumors on radiographs acquired prior to treatment and obtained from patient data from January 2000 to June 2020. Benign or malignant bone tumors were diagnosed in all patients by using the histopathologic findings as the reference standard. By using split-sample validation, 70% of the patients were assigned to the training set, 15% were assigned to the validation set, and 15% were assigned to the test set. The final performance was evaluated on an external test set by using geographic validation, with accuracy, sensitivity, specificity, and 95% CIs being used for classification, the intersection over union (IoU) being used for bounding box placements, and the Dice score being used for segmentations. Results: Radiographs from 934 patients (mean age, 33 years ± 19 [standard deviation]; 419 women) were evaluated in the internal data set, which included 667 benign bone tumors and 267 malignant bone tumors. Six hundred fifty-four patients were in the training set, 140 were in the validation set, and 140 were in the test set. One hundred eleven patients were in the external test set. The multitask DL model achieved 80.2% (89 of 111; 95% CI: 72.8, 87.6) accuracy, 62.9% (22 of 35; 95% CI: 47, 79) sensitivity, and 88.2% (67 of 76; 95% CI: 81, 96) specificity in the classification of bone tumors as malignant or benign. The model achieved an IoU of 0.52 ± 0.34 for bounding box placements and a mean Dice score of 0.60 ± 0.37 for segmentations. The model accuracy was higher than that of two radiologic residents (71.2% and 64.9%; P = .002 and P < .001, respectively) and was comparable with that of two musculoskeletal fellowship–trained radiologists (83.8% and 82.9%; P = .13 and P = .25, respectively) in classifying a tumor as malignant or benign. Conclusion: The developed multitask deep learning model allowed for accurate and simultaneous bounding box placement, segmentation, and classification of primary bone tumors on radiographs.
UR - http://www.scopus.com/inward/record.url?scp=85116865642&partnerID=8YFLogxK
U2 - 10.1148/radiol.2021204531
DO - 10.1148/radiol.2021204531
M3 - Review article
C2 - 34491126
AN - SCOPUS:85116865642
SN - 0033-8419
VL - 301
SP - 398
EP - 406
JO - Radiology
JF - Radiology
IS - 2
ER -