TY - JOUR
T1 - A feature-based approach to big data analysis of medical images
AU - Toews, Matthew
AU - Wachinger, Christian
AU - Estepar, Raul San Jose
AU - Wells, William M.
N1 - Publisher Copyright:
© Springer International Publishing Switzerland 2015.
PY - 2015
Y1 - 2015
N2 - This paper proposes an inference method well-suited to large sets of medical images. The method is based upon a framework where distinctive 3D scale-invariant features are indexed efficiently to identify approximate nearest-neighbor (NN) feature matches in O(log N) computational complexity in the number of images N. It thus scales well to large data sets, in contrast to methods based on pair-wise image registration or feature matching requiring O(N) complexity. Our theoretical contribution is a density estimator based on a generative model that generalizes kernel density estimation and K-nearest neighbor (KNN) methods. The estimator can be used for on-the-fly queries, without requiring explicit parametric models or an off-line training phase. The method is validated on a large multi-site data set of 95, 000, 000 features extracted from 19, 000 lung CT scans. Subject-level classification identifies all images of the same subjects across the entire data set despite deformation due to breathing state, including unintentional duplicate scans. State-of-the-art performance is achieved in predicting chronic pulmonary obstructive disorder (COPD) severity across the 5-category GOLD clinical rating, with an accuracy of 89% if both exact and one-off predictions are considered correct.
AB - This paper proposes an inference method well-suited to large sets of medical images. The method is based upon a framework where distinctive 3D scale-invariant features are indexed efficiently to identify approximate nearest-neighbor (NN) feature matches in O(log N) computational complexity in the number of images N. It thus scales well to large data sets, in contrast to methods based on pair-wise image registration or feature matching requiring O(N) complexity. Our theoretical contribution is a density estimator based on a generative model that generalizes kernel density estimation and K-nearest neighbor (KNN) methods. The estimator can be used for on-the-fly queries, without requiring explicit parametric models or an off-line training phase. The method is validated on a large multi-site data set of 95, 000, 000 features extracted from 19, 000 lung CT scans. Subject-level classification identifies all images of the same subjects across the entire data set despite deformation due to breathing state, including unintentional duplicate scans. State-of-the-art performance is achieved in predicting chronic pulmonary obstructive disorder (COPD) severity across the 5-category GOLD clinical rating, with an accuracy of 89% if both exact and one-off predictions are considered correct.
UR - http://www.scopus.com/inward/record.url?scp=84983611741&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-19992-4_26
DO - 10.1007/978-3-319-19992-4_26
M3 - Conference article
C2 - 26221685
AN - SCOPUS:84983611741
SN - 0302-9743
VL - 9123
SP - 339
EP - 350
JO - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
JF - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
M1 - A26
T2 - 24th International Conference on Information Processing in Medical Imaging, IPMI 2015
Y2 - 28 June 2015 through 3 July 2015
ER -