TY - JOUR
T1 - Performance-aware programming for intraoperative intensity-based image registration on graphics processing units
AU - Leong, Martin C.W.
AU - Lee, Kit Hang
AU - Kwan, Bowen P.Y.
AU - Ng, Yui Lun
AU - Liu, Zhiyu
AU - Navab, Nassir
AU - Luk, Wayne
AU - Kwok, Ka Wai
N1 - Publisher Copyright:
© 2021, The Author(s).
PY - 2021/3
Y1 - 2021/3
N2 - Purpose: Intensity-based image registration has been proven essential in many applications accredited to its unparalleled ability to resolve image misalignments. However, long registration time for image realignment prohibits its use in intra-operative navigation systems. There has been much work on accelerating the registration process by improving the algorithm’s robustness, but the innate computation required by the registration algorithm has been unresolved. Methods: Intensity-based registration methods involve operations with high arithmetic load and memory access demand, which supposes to be reduced by graphics processing units (GPUs). Although GPUs are widespread and affordable, there is a lack of open-source GPU implementations optimized for non-rigid image registration. This paper demonstrates performance-aware programming techniques, which involves systematic exploitation of GPU features, by implementing the diffeomorphic log-demons algorithm. Results: By resolving the pinpointed computation bottlenecks on GPU, our implementation of diffeomorphic log-demons on Nvidia GTX Titan X GPU has achieved ~ 95 times speed-up compared to the CPU and registered a 1.3-M voxel image in 286 ms. Even for large 37-M voxel images, our implementation is able to register in 8.56 s, which attained ~ 258 times speed-up. Our solution involves effective employment of GPU computation units, memory, and data bandwidth to resolve computation bottlenecks. Conclusion: The computation bottlenecks in diffeomorphic log-demons are pinpointed, analyzed, and resolved using various GPU performance-aware programming techniques. The proposed fast computation on basic image operations not only enhances the computation of diffeomorphic log-demons, but is also potentially extended to speed up many other intensity-based approaches. Our implementation is open-source on GitHub at https://bit.ly/2PYZxQz.
AB - Purpose: Intensity-based image registration has been proven essential in many applications accredited to its unparalleled ability to resolve image misalignments. However, long registration time for image realignment prohibits its use in intra-operative navigation systems. There has been much work on accelerating the registration process by improving the algorithm’s robustness, but the innate computation required by the registration algorithm has been unresolved. Methods: Intensity-based registration methods involve operations with high arithmetic load and memory access demand, which supposes to be reduced by graphics processing units (GPUs). Although GPUs are widespread and affordable, there is a lack of open-source GPU implementations optimized for non-rigid image registration. This paper demonstrates performance-aware programming techniques, which involves systematic exploitation of GPU features, by implementing the diffeomorphic log-demons algorithm. Results: By resolving the pinpointed computation bottlenecks on GPU, our implementation of diffeomorphic log-demons on Nvidia GTX Titan X GPU has achieved ~ 95 times speed-up compared to the CPU and registered a 1.3-M voxel image in 286 ms. Even for large 37-M voxel images, our implementation is able to register in 8.56 s, which attained ~ 258 times speed-up. Our solution involves effective employment of GPU computation units, memory, and data bandwidth to resolve computation bottlenecks. Conclusion: The computation bottlenecks in diffeomorphic log-demons are pinpointed, analyzed, and resolved using various GPU performance-aware programming techniques. The proposed fast computation on basic image operations not only enhances the computation of diffeomorphic log-demons, but is also potentially extended to speed up many other intensity-based approaches. Our implementation is open-source on GitHub at https://bit.ly/2PYZxQz.
KW - Demons algorithm
KW - Image-guided treatment
KW - Non-rigid registration
KW - Parallel computing
KW - Surgical guidance
UR - http://www.scopus.com/inward/record.url?scp=85099742737&partnerID=8YFLogxK
U2 - 10.1007/s11548-020-02303-y
DO - 10.1007/s11548-020-02303-y
M3 - Article
C2 - 33484431
AN - SCOPUS:85099742737
SN - 1861-6410
VL - 16
SP - 375
EP - 386
JO - International Journal of Computer Assisted Radiology and Surgery
JF - International Journal of Computer Assisted Radiology and Surgery
IS - 3
ER -