TY - JOUR
T1 - Gradient response maps for real-time detection of textureless objects
AU - Hinterstoisser, Stefan
AU - Cagniart, Cedric
AU - Ilic, Slobodan
AU - Sturm, Peter
AU - Navab, Nassir
AU - Fua, Pascal
AU - Lepetit, Vincent
N1 - Funding Information: The authors thank Stefan Holzer and Kurt Konolige for the useful discussions and their valuable suggestions. This project was funded by the BMBF project AVILUSplus (01IM08002). P. Sturm is grateful to the Alexander-von-Humboldt Foundation for a Research Fellowship supporting his sabbatical at TU München. Nassir Navab and Vincent Lepetit are joint senior authors of this paper.
PY - 2012
Y1 - 2012
N2 - We present a method for real-time 3D object instance detection that does not require a time-consuming training stage, and can handle untextured objects. At its core, our approach is a novel image representation for template matching designed to be robust to small image transformations. This robustness is based on spread image gradient orientations and allows us to test only a small subset of all possible pixel locations when parsing the image, and to represent a 3D object with a limited set of templates. In addition, we demonstrate that if a dense depth sensor is available we can extend our approach for an even better performance also taking 3D surface normal orientations into account. We show how to take advantage of the architecture of modern computers to build an efficient but very discriminant representation of the input images that can be used to consider thousands of templates in real time. We demonstrate in many experiments on real data that our method is much faster and more robust with respect to background clutter than current state-of-the-art methods.
KW - Computer vision
KW - multimodality template matching
KW - real-time detection and object recognition
KW - tracking
UR - http://www.scopus.com/inward/record.url?scp=84859168788&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2011.206
DO - 10.1109/TPAMI.2011.206
M3 - Article
C2 - 22442120
AN - SCOPUS:84859168788
SN - 0162-8828
VL - 34
SP - 876
EP - 888
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 5
M1 - 6042881
ER -