TY - JOUR
T1 - A Transfer Learning Approach to Cross-Modal Object Recognition
T2 - From Visual Observation to Robotic Haptic Exploration
AU - Falco, Pietro
AU - Lu, Shuang
AU - Natale, Ciro
AU - Pirozzi, Salvatore
AU - Lee, Dongheui
N1 - Publisher Copyright:
© 2004-2012 IEEE.
PY - 2019/8
Y1 - 2019/8
N2 - In this paper, we introduce the problem of cross-modal visuo-tactile object recognition with robotic active exploration. By this, we mean that the robot observes a set of objects through visual perception and is later able to recognize those objects by tactile exploration alone, without having touched any of them before. In machine learning terms, our application has a visual training set and a tactile test set, or vice versa. To tackle this problem, we propose an approach consisting of four steps: finding a visuo-tactile common representation, defining a suitable set of features, transferring the features across the domains, and classifying the objects. We show the results of our approach on a set of 15 objects, collecting 40 visual examples and five tactile examples per object. The proposed approach achieves an accuracy of 94.7%, which is comparable with the accuracy of the monomodal case, i.e., when visual data are used as both the training set and the test set. Moreover, it compares well with human performance, which we roughly estimated by carrying out an experiment with ten participants.
AB - In this paper, we introduce the problem of cross-modal visuo-tactile object recognition with robotic active exploration. By this, we mean that the robot observes a set of objects through visual perception and is later able to recognize those objects by tactile exploration alone, without having touched any of them before. In machine learning terms, our application has a visual training set and a tactile test set, or vice versa. To tackle this problem, we propose an approach consisting of four steps: finding a visuo-tactile common representation, defining a suitable set of features, transferring the features across the domains, and classifying the objects. We show the results of our approach on a set of 15 objects, collecting 40 visual examples and five tactile examples per object. The proposed approach achieves an accuracy of 94.7%, which is comparable with the accuracy of the monomodal case, i.e., when visual data are used as both the training set and the test set. Moreover, it compares well with human performance, which we roughly estimated by carrying out an experiment with ten participants.
KW - Cross-modal object recognition
KW - robotic manipulation
KW - tactile perception
KW - visual perception
UR - http://www.scopus.com/inward/record.url?scp=85070506743&partnerID=8YFLogxK
U2 - 10.1109/TRO.2019.2914772
DO - 10.1109/TRO.2019.2914772
M3 - Article
AN - SCOPUS:85070506743
SN - 1552-3098
VL - 35
SP - 987
EP - 998
JO - IEEE Transactions on Robotics
JF - IEEE Transactions on Robotics
IS - 4
M1 - 8744477
ER -