TY - GEN
T1 - Aligning latent spaces for 3D hand pose estimation
AU - Yang, Linlin
AU - Li, Shile
AU - Lee, Dongheui
AU - Yao, Angela
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
N2 - Hand pose estimation from monocular RGB inputs is a highly challenging task. Many previous works for monocular settings only used RGB information for training despite the availability of corresponding data in other modalities such as depth maps. In this work, we propose to learn a joint latent representation that leverages other modalities as weak labels to boost the RGB-based hand pose estimator. By design, our architecture is highly flexible in embedding various diverse modalities such as heat maps, depth maps and point clouds. In particular, we find that encoding and decoding the point cloud of the hand surface can improve the quality of the joint latent representation. Experiments show that with the aid of other modalities during training, our proposed method boosts the accuracy of RGB-based hand pose estimation systems and significantly outperforms state-of-the-art on two public benchmarks.
AB - Hand pose estimation from monocular RGB inputs is a highly challenging task. Many previous works for monocular settings only used RGB information for training despite the availability of corresponding data in other modalities such as depth maps. In this work, we propose to learn a joint latent representation that leverages other modalities as weak labels to boost the RGB-based hand pose estimator. By design, our architecture is highly flexible in embedding various diverse modalities such as heat maps, depth maps and point clouds. In particular, we find that encoding and decoding the point cloud of the hand surface can improve the quality of the joint latent representation. Experiments show that with the aid of other modalities during training, our proposed method boosts the accuracy of RGB-based hand pose estimation systems and significantly outperforms state-of-the-art on two public benchmarks.
UR - http://www.scopus.com/inward/record.url?scp=85081936061&partnerID=8YFLogxK
U2 - 10.1109/ICCV.2019.00242
DO - 10.1109/ICCV.2019.00242
M3 - Conference contribution
AN - SCOPUS:85081936061
T3 - Proceedings of the IEEE International Conference on Computer Vision
SP - 2335
EP - 2343
BT - Proceedings - 2019 International Conference on Computer Vision, ICCV 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE/CVF International Conference on Computer Vision, ICCV 2019
Y2 - 27 October 2019 through 2 November 2019
ER -