Estimation of 6D Pose of Objects Based on a Variant Adversarial Autoencoder

Dan Huang, Hyemin Ahn, Shile Li, Yueming Hu, Dongheui Lee

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


The goal of this paper is to estimate the 6D pose of objects using a texture-less dataset. The pose of each projection view is obtained by rendering the 3D model of each object, and the orientation of the object is then represented implicitly by a latent space learned from RGB images. The 3D rotation of the object is estimated by building a codebook within a template-matching architecture. To build the latent space from the RGB images, this paper proposes a network based on a variant of the Adversarial Autoencoder (Makhzani et al. in Computer Science, 2015). The network is trained on a dataset without pose annotations, and its encoder and decoder are not structurally symmetric. The encoder is inspired by existing models (Yang et al. in proceedings of IJCAI, 2018), (Yang et al. in proceedings of CVPR, 2019) that extract features from two different streams. With this network, a latent feature vector that implicitly represents the orientation of the object is obtained from the RGB image. Experimental results show that the proposed method can estimate the 6D pose of the object and that it is more accurate than an existing advanced method (Sundermeyer et al. in proceedings of ECCV, 2018).
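
The codebook lookup described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each codebook entry pairs a latent vector (from a rendered view) with the rotation used to render it, and that matching uses cosine similarity, which the abstract does not specify. The names `cosine_similarity` and `lookup_rotation`, the Euler-angle labels, and the toy 3-dimensional latent vectors are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical rotation label: Euler angles in degrees (an assumption for illustration).
Rotation = Tuple[float, float, float]
CodebookEntry = Tuple[Sequence[float], Rotation]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two latent vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def lookup_rotation(query: Sequence[float], codebook: List[CodebookEntry]) -> Rotation:
    """Return the rotation of the codebook entry whose latent vector best matches the query."""
    if not codebook:
        raise ValueError("codebook must contain at least one entry")
    best_entry = max(codebook, key=lambda entry: cosine_similarity(query, entry[0]))
    return best_entry[1]


# Toy codebook: in practice each latent vector would come from encoding a rendered view.
codebook: List[CodebookEntry] = [
    ([1.0, 0.0, 0.0], (0.0, 0.0, 0.0)),
    ([0.0, 1.0, 0.0], (0.0, 90.0, 0.0)),
    ([0.0, 0.0, 1.0], (90.0, 0.0, 0.0)),
]

# Latent vector of a (hypothetical) test image, as produced by the encoder.
query = [0.1, 0.9, 0.2]
estimated = lookup_rotation(query, codebook)
```

Here the query lies closest in angle to the second entry, so `estimated` is the rotation stored with that entry. A real codebook would hold many rendered views, and a vectorised search would replace the linear scan.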

Original language: English
Pages (from-to): 9581-9596
Number of pages: 16
Journal: Neural Processing Letters
Issue number: 7
State: Published - Dec 2023
Externally published: Yes


  • 6D pose
  • Adversarial autoencoder
  • RGB image
  • Self-supervised learning

