NeuroGrasp: Multimodal Neural Network With Euler Region Regression for Neuromorphic Vision-Based Grasp Pose Estimation

Hu Cao, Guang Chen, Zhijun Li, Yingbai Hu, Alois Knoll

Research output: Contribution to journal › Article › peer-review

24 Scopus citations


Grasp pose estimation is a crucial procedure in robotic manipulation. Most current robotic grasping systems are built on frame-based cameras such as RGB-D cameras. However, traditional frame-based grasp pose estimation methods struggle in scenarios that call for high dynamic range or low power consumption. In this work, a neuromorphic vision sensor, the dynamic and active-pixel vision sensor (DAVIS), is introduced to the field of robotic grasping. DAVIS is an event-based, bio-inspired vision sensor that records asynchronous streams of local pixel-level light intensity changes, called events. Its strengths are high temporal resolution, high dynamic range, low power consumption, and the absence of motion blur. We construct a neuromorphic vision-based robotic grasp dataset with 154 moving objects, named NeuroGrasp, which is, to the best of our knowledge, the first RGB-Event multimodal grasp dataset. The dataset records both RGB frames and the corresponding event streams, so it provides frame data with rich color and texture information alongside event streams with high temporal resolution and high dynamic range. Based on the NeuroGrasp dataset, we further develop a multimodal neural network with a dedicated Euler region regression sub-network (ERRN) to perform grasp pose estimation. By combining frame-based and event-based vision, the proposed method outperforms methods that take only RGB frames or only event streams as input on the NeuroGrasp dataset.
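To make the notion of an event stream concrete, the following minimal sketch accumulates events into a two-channel count image and stacks it with an RGB frame. It is an illustration only, not the method of the paper: the `(x, y, t, polarity)` tuple layout, the function names `events_to_frame` and `stack_rgb_and_events`, and the simple channel concatenation are all assumptions, and the abstract does not describe how the actual network represents or fuses the two modalities.

```python
import numpy as np


def events_to_frame(events, height, width):
    """Accumulate events into a (2, H, W) count image.

    Each event is assumed to be a tuple (x, y, t, polarity), with
    polarity > 0 for a brightness increase and <= 0 for a decrease.
    Channel 0 counts positive events, channel 1 negative events.
    """
    frame = np.zeros((2, height, width), dtype=np.float32)
    if len(events) == 0:
        return frame
    ev = np.asarray(events, dtype=np.float64)
    x = ev[:, 0].astype(np.int64)
    y = ev[:, 1].astype(np.int64)
    channel = np.where(ev[:, 3] > 0, 0, 1)
    # np.add.at accumulates correctly when the same pixel fires repeatedly.
    np.add.at(frame, (channel, y, x), 1.0)
    return frame


def stack_rgb_and_events(rgb, event_frame):
    """Hypothetical early fusion: concatenate RGB (3, H, W) and events (2, H, W)."""
    return np.concatenate([rgb.astype(np.float32), event_frame], axis=0)


# Three events on a 2x3 sensor: two positive at pixel (x=1, y=0),
# one negative at pixel (x=2, y=1).
events = [(1, 0, 0.001, 1), (1, 0, 0.002, 1), (2, 1, 0.003, -1)]
event_frame = events_to_frame(events, height=2, width=3)
fused = stack_rgb_and_events(np.zeros((3, 2, 3)), event_frame)
```

Collapsing events into counts discards their fine timing, which is one of the sensor's main advantages; representations that keep the timestamps, such as time surfaces or voxel grids, are common alternatives.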

Original language: English
Article number: 2511111
Journal: IEEE Transactions on Instrumentation and Measurement
State: Published - 2022


  • Euler region regression sub-network (ERRN)
  • Grasp pose estimation
  • Multimodal fusion
  • Vision-based robotic manipulation

