Pose estimation is an essential technology for product grasping and assembly in intelligent manufacturing. Finding local correspondences between the 2-D image and the 3-D model is the key step in estimating the 6-D pose of an object. However, when the objects are textureless, it is difficult to identify distinguishable point features. In this article, we propose a novel deep learning framework called the pseudo-Siamese graph matching network to tackle the problem of feature matching of textureless objects and estimate accurate object poses with a single RGB-only image. We utilize a pseudo-Siamese network structure to learn the similarity between the 2-D image features and the 3-D mesh model of the object. A fully convolutional network and a graph convolutional network are used to extract high-dimensional deep features of the 2-D image and the 3-D model, respectively. Dense 2-D–3-D correspondences are inferred using the pseudo-Siamese matching network. Then, the pose of the object is calculated by the Perspective-n-Point and random sample consensus (RANSAC) methods. Experiments on the LINEMOD dataset and a grasping task on metal parts show the accuracy and robustness of our proposed method.
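To make the matching step concrete, the following is a minimal sketch of how dense 2-D–3-D correspondences could be read off once per-pixel and per-vertex features are available. It is an illustration under stated assumptions, not the network described above: the cosine-similarity score, the `min_sim` threshold, the function names, and the toy feature vectors are all hypothetical stand-ins for the learned similarity. In a full pipeline, the resulting pixel-to-vertex pairs would be passed to a Perspective-n-Point solver with RANSAC (for example, OpenCV's `solvePnPRansac`) to recover the pose.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    # A zero vector carries no direction, so treat its similarity as 0.
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def match_pixels_to_vertices(pixel_feats, vertex_feats, min_sim=0.5):
    """For each pixel, pick the most similar mesh vertex.

    Returns a list of (pixel_index, vertex_index, similarity) tuples,
    keeping only pixels whose best match reaches min_sim (an assumed,
    hand-picked threshold rather than a learned one).
    """
    matches = []
    for i, pf in enumerate(pixel_feats):
        sims = [cosine(pf, vf) for vf in vertex_feats]
        j = max(range(len(sims)), key=sims.__getitem__)
        if sims[j] >= min_sim:
            matches.append((i, j, sims[j]))
    return matches


# Toy features standing in for the outputs of the image branch
# (one vector per pixel) and the mesh branch (one vector per vertex).
pixel_feats = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.1], [0.0, 0.0, 0.0]]
vertex_feats = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]

matches = match_pixels_to_vertices(pixel_feats, vertex_feats)
for pixel_idx, vertex_idx, sim in matches:
    print(f"pixel {pixel_idx} -> vertex {vertex_idx} (similarity {sim:.3f})")
```

In this toy case the third pixel has an all-zero feature vector, so it falls below the threshold and yields no correspondence; discarding weak matches in this way reduces the number of outliers that RANSAC later has to reject.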
               