Learning unknown objects in the environment is important for detection and manipulation tasks. Before unknown objects can be learned, ground-truth labels must be provided. Data annotation, or labeling, can be achieved in a number of ways, but the most widely used method is still manual annotation. Although manual annotation has shown superior performance, it limits robots' capabilities to known object instances and is also time-consuming. This letter considers the aforementioned limitations and presents a method that allows robots to autonomously annotate objects from observations of human–object interactions. Specifically, we present a novel method that segments handheld objects in real time using a class-agnostic deep comparison and segmentation network. The inputs to the network are the RGB-D data of a known object template and a search space, and it outputs a pixel-wise label of the object and an objectness score. The score indicates the likelihood that the same object is present in both inputs. The object template is manually initialized in the first frame; thereafter, the object is segmented and the template is updated online, with the update strategically guided by the likelihood score. The segmented object regions are accumulated as pseudo-ground-truth labels, which are used for object learning. The approach efficiently handles both rigid and highly deformable objects.
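The pipeline the abstract describes — segment the template object in each frame, gate the result with the objectness score, and both store the mask as a pseudo-label and refresh the template when the score is high — can be summarized in a few lines. The sketch below is an illustrative assumption rather than the authors' implementation: the `network` callable, the `crop_to_mask` helper, and the 0.8 threshold are hypothetical placeholders standing in for the comparison-and-segmentation network and its tuning.

```python
import numpy as np

def crop_to_mask(frame, mask):
    """Crop an RGB-D frame (H x W x C array) to the bounding box of a
    binary mask (H x W array). Hypothetical helper for the sketch."""
    ys, xs = np.nonzero(mask)
    return frame[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

def collect_pseudo_labels(network, template, frames, score_threshold=0.8):
    """Segment the template object in each incoming frame, keep confident
    masks as pseudo-ground-truth labels, and refresh the template online.

    `network(template, frame)` is assumed to return a pixel-wise binary
    mask and an objectness score in [0, 1]; the threshold value is an
    illustrative assumption, not a value from the paper.
    """
    pseudo_labels = []
    for frame in frames:
        mask, score = network(template, frame)
        # Only accept segmentations the network is confident about, so
        # low-quality masks never pollute the accumulated labels.
        if score >= score_threshold and np.any(mask):
            pseudo_labels.append((frame, mask))
            # Refreshing the template from the latest confident mask lets
            # the tracker follow appearance changes, e.g. a deformable
            # object changing shape in the user's hand.
            template = crop_to_mask(frame, mask)
    return pseudo_labels, template
```

Gating both the pseudo-label and the template update on the same score is the key design point: a stale template degrades gracefully (it is simply kept until a confident match reappears), while a bad template would corrupt every subsequent segmentation.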