For the past few decades, machines have replaced humans in several disciplines. However, machine cognition still lags behind human capabilities. In this work, we address the ability of machines to recognize human-drawn sketches. Visual representations, such as sketches, have long been a medium of communication for humans. For artificially intelligent systems to be immersed effectively in interactive environments, machines must understand such notations. The abstract nature and varied artistic styling of these sketches make automatic recognition of drawings more challenging than other areas of image classification. In this article, we represent sketches as sequences of strokes, i.e., as vector images, to capture the long-term temporal dependencies in hand-drawn sketches. The proposed approach combines the self-attention capabilities of Transformers with temporal convolution networks (TCNs), which exploit these long-term temporal dependencies, for sketch recognition. The confidence scores obtained from the two techniques are combined using a triangular norm (T-norm). Attention heat maps are plotted to isolate the discriminating parts of a sketch that contribute to its classification. Extensive quantitative and qualitative evaluation confirms that the proposed network performs favorably against state-of-the-art techniques.
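
The score-fusion step can be illustrated with a minimal sketch. The abstract does not say which T-norm is used, so the code below shows three standard T-norms (minimum, product, and Lukasiewicz) as assumptions. The function names, the renormalization step, and the example scores are hypothetical and chosen only for illustration; they are not taken from the article.

```python
from typing import Callable, List, Sequence

# A T-norm maps two values in [0, 1] to a value in [0, 1]. It is commutative,
# associative, monotone, and has 1 as its identity element.
TNorm = Callable[[float, float], float]


def t_norm_min(a: float, b: float) -> float:
    """Minimum (Goedel) T-norm."""
    return min(a, b)


def t_norm_product(a: float, b: float) -> float:
    """Product T-norm."""
    return a * b


def t_norm_lukasiewicz(a: float, b: float) -> float:
    """Lukasiewicz T-norm."""
    return max(0.0, a + b - 1.0)


def fuse_scores(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    t_norm: TNorm = t_norm_product,
) -> List[float]:
    """Fuse two per-class confidence vectors class by class with a T-norm.

    The result is renormalized to sum to 1 when possible. The renormalization
    is an assumption of this sketch, not something stated in the abstract.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both models must score the same set of classes")
    fused = [t_norm(a, b) for a, b in zip(scores_a, scores_b)]
    total = sum(fused)
    if total == 0.0:
        return fused  # every class was driven to zero; nothing to normalize
    return [f / total for f in fused]


def predict(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    t_norm: TNorm = t_norm_product,
) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(scores_a, scores_b, t_norm)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical softmax outputs of the two branches for a 3-class problem.
transformer_scores = [0.6, 0.3, 0.1]
tcn_scores = [0.2, 0.7, 0.1]

predicted_class = predict(transformer_scores, tcn_scores, t_norm_product)
```

With these illustrative scores, the product T-norm favors the class on which both branches agree reasonably well, rather than the class that only one branch scores highly. Swapping in `t_norm_min` or `t_norm_lukasiewicz` changes how strongly disagreement between the branches is penalized.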
               