For many natural language processing applications, estimating the similarity and relatedness between words is a key task that serves as the basis for classification and generalization. Vector semantic models (VSMs) have become a fundamental language modeling tool. VSMs represent words as points in a high-dimensional space and follow the distributional hypothesis of meaning, which assumes that semantic similarity is reflected in shared contexts. In this paper, we propose a model whose representations are based on the semantic features associated with a concept in the ConceptNet knowledge graph. The proposed model builds on a vector symbolic architecture framework, which defines a set of arithmetic operations to encode the semantic features within a single high-dimensional vector. These vector representations incorporate several types of information beyond word distribution. Moreover, owing to the properties of high-dimensional spaces, they have the additional advantage of being interpretable. We analyze the model’s performance on the SimLex-999 dataset, a dataset on which commonly used distributional models (e.g., word2vec or GloVe) perform poorly. Our results are comparable to those of other hybrid models and surpass several state-of-the-art distributional and knowledge-based models.
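
To illustrate the kind of encoding the abstract describes, the minimal sketch below bundles ConceptNet-style (relation, concept) features into a single hypervector using a vector symbolic architecture. It assumes bipolar random hypervectors, element-wise multiplication for binding, and sign-of-sum for bundling; the specific feature triples for "cat" and "dog" are invented for illustration and are not taken from the paper or from ConceptNet.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 10_000  # typical hyperdimensional vector size

# Random bipolar hypervector for each symbol, created on demand.
memory = {}
def hv(symbol):
    if symbol not in memory:
        memory[symbol] = rng.choice([-1, 1], size=DIM)
    return memory[symbol]

def bind(a, b):
    # Element-wise multiplication binds a relation to its filler;
    # for bipolar vectors this operation is its own inverse.
    return a * b

def bundle(vectors):
    # Summation followed by sign superposes several bound pairs
    # into a single hypervector of the same dimensionality.
    return np.sign(np.sum(vectors, axis=0))

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical ConceptNet-style features for two concepts.
cat_features = [("IsA", "animal"), ("HasA", "tail"), ("CapableOf", "purr")]
dog_features = [("IsA", "animal"), ("HasA", "tail"), ("CapableOf", "bark")]

cat = bundle([bind(hv(rel), hv(obj)) for rel, obj in cat_features])
dog = bundle([bind(hv(rel), hv(obj)) for rel, obj in dog_features])

# Shared features yield a high similarity between the two concept vectors.
print("cat vs dog:", cosine(cat, dog))

# Interpretability: unbinding a relation approximately recovers its filler.
print("IsA filler of cat ~ animal:", cosine(bind(cat, hv("IsA")), hv("animal")))
```

Because binding with a bipolar vector is self-inverse, probing the bundled vector with a relation hypervector approximately recovers the associated concept, which is one way such representations can be inspected; the actual operations and feature extraction used in the paper may differ from this sketch.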
               