Traditionally, neural-network approaches to word sense disambiguation (WSD) place a set of classifiers at the end of the model, which specializes them to a single set of words: those for which they were trained. This makes it impossible to apply the learned models to words not seen in the training corpus. This paper addresses a generalization of the WSD problem in order to solve it with deep neural networks without restricting the method to a fixed set of words, while achieving performance close to the state of the art at an acceptable computational cost. We explore different architectures based on multilayer perceptrons, recurrent cells (Long Short-Term Memory, LSTM, and Gated Recurrent Units, GRU), and a classifier model. Different sources and dimensions of word embeddings were tested as well. The main evaluation was performed on the Senseval 3 English Lexical Sample task. To assess the application to an unseen set of words, the learned models were evaluated on the completely unseen words of a different corpus (Senseval 2 English Lexical Sample), surpassing the random baseline.
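The following is a minimal sketch, not the paper's exact architecture, of the general idea of word-independent WSD: rather than one output classifier per target word, a recurrent encoder produces a context vector that is scored against candidate sense embeddings, so words unseen during training can still be disambiguated. All names, dimensions, and the scoring function are illustrative assumptions, written here in PyTorch.

```python
# Hypothetical sketch of a word-independent WSD scorer (not the authors' model).
# The context around the ambiguous word is encoded with a BiLSTM; the resulting
# vector is compared against candidate sense embeddings instead of being fed to
# a per-word softmax classifier.
import torch
import torch.nn as nn

class ContextSenseScorer(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=256, sense_dim=300):
        super().__init__()
        # Word embeddings (could be initialized from pretrained vectors).
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        # Project the BiLSTM representation of the target word into the sense space.
        self.project = nn.Linear(2 * hidden_dim, sense_dim)

    def forward(self, token_ids, target_pos, sense_vectors):
        # token_ids: (batch, seq_len) word indices of the sentence
        # target_pos: (batch,) position of the ambiguous word in each sentence
        # sense_vectors: (batch, n_senses, sense_dim) candidate sense embeddings
        emb = self.embed(token_ids)
        outputs, _ = self.encoder(emb)                       # (batch, seq_len, 2*hidden)
        target_repr = outputs[torch.arange(len(target_pos)), target_pos]
        context_vec = self.project(target_repr)              # (batch, sense_dim)
        # Score each candidate sense by its dot product with the context vector.
        scores = torch.bmm(sense_vectors, context_vec.unsqueeze(-1)).squeeze(-1)
        return scores                                         # (batch, n_senses)

# Usage: the predicted sense is the highest-scoring candidate for each instance.
model = ContextSenseScorer(vocab_size=10000)
tokens = torch.randint(0, 10000, (2, 12))
target = torch.tensor([3, 5])
senses = torch.randn(2, 4, 300)
predicted = model(tokens, target, senses).argmax(dim=-1)
```

Because the sense inventory is supplied at prediction time rather than baked into the output layer, the same parameters can be reused for words that never appeared in the training corpus, which is the generalization the abstract describes.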