"Urdu word sense disambiguation using machine learning approach"

This paper focuses on the word sense disambiguation (WSD) problem in the context of the Urdu language. WSD is the task of disambiguating text so that a machine (computer) can deduce the correct sense of each given word. WSD is critical for natural language engineering (NLE) tasks such as machine translation and speech processing. It also increases the performance of other tasks such as text retrieval, document classification and document clustering. WSD research has been conducted to varying extents for the computationally developed languages of the world. For Urdu, NLE research in general and WSD research in particular are still in their infancy, owing to the rich morphological structure of the language. In this paper, we use machine learning (ML) approaches, namely the Bayes net classifier (BN), support vector machine (SVM) and decision tree (DT), for WSD in native-script Urdu text. The results show that BN achieves a better F-measure than SVM and DT. The maximum F-measure, 0.711, was recorded for the Bayes net classifier on a raw Urdu corpus of 2.5 million words.
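The F-measure used to compare the classifiers above is the harmonic mean of precision and recall. The following is a minimal sketch of that computation for a single sense label. The function name `f_measure`, the sense labels `s1` and `s2`, and the toy data are all hypothetical and are not taken from the paper; the abstract does not say how scores were averaged across senses or words, so no averaging is shown.

```python
def f_measure(gold, predicted, sense):
    """Return (precision, recall, F1) for one sense label.

    gold and predicted are equal-length lists of sense labels,
    one entry per occurrence of the ambiguous word.
    """
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == sense and p == sense)  # correct hits
    fp = sum(1 for g, p in pairs if g != sense and p == sense)  # false alarms
    fn = sum(1 for g, p in pairs if g == sense and p != sense)  # misses
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels for six occurrences of one ambiguous word.
gold = ["s1", "s1", "s2", "s2", "s1", "s2"]
predicted = ["s1", "s2", "s2", "s2", "s1", "s1"]

precision, recall, f1 = f_measure(gold, predicted, "s1")
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because F1 is a harmonic mean, a classifier scores well only when both precision and recall are reasonably high, which is why it is a common single-number summary for WSD systems.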

Keywords: word; machine; sense disambiguation; word sense

Journal Title: Cluster Computing
Year Published: 2017
