Study on suitability and importance of multilayer extreme learning machine for classification of text data

The dynamic Web, which contains a huge number of digital documents, is expanding day by day, and searching for a particular document in such a large collection has become a tough challenge. Text classification can speed up search and retrieval tasks and is therefore the need of the hour. To that end, this study proposes an efficient technique that combines the connected components (CC) of a graph and WordNet with four established feature selection techniques (TF-IDF, Chi-square, Bi-Normal Separation (BNS), and Information Gain (IG)) to select the best features from a given input dataset and prepare an efficient training feature vector. Next, a multilayer extreme learning machine (ML-ELM), which is based on a deep learning architecture, and other state-of-the-art classifiers are trained on this feature vector to classify text data. The experiments were carried out on the DMOZ and 20-Newsgroups datasets. We studied the behavior of the different classifiers under each of the four feature selection techniques and compared their results. ML-ELM achieved the highest overall F-measure among the compared classifiers: 72.28% on the DMOZ dataset with TF-IDF as the feature selection technique, and 81.53% on the 20-Newsgroups dataset with BNS. This signifies the usefulness of the deep learning used by ML-ELM for classifying text data. The experimental results on these benchmark datasets show the stability and effectiveness of our approach over other competing approaches.
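
To make the feature selection step more concrete, here is a minimal sketch of TF-IDF-based term ranking followed by top-k selection, written in plain Python. It is not the method from the paper: it omits the connected-component and WordNet stages entirely, and the aggregation rule (summing TF-IDF over documents), the function names `tfidf_scores`, `select_top_k`, and `vectorize`, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return {term: TF-IDF summed over all documents} for tokenized docs."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        counts = Counter(doc)
        for term, count in counts.items():
            tf = count / len(doc)               # relative term frequency
            idf = math.log(n_docs / df[term])   # unsmoothed IDF (assumption)
            scores[term] += tf * idf
    return dict(scores)


def select_top_k(docs, k):
    """Keep the k terms with the highest aggregate TF-IDF score."""
    scores = tfidf_scores(docs)
    # Sort by score descending, then alphabetically so ties are deterministic.
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:k]


def vectorize(doc, vocabulary):
    """Map one tokenized document to raw term counts over the vocabulary."""
    counts = Counter(doc)
    return [counts[term] for term in vocabulary]


# Hypothetical toy corpus, already tokenized.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "stocks fell as the market closed".split(),
]
vocabulary = select_top_k(docs, k=4)
vectors = [vectorize(doc, vocabulary) for doc in docs]
```

A term that occurs in every document, such as "the" in this toy corpus, gets an IDF of zero and therefore drops out of the selected vocabulary. The resulting `vectors` play the role of a reduced training feature vector that would then be passed to a classifier.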

Keywords: classification; feature selection; feature; text data; multilayer extreme

Journal Title: Soft Computing
Year Published: 2017

