With massive volumes of text, dimensionality reduction and efficient classification models have become key steps in microblog sentiment classification. The chi-square (χ²) statistic and TF-IDF are commonly used dimensionality reduction methods. When applied to microblog sentiment analysis, however, the traditional χ² statistic does not consider the probability that a given sentiment word occurs in a microblog text, and the TF-IDF weighting measure ignores synonyms in the text. This paper therefore proposes NewChi-TF-IDF, a feature selection method that combines form and semantics. In the classification stage, a single classifier generalizes poorly. To improve the generalization performance of Weibo sentiment classification, a differential evolution algorithm is introduced on top of an existing ensemble strategy: it assigns different excitation functions to multiple weak classifiers and trains the optimal weight distribution, which solves the problem that the weights of weak classifiers are difficult to determine. Experimental results show that the NewChi-TF-IDF feature selection method reduces the feature dimension, that the generalization ability of the proposed algorithm is enhanced, and that its average F-score is higher than those of the Ada-All and Vote-All classifiers.
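For readers unfamiliar with the two baseline measures named above, the following is a minimal sketch of the standard chi-square term-class statistic and the standard TF-IDF weight. It is not the NewChi-TF-IDF method, whose formula the abstract does not give. The function names, the toy corpus, and the unsmoothed `log(N / df)` form of IDF are assumptions made for illustration.

```python
import math

def chi_square(term, docs, labels, target):
    """Standard chi-square statistic between a term and a class (2x2 contingency table)."""
    n = len(docs)
    pairs = list(zip(docs, labels))
    a = sum(1 for d, y in pairs if term in d and y == target)      # term present, in class
    b = sum(1 for d, y in pairs if term in d and y != target)      # term present, other class
    c = sum(1 for d, y in pairs if term not in d and y == target)  # term absent, in class
    d_ = n - a - b - c                                             # term absent, other class
    denom = (a + c) * (b + d_) * (a + b) * (c + d_)
    return n * (a * d_ - b * c) ** 2 / denom if denom else 0.0

def tf_idf(term, doc, docs):
    """Standard TF-IDF: relative term frequency times log inverse document frequency."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in docs if term in d)
    return tf * math.log(len(docs) / df) if df else 0.0

# Hypothetical toy corpus of tokenized microblog posts (illustrative only).
docs = [["good", "movie"], ["good", "day"], ["bad", "movie"], ["bad", "day"]]
labels = ["pos", "pos", "neg", "neg"]

score_good = chi_square("good", docs, labels, "pos")    # perfectly class-correlated term
score_movie = chi_square("movie", docs, labels, "pos")  # term independent of the class
weight_good = tf_idf("good", docs[0], docs)
```

In this toy corpus, "good" appears only in positive posts and so receives a high chi-square score, while "movie" is spread evenly across both classes and scores zero. Note that neither measure treats synonyms as related terms, which is the gap the abstract attributes to plain TF-IDF.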
               