"Multilabel Feature Selection Using Relief and Minimum Redundancy Maximum Relevance Based on Neighborhood Rough Sets"

Multilabel classification has attracted increasing interest in machine learning and artificial intelligence. However, the sample distances used in most Relief methods easily make the heterogeneous or similar samples abnormal when the distances are very large. In addition, the classification margin used as a neighborhood radius in some reduction algorithms may be meaningless when the margin is too large. To overcome these drawbacks, this paper presents a multilabel feature selection method using an improved Relief and minimum redundancy maximum relevance (MRMR) based on neighborhood rough sets. First, the numbers of heterogeneous and similar samples are introduced to improve the label weighting method, which eliminates the influence of large sample distances. Combined with the new label weighting, the distances between a sample and its nearest-neighbor heterogeneous sample and between the sample and its nearest-neighbor similar sample are used to develop a new feature weighting method. Second, the numbers of heterogeneous and similar samples are further used to improve the classification margin, thereby constraining the neighborhood radius; on this basis, the neighborhood approximation accuracy is constructed to effectively measure the uncertainty of samples in the boundary region and the completeness of knowledge. Third, by integrating the new neighborhood approximation accuracy, two types of mutual information are proposed, one between features and labels and one among features, and the mutual information-based MRMR model is then used to evaluate the significance of features. Finally, a multilabel feature selection algorithm is designed to improve classification performance on multilabel data. Experimental results on thirteen public datasets illustrate the effectiveness of the developed algorithm, which selects significant features and achieves strong classification performance on multilabel datasets.
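
To make the MRMR step more concrete, the following is a minimal sketch of generic greedy MRMR selection for multilabel data. It is not the paper's algorithm: it substitutes plain empirical Shannon mutual information on discrete features for the neighborhood-rough-set-based mutual information described above, and it omits the improved Relief weighting entirely. The function names (`mutual_information`, `mrmr_select`), the averaging of relevance over labels, and the difference-form score (relevance minus mean redundancy) are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical Shannon mutual information (bits) of two discrete sequences.

    Assumption: stands in for the paper's neighborhood-based measure.
    """
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedily pick k feature names by relevance minus mean redundancy.

    features: dict mapping a feature name to a list of discrete values.
    labels:   list of label columns (one list of values per label).
    """
    # Relevance: mutual information with each label, averaged over labels
    # (the averaging is an illustrative choice, not taken from the paper).
    relevance = {
        name: sum(mutual_information(col, lab) for lab in labels) / len(labels)
        for name, col in features.items()
    }
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:

        def score(name):
            if not selected:
                return relevance[name]
            # Redundancy: mean mutual information with already chosen features.
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

On a toy dataset where one feature duplicates another, the redundancy term steers the second pick away from the duplicate and toward a feature that carries information about a different label, which is the behavior the MRMR criterion is meant to produce.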

Keywords: based neighborhood; multilabel feature; relief minimum; feature selection; feature

Journal Title: IEEE Access
Year Published: 2020
