Complete Random Forest Based Class Noise Filtering Learning for Improving the Generalizability of Classifiers

Existing noise detection methods rely on classifiers, distance measures, or the overall data distribution. The 'curse of dimensionality' and other restrictions make them insufficiently effective on complex data, e.g., data with different attribute weights, high dimensionality, feature noise, or nonlinearity. This is also the main reason that existing noise filtering methods have not been widely applied and have not formed an effective learning framework. To address this problem, we propose a complete and efficient random forest method (CRF) designed specifically for class noise detection, which works by simulating grid generation and expansion. The CRF does not depend on distance measures, the overall distribution, or classifiers; in addition, its voting mechanism allows it to effectively process datasets containing feature noise. Furthermore, we introduce a CRF-based class noise filtering learning framework (CRF-NFL) and derive its mathematical model. The framework is then applied to many widely used classifiers, including some state-of-the-art algorithms, e.g., k-means tree, GBDT, and XGBoost. Moreover, a parallelized version is designed for large-scale data. CRF-NFL shows much better generalizability than the conventional classifiers and the relative density-based method, which is the most effective noise filtering method as far as we know. All of this research has been released as an open source library called CRF-NFL: http://www.cquptshuyinxia.com/CRF-NFL.html.
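
The abstract does not spell out how CRF generates or expands its grids, so the following is only a minimal Python sketch of the general idea of vote-based class noise filtering followed by learning. It is not the authors' algorithm. It assumes an ensemble of randomly split trees in which a tree votes a sample as noisy when its label disagrees with the majority label of its leaf. The function names (`random_leaves`, `noise_scores`, `filter_then_fit`) and the parameters (`min_leaf`, `max_depth`, `threshold`) are hypothetical and chosen for illustration.

```python
import random
from collections import Counter


def random_leaves(X, idx, rng, min_leaf=3, max_depth=6):
    """Partition sample indices with random axis-aligned splits.

    Illustrative stand-in for a random tree: no labels, distances, or
    classifiers are used to choose the split.
    """
    if max_depth == 0 or len(idx) < 2 * min_leaf:
        return [idx]
    f = rng.randrange(len(X[0]))             # random feature
    vals = [X[i][f] for i in idx]
    t = rng.uniform(min(vals), max(vals))    # random threshold
    left = [i for i in idx if X[i][f] < t]
    right = [i for i in idx if X[i][f] >= t]
    if len(left) < min_leaf or len(right) < min_leaf:
        return [idx]                         # reject splits that leave tiny leaves
    return (random_leaves(X, left, rng, min_leaf, max_depth - 1)
            + random_leaves(X, right, rng, min_leaf, max_depth - 1))


def noise_scores(X, y, n_trees=100, seed=0):
    """Fraction of trees in which a sample disagrees with its leaf majority."""
    rng = random.Random(seed)
    votes = [0] * len(X)
    for _ in range(n_trees):
        for leaf in random_leaves(X, list(range(len(X))), rng):
            majority = Counter(y[i] for i in leaf).most_common(1)[0][0]
            for i in leaf:
                if y[i] != majority:
                    votes[i] += 1
    return [v / n_trees for v in votes]


def filter_then_fit(X, y, fit, threshold=0.5, n_trees=100, seed=0):
    """Drop samples voted noisy by more than `threshold` of trees, then train.

    `fit` is any callable taking (X, y); the threshold is an assumption.
    """
    scores = noise_scores(X, y, n_trees=n_trees, seed=seed)
    keep = [i for i, s in enumerate(scores) if s <= threshold]
    model = fit([X[i] for i in keep], [y[i] for i in keep])
    return model, scores
```

Any classifier's training routine can be passed as `fit`, which mirrors the filter-then-learn structure the abstract describes. The 0.5 threshold and the tree settings are arbitrary illustrative choices, not values taken from the paper.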

Keywords: class noise; random forest; noise filtering; crf; noise

Journal Title: IEEE Transactions on Knowledge and Data Engineering
Year Published: 2019
