The Empirical Study of Semi-Supervised Deep Fuzzy C-Mean Clustering for Software Fault Prediction

Software fault prediction is an important research topic in software quality assurance. The performance of a fault prediction model depends on the features used to train it; redundant and irrelevant features can hinder the performance of a classification model. In this paper, we present an empirical study of a two-stage data pre-processing technique for software fault prediction models. In the first stage, a novel semi-supervised deep Fuzzy C-Mean (DFCM) clustering-based feature extraction technique creates new features from deep multi-clusters of unlabeled and labeled data sets, and tends to maximize intra-cluster class and intra-cluster feature by using FCM clustering. FCM is also used to handle the class imbalance problem. In the second stage, we further improve prediction performance by combining the approach with feature selection (using random under-sampling) to reduce noisy data for classification. The results indicate that models combined with the novel DFCM data pre-processing approach perform better, owing to its ability to identify and combine essential information in the data features. The empirical study is conducted on real-world software project data sets (NASA and Eclipse) and evaluates DFCM by applying different data pre-processing schemes to prediction models (C4.5, naive Bayes, and 1-nearest neighbor (1-NN)) that are widely used in software fault prediction; we also investigate the factors that influence our approach. The results show that the proposed DFCM feature extraction technique for data pre-processing is stable and effective on all prediction models.
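
The abstract does not spell out the DFCM algorithm, so the following is only a minimal sketch of the plain fuzzy C-means step it builds on, not the authors' method. It assumes, purely for illustration, that the cluster membership degrees are appended to the original features as new columns. The function name `fuzzy_c_means`, its parameters, and the synthetic two-group data are hypothetical and do not come from the paper.

```python
import math
import random


def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy C-means (illustrative). Returns (centers, memberships).

    memberships[i][k] is the degree to which points[i] belongs to cluster k;
    each row sums to 1. m > 1 is the usual fuzzifier.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships, normalized so each row sums to 1.
    U = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        U.append([u / total for u in row])
    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by membership**m.
        centers = []
        for k in range(n_clusters):
            weights = [U[i][k] ** m for i in range(len(points))]
            wsum = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ])
        # Membership update: u_ik is proportional to dist_ik ** (-2 / (m - 1)).
        shift = 0.0
        for i, p in enumerate(points):
            inv = [
                (math.dist(p, c) + 1e-12) ** (-2.0 / (m - 1.0))
                for c in centers
            ]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, U[i])))
            U[i] = new_row
        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centers, U


# Synthetic toy data: two well-separated groups of 2-D feature vectors.
# These are made-up numbers, not the NASA or Eclipse data sets.
data_rng = random.Random(1)
points = [[data_rng.gauss(0.0, 0.5), data_rng.gauss(0.0, 0.5)] for _ in range(30)]
points += [[data_rng.gauss(5.0, 0.5), data_rng.gauss(5.0, 0.5)] for _ in range(30)]

centers, U = fuzzy_c_means(points, n_clusters=2)

# Assumed feature-extraction step: append the memberships as extra columns.
augmented = [p + u for p, u in zip(points, U)]
```

Because the memberships are soft, a module lying between two clusters contributes partial evidence to both, which is one plausible reason a fuzzy clustering is preferred over a hard one when deriving features. The deep multi-cluster stacking, the semi-supervised use of labels, and the random under-sampling stage described in the abstract are not covered by this sketch.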

Keywords: fault prediction; software fault; performance; prediction; software

Journal Title: IEEE Access
Year Published: 2018
