Heart disease is the leading cause of death worldwide. A Machine Learning (ML) system can detect heart disease in its early stages from clinical data, helping to reduce mortality. However, class imbalance and high dimensionality have been persistent challenges in ML, preventing accurate predictive data analysis in many real-world applications, including heart disease detection. This work proposes a new method that addresses these issues and improves the prediction of heart disease presence and patient survival. The method combines supervised infinite feature selection (Inf-FSs) to find the most significant features, an Improved Weighted Random Forest (IWRF) to predict heart disease, and Bayesian optimization to tune the new hyperparameters of IWRF. Two public datasets, Statlog and heart disease clinical records, were used to develop and validate the proposed model. The proposed model is compared with other hybrid models using accuracy and F-measure as performance metrics. The results show that the proposed Inf-FSs-IWRF achieved higher accuracy and F-measure than the other models on both datasets. Additionally, in a comparison with previous studies, the proposed model outperformed the others, improving accuracy by 2.4% and 4.6% on the two datasets, respectively.
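To illustrate why the F-measure is reported alongside accuracy when classes are imbalanced, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names (`confusion_counts`, `accuracy`, `f_measure`) and the toy labels are assumptions made for illustration, and the F-measure is taken to be the standard F-beta score, which reduces to F1 when beta equals 1.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(y_true, y_pred, positive=1):
    """Fraction of all samples that were classified correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f_measure(y_true, y_pred, positive=1, beta=1.0):
    """F-beta score of the positive class (F1 when beta == 1)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical imbalanced toy data: 90 healthy (0) and 10 diseased (1).
toy_true = [0] * 90 + [1] * 10
# A trivial classifier that always predicts the majority class.
toy_pred = [0] * 100

majority_accuracy = accuracy(toy_true, toy_pred)    # 0.9
majority_f_measure = f_measure(toy_true, toy_pred)  # 0.0
```

On this toy data the majority-class predictor reaches 90% accuracy while detecting no diseased patient at all, which the F-measure of 0 makes visible. This is the usual reason for pairing the two metrics on imbalanced clinical data.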
               