BACKGROUND Acute rheumatic fever (ARF) is an important disease that is frequently seen in Turkey, and solutions for treating it are needed. New data analysis methods applied to this disease may reveal previously unrecognised patterns, and data mining of existing records and data repositories may improve knowledge of the diagnosis and management of ARF. We therefore aimed to contribute to the development of new solutions by approaching the problem from a different standpoint.

OBJECTIVES The aim of this study is to analyse, in terms of cardiac disease, the effects of ARF experienced during childhood by using data mining methods.

MATERIALS AND METHODS Classification methods of data mining were used, and experiments were conducted on five algorithms. The records of patients diagnosed with ARF were analysed by building models with the naive Bayes classifier, decision trees (CART, C4.5, C5.0, C5.0 boosted) and random forest algorithms, and the performances of the resulting models were compared. For model performance evaluation, the hold-out, cross-validation and bootstrap methods were each applied in several configurations. The dataset comprised the records of 297 patients and was obtained in cooperation with İstanbul Medeniyet University Göztepe Training and Research Hospital's Pediatric Cardiology Clinic (İstanbul Medeniyet Üniversitesi Göztepe Eğitim ve Araştırma Hastanesi Çocuk Kardiyolojisi Kliniği). After pre-processing, data analysis was carried out on the records of the remaining 201 patients.

RESULTS The results obtained from the different algorithms were compared using the model performance evaluation criteria. The best result was obtained with the CART model using the hold-out technique (80% training, 20% testing).
According to this model, the importance values of the predictive attributes were listed, and it was found that the "teleNormal" and "cardiomegaly" attributes were not required for ARF diagnosis and treatment. This result suggests that patients may not need the chest x-ray that is used to assess "teleNormal" and "cardiomegaly". Omitting it would reduce costs, contributing to the health economy, and would spare patients unnecessary x-rays.

DISCUSSION AND CONCLUSION The results of this study showed that data mining techniques may be used to analyse diseases such as ARF. The attributes that most strongly affect the disease were identified from the results. The results of the best model (CART) may be extended in numerous ways and may provide information for both experienced and inexperienced physicians. This study is considered significant because it encourages wider use of data mining methods for data analysis in medicine and healthcare.
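
The three model performance evaluation techniques named in the methods (hold-out, cross-validation and bootstrap) differ only in how the patient records are divided into training and testing sets. The following is a minimal sketch of that division using the Python standard library. It is not the authors' code: the function names, the fixed random seeds and the choice of 10 folds are illustrative assumptions, and only the 80%/20% hold-out ratio and the figure of 201 records come from the abstract.

```python
import random


def holdout_split(n, train_fraction=0.8, seed=0):
    """Shuffle record indices 0..n-1 and cut them once into train/test."""
    rng = random.Random(seed)  # seed is an assumption, for repeatability
    indices = list(range(n))
    rng.shuffle(indices)
    cut = int(round(n * train_fraction))
    return indices[:cut], indices[cut:]


def kfold_splits(n, k=10, seed=0):
    """Yield k (train, test) pairs; each record is tested exactly once.

    k=10 is an assumed default; the abstract does not state the fold count.
    """
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def bootstrap_split(n, seed=0):
    """Draw n indices with replacement; never-drawn records form the test set."""
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]
    drawn = set(train)
    test = [i for i in range(n) if i not in drawn]
    return train, test


# 201 records remained after pre-processing, per the abstract.
n_records = 201
train_idx, test_idx = holdout_split(n_records, train_fraction=0.8)
print(len(train_idx), len(test_idx))
```

In each case a classifier would be fitted on the training indices and scored on the testing indices. With cross-validation and bootstrap, the scores would normally be averaged over the repeated splits.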
               