Training and testing process for the classification of biomedical datasets in machine learning is very important. The researcher should choose carefully the methods that should be used at every step.… Click to show full abstract
Training and testing process for the classification of biomedical datasets in machine learning is very important. The researcher should choose carefully the methods that should be used at every step. However, there are very few studies on method choices. The studies in the literature are generally theoretical. Besides, there is no useful model for how to select samples in the training and testing process. Therefore, there is a need for resources in machine learning that discuss the training and testing process in detail and offer new recommendations. This article provides a detailed analysis of the training and testing process in machine learning. The article has the following sections. The third section describes how to prepare the datasets. Four balanced datasets were used for the application. The fourth section describes the rate and how to select samples at the training and testing stage. The fundamental sampling theorem is the subject of statistics. It shows how to select samples. In this article, it has been proposed to use sampling methods in machine learning training and testing process. The fourth section covers the theoretic expression of four different sampling theorems. Besides, the results section has the results of the performance of sampling theorems. The fifth section describes the methods by which training and pretest features can be selected. In the study, three different classifiers control the performance. The results section describes how the results should be analyzed. Additionally, this article proposes performance evaluation methods to evaluate its results. This article examines the effect of the training and testing process on performance in machine learning in detail and proposes the use of sampling theorems for the training and testing process. According to the results, datasets, feature selection algorithms, classifiers, training, and test ratio are the criteria that directly affect performance. However, the methods of selecting samples at the training and testing stages are vital for the system to work correctly. In order to design a stable system, it is recommended that samples should be selected with a stratified systematic sampling theorem.
               
Click one of the above tabs to view related content.