The dihydrofolate reductase (DHFR) enzyme is crucial to cell growth and proliferation in the human body, making it an important target for cancer treatment. This study aims to predict the inhibitory activity (pXC50) of DHFR inhibitors using a quantitative structure-activity relationship (QSAR) model. Interpreting the QSAR model is vital for understanding the underlying physicochemical processes and for guiding structural optimisation. Multivariate adaptive regression splines (MARS), a non-parametric technique, is proposed to model the non-linear relationship between the predictor variables and the response variable in a high-dimensional dataset. The dataset consists of pXC50 activity values for 778 DHFR inhibitors, divided into an 80% training set for model building and a 20% testing set for model validation. For comparison, two baseline methods, a deep neural network (DNN) and partial least squares (PLS), are also applied to QSAR modeling. On the testing set, MARS achieves the best prediction accuracy according to several error measures, with RMSE, MAE, MAPE, and RMSPE of 0.96, 0.69, 0.11, and 0.15, respectively. The strengths of MARS lie in its robust handling of variable interactions, its prediction accuracy, and its interpretability, which avoids the black-box nature of neural networks. Thus, MARS can be considered an excellent tool for QSAR modeling of high-dimensional datasets while exploring non-linear patterns in the data.
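For readers unfamiliar with the four error measures reported above, the following is a minimal sketch of how they are commonly computed. It is not the authors' code. The function name `regression_errors` and the sample values are hypothetical, and the sketch assumes that MAPE and RMSPE are expressed as fractions rather than percentages, which appears consistent with the reported 0.11 and 0.15 but is not stated in the abstract.

```python
import math

def regression_errors(y_true, y_pred):
    """Return RMSE, MAE, MAPE and RMSPE for paired observed/predicted values.

    MAPE and RMSPE are returned as fractions (assumption: not percentages).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    # Relative errors divide by the observed value, so it must be non-zero.
    relative = [r / t for r, t in zip(residuals, y_true)]
    return {
        "RMSE": math.sqrt(sum(r * r for r in residuals) / n),
        "MAE": sum(abs(r) for r in residuals) / n,
        "MAPE": sum(abs(q) for q in relative) / n,
        "RMSPE": math.sqrt(sum(q * q for q in relative) / n),
    }

# Made-up pXC50 values for illustration only (not data from the study).
observed = [5.0, 6.0, 8.0, 4.0]
predicted = [5.5, 6.0, 7.0, 4.4]
errors = regression_errors(observed, predicted)
```

RMSE and RMSPE penalise large individual errors more heavily than MAE and MAPE, which is one reason several measures are usually reported together when comparing models.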
               