Infrared spectroscopy can extract analytical information from samples quickly and non-destructively. It can be applied to authenticating various Chinese herbal medicines, predicting the amount of defective product mixed in, and identifying geographical origin. In this paper, spectra of Cornus officinalis from 11 origins were taken as the research object, and an origin identification model for Cornus officinalis based on mid-infrared spectroscopy was established. First, principal component analysis (PCA) was applied to the absorbance data of Cornus officinalis in the wavenumber range of 551 to 3998 cm⁻¹. The extracted principal components retain more than 99.8% of the information in the original data. Second, the extracted principal components were used as input and the origin category as output to train an origin identification model with a support vector machine (SVM). This combination is referred to as the PCA-SVM model. Finally, the generalization ability of the PCA-SVM model was evaluated on an external test set. Three indicators, accuracy, F1 score, and the Kappa coefficient, were used to compare this model with other commonly used classification models: naive Bayes, decision trees, linear discriminant analysis, radial basis function neural network, and partial least squares discriminant analysis. The results show that the PCA-SVM model outperforms these models on accuracy, F1 score, and the Kappa coefficient. In addition, compared with an SVM model trained on the full-spectrum data, the PCA-SVM model not only removes redundant variables but also achieves higher accuracy. When used to identify the origin of Cornus officinalis, the model reaches an accuracy of 84.8%.
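The three evaluation indicators named above can be illustrated with a minimal sketch in plain Python. It is not the authors' code. The abstract does not say how the F1 score was averaged over the 11 origin classes, so macro averaging is assumed here, and the function names (`accuracy`, `macro_f1`, `cohen_kappa`) are hypothetical helpers introduced only for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted origin matches the true origin."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: agreement between prediction and truth, corrected for chance."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Agreement expected if predictions were independent of the true labels.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e) if p_e != 1 else 0.0


if __name__ == "__main__":
    # Toy labels for three made-up origins, not data from the paper.
    y_true = ["A", "A", "B", "B", "C", "C"]
    y_pred = ["A", "A", "B", "C", "C", "C"]
    print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred), cohen_kappa(y_true, y_pred))
```

Reporting the Kappa coefficient alongside accuracy is useful in a multi-class setting such as this one, because it discounts the agreement that would occur by chance given the class frequencies.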
               