RNA 5-hydroxymethylcytosine (5hmC) is an important RNA modification that plays a vital role in several biological processes. Identifying 5hmC sites is currently an active research topic because it helps clarify the biological functions of this modification. In this study, we therefore developed a predictor called iRNA5hmC-HOC, which uses a high-order correlation information method to identify 5hmC sites. To build the model, 22 classes of dinucleotide physicochemical (PC) properties were used to represent RNA sequences, and the least absolute shrinkage and selection operator (LASSO) algorithm was adopted to select the most discriminative features. In the jackknife test, the proposed method achieved a classification accuracy of 89.80% with a support vector machine (SVM) classifier. Compared with state-of-the-art predictors, the proposed method significantly improves classification performance, which indicates that it might be a promising tool for identifying RNA 5hmC modification sites. The dataset and source code are available at https://figshare.com/articles/online_resource/iRNA5hmC-HOC/15177450.
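As a rough illustration of two ideas named in the abstract, the sketch below encodes an RNA sequence by a per-dinucleotide property value and scores a classifier with a jackknife (leave-one-out) test. It is not the authors' implementation. The property table `TOY_PROPERTY` holds made-up placeholder values rather than real physicochemical measurements, all function names are hypothetical, and a simple nearest-centroid rule stands in for the SVM. The high-order correlation features and the LASSO selection step are not reproduced here.

```python
from itertools import product

# Hypothetical placeholder values for ONE dinucleotide property.
# These are NOT real physicochemical measurements and are not from the paper.
DINUCLEOTIDES = ["".join(p) for p in product("ACGU", repeat=2)]
TOY_PROPERTY = {dn: i / 15.0 for i, dn in enumerate(DINUCLEOTIDES)}


def encode(seq, prop=TOY_PROPERTY):
    """Map an RNA sequence to the property values of its overlapping dinucleotides."""
    seq = seq.upper()
    return [prop[seq[i:i + 2]] for i in range(len(seq) - 1)]


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (NOT an SVM): return the label of the closest class mean."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        c = centroid([v for v, y in zip(train_x, train_y) if y == label])
        d = sum((a - b) ** 2 for a, b in zip(x, c))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


def jackknife_accuracy(xs, ys, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the rest, and report the fraction correct."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Toy usage with invented sequences and labels (1 = modified site, 0 = not).
toy_x = [encode(s) for s in ["AAAA", "AAAC", "UUUU", "UUUG"]]
toy_y = [0, 0, 1, 1]
toy_accuracy = jackknife_accuracy(toy_x, toy_y)
```

In a jackknife test every sample serves as the test case exactly once, so the reported accuracy uses all available data without ever scoring a sample the model was trained on.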
               