Effect of various sequence descriptors in predicting human protein-protein interactions using ANN-based prediction models

Many sequence-based descriptors for proteins have been proposed. This study evaluates how well these descriptors predict protein-protein interactions (PPIs) on a benchmark dataset. The behavior of a protein inside or outside the cell is defined by its interactions with elements of the surrounding environment, which range from small metabolites to macromolecules such as RNA, DNA, and proteins. Understanding PPIs is therefore an important part of investigating the biological role of a protein. A protein's interactions are determined by how it folds in three-dimensional space, and this folding depends largely on the linear sequence of amino acids. This makes it possible to use protein sequences to computationally predict possible interactions among proteins. We used the benchmark dataset of interacting and non-interacting protein pairs provided by Pan et al. to build PPI prediction models with artificial neural networks (ANNs). We compared the efficacy of the descriptors on two datasets: one containing all protein pairs and one containing only proteins with less than 25% identity. The classification model built with conjoint-triad descriptors obtained the highest accuracy, outperforming the other descriptors. Feature ranking and selection were then performed on the conjoint-triad descriptor, and the model trained on the selected features was compared with the model trained on the full feature set. The model with reduced features shows less overfitting.
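To make the conjoint-triad descriptor concrete, here is a minimal Python sketch of how such a feature vector is commonly computed. The abstract does not give the study's implementation details, so everything here is an assumption: the seven-class amino acid grouping is the one widely used in the literature for this descriptor, the min-max normalization is one common choice, and the names `conjoint_triad` and `pair_features` are hypothetical helpers introduced only for illustration.

```python
from itertools import product

# ASSUMPTION: the commonly used seven-class amino acid grouping for
# conjoint-triad features; the abstract does not state which grouping was used.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}

# All 7 * 7 * 7 = 343 ordered triads of class indices, in a fixed order.
TRIADS = list(product(range(len(CLASSES)), repeat=3))


def conjoint_triad(sequence):
    """Return a 343-dimensional, min-max normalized triad frequency vector."""
    # Residues outside the 20 standard amino acids map to None.
    classes = [AA_TO_CLASS.get(aa) for aa in sequence.upper()]
    counts = dict.fromkeys(TRIADS, 0)
    for i in range(len(classes) - 2):
        window = tuple(classes[i:i + 3])
        if None in window:
            continue  # skip windows containing a non-standard residue
        counts[window] += 1
    values = [counts[t] for t in TRIADS]
    lo, hi = min(values), max(values)
    if hi == lo:
        # No usable triads (e.g. sequence shorter than 3): all-zero vector.
        return [0.0] * len(values)
    # ASSUMPTION: min-max scaling, to reduce the effect of sequence length.
    return [(v - lo) / (hi - lo) for v in values]


def pair_features(seq_a, seq_b):
    """Concatenate the two descriptors into one 686-dimensional pair vector."""
    return conjoint_triad(seq_a) + conjoint_triad(seq_b)
```

A vector produced by `pair_features` could then serve as the input to a classifier such as an artificial neural network. Concatenation is only one way to represent a protein pair, and it makes the representation depend on the order of the two proteins.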

Keywords: protein; protein interactions; descriptors; protein-protein; sequence; prediction

Journal Title: Current Bioinformatics
Year Published: 2021
