Low hemolytic therapeutic peptides have gained an edge over small molecule-based medicines. However, finding low hemolytic peptides in the laboratory is time-consuming and costly, and it requires mammalian red blood cells. Wet-lab researchers therefore often use in-silico prediction to select low hemolytic peptides before proceeding with in-vitro testing. The in-silico tools available for this purpose have the following limitations: (i) they do not provide predictions for peptides with N/C-terminal modifications; (ii) AI models depend on data, yet the datasets used to build existing tools do not contain peptide data generated over the past eight years; (iii) the performance of the available tools is low. To address these limitations, a novel framework is proposed in the current work. The proposed framework uses a recent dataset and an ensemble learning technique that combines the decisions produced by three deep learning algorithms: a bidirectional long short-term memory network, a bidirectional temporal convolutional network, and a 1-dimensional convolutional neural network. Deep learning algorithms are capable of extracting features from data on their own. However, instead of relying solely on deep learning-based features (DLF), handcrafted features (HCF) were also provided, so that the deep learning algorithms can learn features that are missing from the HCF and a better feature vector can be constructed by concatenating HCF and DLF. Additionally, ablation studies were carried out to understand the roles of the ensemble algorithm, HCF, and DLF in the proposed framework. These studies found that all three are crucial components of the proposed framework and that performance decreases when any one of them is removed. The mean values of the performance metrics Acc, Sn, Pr, Fs, Sp, Ba, and Mcc obtained by the proposed framework on the test data are ≈ 87, 85, 86, 86, 88, 87, and 73, respectively.
To aid the scientific community, the model developed from the proposed framework has been deployed as a web server at https://endl-hemolyt.anvil.app/.
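The abstract reports seven metrics by abbreviation only. As a minimal sketch, assuming they denote the standard binary-classification quantities (accuracy, sensitivity, precision, F-score, specificity, balanced accuracy, and Matthews correlation coefficient) and that all are reported on a 0 to 100 scale, they could be computed from a confusion matrix as shown below. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
from math import sqrt


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute seven binary-classification metrics, each scaled to 0-100.

    tp, fp, tn, fn are the confusion-matrix counts, with the positive
    class assumed to be "hemolytic" (an assumption, not stated in the abstract).
    """
    sn = safe_div(tp, tp + fn)                  # sensitivity (recall)
    sp = safe_div(tn, tn + fp)                  # specificity
    pr = safe_div(tp, tp + fp)                  # precision
    acc = safe_div(tp + tn, tp + fp + tn + fn)  # accuracy
    fs = safe_div(2 * pr * sn, pr + sn)         # F-score (F1)
    ba = (sn + sp) / 2                          # balanced accuracy
    mcc = safe_div(
        tp * tn - fp * fn,
        sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )                                           # Matthews correlation coefficient
    raw = {"Acc": acc, "Sn": sn, "Pr": pr, "Fs": fs, "Sp": sp, "Ba": ba, "Mcc": mcc}
    return {name: 100 * value for name, value in raw.items()}


if __name__ == "__main__":
    # Made-up counts for illustration only; they are not the paper's data.
    for name, value in binary_metrics(tp=85, fp=12, tn=88, fn=15).items():
        print(f"{name}: {value:.1f}")
```

Reading Mcc on the same scale as the other metrics means a reported value of 73 would correspond to a coefficient of about 0.73; this reading is an inference from the abstract, not something it states.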
               