The stability of proteins is an essential property that has several biological implications. Knowledge about protein stability is important in many ways, ranging from protein purification and structure determination to… Click to show full abstract
The stability of proteins is an essential property that has several biological implications. Knowledge about protein stability is important in many ways, ranging from protein purification and structure determination to stability in cells and biotechnological applications. Experimental determination of thermal stabilities has been tedious and available data have been limited. The introduction of limited proteolysis and mass spectrometry approaches has facilitated more extensive cellular protein stability data production. We collected melting temperature information for 34,913 proteins and developed a machine learning predictor, ProTstab2, by utilizing a gradient boosting algorithm after testing seven algorithms. The method performance was assessed on a blind test data set and showed a Pearson correlation coefficient of 0.753 and root mean square error of 7.005. Comparison to previous methods indicated that ProTstab2 had superior performance. The method is fast, so it was applied to predict and compare the stabilities of all proteins in human, mouse, and zebrafish proteomes for which experimental data were not determined. The tool is freely available.
               
Click one of the above tabs to view related content.