Differential analysis in proteomics is pivotal for biomarker discovery and disease mechanism elucidation, yet traditional statistical methods are constrained by distributional assumptions and a dependence on empirical fold-change thresholds. This study systematically evaluates 18 unsupervised anomaly detection machine learning (ML) algorithms against established statistical frameworks for differential protein detection from proteomic data sets. Using in silico simulated data sets derived from experimental data, we made the algorithms' outputs comparable through a probability-based transformation. The results show that ML methods, particularly the Minimum Covariance Determinant (MCD), outperformed statistical tests in recall, precision, and accuracy, and were more robust to intersample heterogeneity. Validation on real-world proteomic data further confirmed that the MCD-identified differentially expressed proteins comprehensively covered canonical pathways while uncovering novel tumor-associated functional biomolecules. This work establishes unsupervised ML methods as robust alternatives to traditional hypothesis-driven statistical approaches in proteomics differential analysis, offering enhanced reliability for precision medicine research.
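To make the idea of MCD-based anomaly detection with a probability-style output more concrete, the following is a minimal sketch and not the study's pipeline. It assumes each protein is summarized by two hypothetical features (mean log2 abundance in a control and a treated group), fits a simplified two-dimensional MCD with random starts and concentration steps, and converts squared robust Mahalanobis distances into tail probabilities under a chi-square reference with 2 degrees of freedom. The simulated data, the function names, the median-based rescaling, and the chi-square mapping are all assumptions made for illustration; the abstract does not specify how the probability-based transformation is defined.

```python
import math
import random
import statistics


def mean_cov(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def det(cov):
    """Determinant of a symmetric 2x2 covariance (sxx, sxy, syy)."""
    sxx, sxy, syy = cov
    return sxx * syy - sxy * sxy


def sq_mahalanobis(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det(cov)


def mcd_probabilities(points, support_fraction=0.75, n_starts=30,
                      max_iter=50, seed=0):
    """Simplified 2-D MCD followed by a chi-square tail probability.

    Assumption: this is an illustrative stand-in, without the reweighting
    step used by production implementations.
    """
    rng = random.Random(seed)
    n = len(points)
    h = int(support_fraction * n)  # size of the "clean" subset
    best = None
    for _ in range(n_starts):
        subset = rng.sample(range(n), 3)  # minimal random start (p + 1 points)
        for _ in range(max_iter):
            mean, cov = mean_cov([points[i] for i in subset])
            if det(cov) <= 1e-12:  # degenerate start, give up on it
                break
            d2 = [sq_mahalanobis(p, mean, cov) for p in points]
            # Concentration step: keep the h points closest to the current fit.
            new_subset = sorted(range(n), key=d2.__getitem__)[:h]
            if set(new_subset) == set(subset):
                break
            subset = new_subset
        mean, cov = mean_cov([points[i] for i in subset])
        if det(cov) > 1e-12 and (best is None or det(cov) < det(best[1])):
            best = (mean, cov)  # lowest covariance determinant so far
    if best is None:
        raise ValueError("all random starts were degenerate")
    mean, cov = best
    d2 = [sq_mahalanobis(p, mean, cov) for p in points]
    # Assumed calibration: rescale so the median squared distance matches the
    # chi-square(2) median, 2*ln(2). The chi-square(2) tail is exp(-x / 2).
    scale = statistics.median(d2) / (2 * math.log(2))
    return [math.exp(-(d / scale) / 2) for d in d2]


# Simulated example: 200 proteins, the first five with a planted shift.
rng = random.Random(1)
planted = {0, 1, 2, 3, 4}
proteins = []
for i in range(200):
    base = rng.gauss(20.0, 2.0)
    control = base + rng.gauss(0.0, 0.2)
    treated = base + rng.gauss(0.0, 0.2) + (3.0 if i in planted else 0.0)
    proteins.append((control, treated))

p_values = mcd_probabilities(proteins)
top5 = set(sorted(range(len(proteins)), key=p_values.__getitem__)[:5])
print(sorted(top5))
```

Because every detector's score ends up on the same 0 to 1 scale, a single cutoff can in principle be applied across methods, which is one plausible reading of how a probability-based transformation enables cross-algorithm comparison. In practice a library implementation such as scikit-learn's `MinCovDet` would be preferable to this hand-rolled two-feature version.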
               