Advances in omics and computational capability have accelerated the discovery of underlying biological processes at an unprecedented pace. High-throughput methodologies, such as flow cytometry, can reveal deeper insights into cell processes, opening opportunities for faster discoveries related to health and disease. However, working with cytometry data often poses complex computational challenges because of the high dimensionality, large size, and non-linearity of the data structure. In addition, cytometry data frequently exhibit diverse patterns across biomarkers and suffer from substantial class imbalances, which can further complicate the problem. Existing methods of cytometry data analysis either predict cell populations or perform feature selection. In this study, we propose a "wisdom of the crowd" approach that simultaneously predicts rare cell populations and performs feature selection by integrating a pool of modern machine learning algorithms. Because our approach integrates the top-performing machine learning models across different normalization techniques based on entropy and rank, it can detect the diverse patterns that exist across the model features. Furthermore, the method identifies a dynamic biomarker structure that divides the features into persistently selected, unselected, and fluctuating assemblies, indicating the role of each biomarker in rare cell prediction, which can subsequently aid studies of disease progression. This article is protected by copyright. All rights reserved.
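The three-way split of biomarkers described above can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state how the assemblies are defined, so the rule used here (selected in every run, in no run, or in only some runs) is an assumption, and the function name `categorize_features`, the run labels, and the marker names are all hypothetical.

```python
from typing import Dict, Iterable, Set


def categorize_features(
    selections: Dict[str, Set[str]],
    all_features: Iterable[str],
) -> Dict[str, str]:
    """Label each feature by how consistently it was selected across runs.

    Assumed rule (not taken from the source): a feature is "persistent" if
    every run selected it, "unselected" if no run did, and "fluctuating"
    otherwise.
    """
    n_runs = len(selections)
    if n_runs == 0:
        raise ValueError("at least one run is required")

    labels: Dict[str, str] = {}
    for feature in all_features:
        # Count the runs whose selected set contains this feature.
        hits = sum(feature in chosen for chosen in selections.values())
        if hits == n_runs:
            labels[feature] = "persistent"
        elif hits == 0:
            labels[feature] = "unselected"
        else:
            labels[feature] = "fluctuating"
    return labels


# Hypothetical runs: each key is a (model, normalization) pair and each
# value is the set of markers that run selected. Names are illustrative.
runs = {
    "model_a/norm_1": {"CD3", "CD4"},
    "model_b/norm_1": {"CD3", "CD8"},
    "model_a/norm_2": {"CD3", "CD4"},
}
labels = categorize_features(runs, ["CD3", "CD4", "CD8", "CD19"])
```

In this toy input, a marker chosen by all three runs lands in the persistent group, one chosen by none lands in the unselected group, and the rest fluctuate. A softer rule, such as frequency thresholds, would be an equally plausible reading of the abstract.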
               