This paper presents an exhaustive empirical study to identify biomarkers using two approaches, frequency-based and network-based, over 17 different biclustering algorithms and six different cancer expression datasets. To systematically analyze the biclustering algorithms, we perform enrichment analysis, subtype identification, and biomarker identification. Biclustering algorithms such as C&C, SAMBA, and Plaid are useful for detecting biomarkers by both approaches for all datasets except prostate cancer. We detect a total of 103 gene biomarkers using the frequency-based method, of which 19 are for blood cancer, 36 for lung cancer, 25 for colon cancer, 13 for multi-tissue cancer, and 10 for prostate cancer. Using the network-based approach, we detect a total of 41 gene biomarkers, of which 15 are from the blood cancer dataset, 12 from lung cancer, 6 from colon cancer, 7 from multi-tissue cancer, and 1 from prostate cancer. We further extend our network analysis over some biclusters and detect some gene biomarkers not detected earlier by either the frequency-based or the network-based approach. We extend our work to breast cancer miRNA expression data to evaluate the performance of the biclustering algorithms. We detect 19 breast cancer biomarkers by the frequency-based method and 5 by the network-based method for the miRNA dataset.
               