In recent years, high-throughput sequencing technologies have enabled extensive insights into genomic annotations. In contrast, the full-scale three-dimensional arrangements of genomic regions remain relatively unknown. Thanks to recent breakthroughs in High-throughput Chromosome Conformation Capture (Hi-C) techniques, non-negative matrix factorization (NMF) has been adopted to identify local spatial clusters of genomic regions from Hi-C data. However, NMF entails a high-dimensional, non-convex objective function that must be optimized under non-negativity constraints. We propose and compare more than ten optimization algorithms to improve the identification of local spatial clusters via NMF. To optimize this high-dimensional, non-convex, constrained objective function, we draw inspiration from nature and perform in silico evolution. Each proposed algorithm maintains a population of candidate solutions that is evolved, with NMF acting as a local search during the evolution. The population-based optimization algorithm coordinates and guides the non-negative matrix factorization toward global optima. Experimental results show that the proposed algorithms improve the quality of non-negative matrix factorization over recent state-of-the-art methods. The effectiveness and robustness of the proposed algorithms are supported by comprehensive performance benchmarking on chromosome-wide Hi-C contact maps of yeast and human. In addition, time complexity analysis, convergence analysis, parameter analysis, biological case studies, and gene ontology similarity analysis are conducted to demonstrate the robustness of the proposed methods from different perspectives.
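The population-plus-local-search scheme described above can be illustrated with a minimal sketch. It is not any of the paper's algorithms: it assumes a squared Frobenius-norm objective, standard Lee-Seung multiplicative updates as the NMF local search, log-normal mutation, and elitist truncation selection. The function names (`nmf_local_search`, `evolve_nmf`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def nmf_local_search(V, W, H, n_iter=20, eps=1e-9):
    """Refine (W, H) with multiplicative updates for ||V - WH||_F^2.

    Assumption: Lee-Seung updates stand in for the NMF local search step.
    The updates only multiply by non-negative ratios, so W and H stay >= 0.
    """
    for _ in range(n_iter):
        H = H * (W.T @ V) / (W.T @ W @ H + eps)
        W = W * (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def loss(V, W, H):
    """Squared Frobenius reconstruction error (assumed objective)."""
    return float(np.linalg.norm(V - W @ H, "fro") ** 2)


def evolve_nmf(V, rank, pop_size=8, generations=10, sigma=0.1, seed=0):
    """Evolve a population of factorizations, refining each with NMF."""
    rng = np.random.default_rng(seed)
    n, m = V.shape

    # Initial population: random non-negative factors, locally refined.
    pop = []
    for _ in range(pop_size):
        W, H = nmf_local_search(V, rng.random((n, rank)), rng.random((rank, m)))
        pop.append((loss(V, W, H), W, H))

    history = [min(c[0] for c in pop)]
    for _ in range(generations):
        children = []
        for _, W, H in pop:
            # Mutation: multiplicative log-normal noise preserves non-negativity.
            Wc = W * np.exp(sigma * rng.standard_normal(W.shape))
            Hc = H * np.exp(sigma * rng.standard_normal(H.shape))
            # Local search: NMF pulls the mutant toward a nearby local optimum.
            Wc, Hc = nmf_local_search(V, Wc, Hc)
            children.append((loss(V, Wc, Hc), Wc, Hc))
        # Elitist selection: keep the best pop_size of parents and children.
        pop = sorted(pop + children, key=lambda c: c[0])[:pop_size]
        history.append(pop[0][0])

    best_loss, W, H = pop[0]
    return W, H, history
```

Calling `evolve_nmf(V, rank=k)` on a non-negative contact matrix `V` returns the best factor pair found and the best loss per generation. Because selection is elitist, the recorded best loss cannot increase from one generation to the next. In this sketch, a cluster label for each genomic region could be read off as the index of the largest entry in its row of `W`, which is one common convention rather than something stated in the abstract.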
               