MOTIVATION The availability of thousands of genome-wide ChIP-Seq datasets across hundreds of transcription factors (TFs) and cell lines provides an unprecedented opportunity to jointly analyze large-scale transcription factor binding in vivo, making it possible to discover potential interactions and cooperation among different TFs. Interacting and cooperating TFs can potentially form a transcriptional regulatory module (TRM), e.g. a set of co-binding TFs, and identifying such modules helps decipher combinatorial regulatory mechanisms.

RESULTS We develop tfLDA, a computational method that applies state-of-the-art topic models to multiple ChIP-Seq datasets to decipher the combinatorial binding events of multiple TFs. tfLDA learns high-order combinatorial binding patterns of TFs from multiple ChIP-Seq profiles, and interprets and visualizes these patterns. We apply tfLDA to two cell lines with rich collections of TFs and identify combinatorial binding patterns that show well-known TRMs and related TF co-binding events.

AVAILABILITY AND IMPLEMENTATION The R package tfLDA is freely available at https://github.com/lichen-lab/tfLDA.

SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
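
To make the topic-model idea described under RESULTS concrete, the following is a minimal sketch, not the tfLDA implementation. It assumes, purely for illustration, that each genomic region is treated as a "document" whose "words" are the TFs with a ChIP-Seq peak in that region, so that each learned topic plays the role of a candidate TRM. The abstract does not state this encoding. The function names, the placeholder TF names, the toy regions and the hyperparameters are all hypothetical, and a plain collapsed Gibbs sampler stands in for whatever inference tfLDA actually uses.

```python
import random


def fit_lda(docs, n_topics, n_words, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs: list of lists of word ids (assumed here: ids of TFs bound in a region).
    Returns (doc_topic, topic_word) count matrices from the final sweep.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []

    # Random initial topic assignment for every (region, TF) token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


def top_tfs_per_topic(topic_word, names, n_top=3):
    """Return the n_top TF names with the highest count in each topic."""
    return [
        [names[w] for w in sorted(range(len(names)), key=lambda w: -row[w])[:n_top]]
        for row in topic_word
    ]


# Hypothetical toy input: placeholder TF names and made-up binding regions.
tf_names = ["TF_A", "TF_B", "TF_C", "TF_D", "TF_E"]
regions = [
    [0, 1, 2], [0, 1], [1, 2], [0, 2], [0, 1, 2],  # regions bound by TF_A/B/C
    [3, 4], [3, 4], [3, 4], [3, 4],                # regions bound by TF_D/E
    [0, 1, 3],                                     # a mixed region
]

doc_topic, topic_word = fit_lda(regions, n_topics=2, n_words=len(tf_names))
top_tfs = top_tfs_per_topic(topic_word, tf_names)
for k, names in enumerate(top_tfs):
    print(f"topic {k}: {names}")
```

In this sketch the rows of `topic_word` summarize which TFs tend to be assigned to the same topic, and the rows of `doc_topic` indicate how strongly each region is associated with each topic. Real ChIP-Seq input would first need peak calling and a shared set of regions across datasets, which are outside the scope of this illustration.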
               