
TAMC: A deep-learning approach to predict motif-centric transcriptional factor binding activity based on ATAC-seq profile


Determining transcriptional factor binding sites (TFBSs) is critical for understanding the molecular mechanisms regulating gene expression in different biological conditions. Biological assays designed to directly map TFBSs require large sample sizes and intensive resources. As an alternative, the ATAC-seq assay is simple to conduct and provides genomic cleavage profiles that contain rich information for imputing TFBSs indirectly. Previous footprint-based tools are inherently limited by the accuracy of their bias correction algorithms and the efficiency of their feature extraction models. Here we introduce TAMC (Transcriptional factor binding prediction from ATAC-seq profile at Motif-predicted binding sites using Convolutional neural networks), a deep-learning approach for predicting motif-centric TF binding activity from paired-end ATAC-seq data. TAMC does not require bias correction during signal processing. By leveraging a one-dimensional convolutional neural network (1D-CNN) model, TAMC captures both footprint and non-footprint features at binding sites for each TF and outperforms existing footprinting tools in TFBS prediction, particularly for ATAC-seq data with limited sequencing depth.

AUTHOR SUMMARY

Applications of deep-learning models are rapidly gaining popularity in recent biological studies because of their efficiency in analyzing non-linear patterns from feature-rich data. In this study, we developed a 1D-CNN model to predict TFBSs from ATAC-seq data. Compared to previous models using scoring functions and classical machine learning algorithms, our 1D-CNN model forgoes the need for bias correction during signal processing and significantly increases the efficiency of feature extraction for TFBS prediction. In addition, the performance of our 1D-CNN model improves when the sequencing depth of the training ATAC-seq data increases. Importantly, we show that our method outperforms existing tools in TFBS prediction, particularly when the sequencing depth of the training ATAC-seq data is higher than that of the ATAC-seq data used for prediction. This widens the applicability of our model to ATAC-seq data with both deep and shallow sequencing depth. Based on these results, we discuss the potential application of our method to TFBS prediction using bulk and single-cell ATAC-seq data.
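
To make the idea of a 1D-CNN scoring a motif-centred cleavage profile more concrete, here is a minimal sketch of a forward pass in plain Python. It is not TAMC's architecture or code: the function names, the window length, the two hand-picked kernels, the output weights, and the depth normalization step are all assumptions made for illustration. In a real model the kernels and weights would be learned from labelled binding sites.

```python
import math

def conv1d(signal, kernel, bias=0.0):
    """Valid-mode 1D cross-correlation of a signal with a single kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]

def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]

def sigmoid(x):
    """Logistic function mapping a logit to a probability-like score."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_binding(profile, kernels, biases, out_weights, out_bias):
    """Score one motif-centred window of cleavage counts.

    Pipeline: scale counts -> conv + ReLU per kernel -> global max pool
    -> linear layer -> sigmoid.
    """
    # Assumption: scale by total counts so windows of different depth are
    # comparable. This is an illustrative choice, not a step from the paper.
    total = sum(profile) or 1.0
    scaled = [count / total for count in profile]
    # One pooled feature per kernel (global max pooling over positions).
    pooled = [max(relu(conv1d(scaled, k, b))) for k, b in zip(kernels, biases)]
    logit = sum(w * p for w, p in zip(out_weights, pooled)) + out_bias
    return sigmoid(logit)

# Hypothetical, hand-picked parameters (a trained model would learn these).
kernels = [
    [1.0, -1.0, -1.0, -1.0, 1.0],  # responds to high flanks around a low centre
    [0.5, 0.5, 0.5],               # responds to local cleavage density
]
biases = [0.0, 0.0]
out_weights = [4.0, 1.0]
out_bias = -0.5

# Toy cleavage counts around a motif centre (made-up numbers).
footprint_profile = [5, 6, 7, 1, 0, 1, 7, 6, 5]  # protected centre
flat_profile = [4, 4, 4, 4, 4, 4, 4, 4, 4]       # no protection

p_footprint = predict_binding(footprint_profile, kernels, biases, out_weights, out_bias)
p_flat = predict_binding(flat_profile, kernels, biases, out_weights, out_bias)
print(f"footprint window score: {p_footprint:.3f}")
print(f"flat window score:      {p_flat:.3f}")
```

With these toy parameters the window with a protected centre scores higher than the flat window. The second kernel is there only to show that a convolutional layer can carry features other than the classic footprint shape alongside it.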

Keywords: ATAC-seq data; transcriptional factor; ATAC-seq; deep learning; factor binding

Journal Title: PLoS Computational Biology
Year Published: 2022
