Current methods for genome-wide analysis of gene expression require fragmentation of original transcripts into small fragments for short-read sequencing. In bacteria, the resulting fragmented information hides operon complexity. Additionally, in vivo processing of transcripts confounds the accurate identification of the 5′ and 3′ ends of operons. Here we develop a methodology called SMRT-Cappable-seq that combines the isolation of unfragmented primary transcripts with single-molecule long-read sequencing. Applied to E. coli, this technology results in an accurate definition of the transcriptome, with 34% of known operons from RegulonDB being extended by at least one gene. Furthermore, 40% of transcription termination sites have read-through that alters the gene content of the operons. As a result, most bacterial genes are present in multiple operon variants, reminiscent of eukaryotic splicing. By providing such granularity in the operon structure, this study represents an important resource for the study of prokaryotic gene networks and regulation.

Fragmentation of transcripts can hide operon complexity in genome-wide transcriptome sequencing. Here the authors sequenced a prokaryotic transcriptome using PacBio and provide insights into operon structure and read-through at termination sites.
               