MOTIVATION: Gene-centric bioinformatics studies frequently require calculating or extracting gene features such as splice sites, promoters, independent introns, and untranslated regions (UTRs) by manipulating gene models. Gene models are often annotated in gene transfer format (GTF) files. These features are essential for downstream analyses such as intron retention detection, DNA-binding site identification, and computation of splice-site strength. Some features, such as independent introns and splice sites, are not provided by existing resources, including the commonly used BioMart database. A package that implements and integrates functions for analyzing these gene features would greatly ease routine analysis in related bioinformatics studies. However, to the best of our knowledge, no such package is available yet.

RESULTS: In this work, we introduce GTFtools, a stand-alone command-line tool that provides a set of functions to calculate various gene features, including splice sites, independent introns, transcription start site (TSS)-flanking regions, UTRs, isoform coordinates and lengths, and different types of gene lengths. It takes ENSEMBL or GENCODE GTF files as input and can be applied to both human and non-human gene models, such as those of the laboratory mouse. We compare the utilities of GTFtools with those of two related tools: Bedtools and BioMart. GTFtools is implemented in Python and does not depend on any third-party software, making it easy to install and use.

AVAILABILITY: GTFtools is freely available at www.genemine.org/gtftools.php, as well as on PyPI and Bioconda.
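To make the idea of deriving features from gene models concrete, the following is a minimal sketch, not the GTFtools implementation, of how intron coordinates might be derived from the exon records of a GTF file using only the Python standard library. The function names `parse_exons` and `introns_from_exons` and the toy records are hypothetical, introduced here for illustration only. The sketch assumes the standard nine-column, tab-separated GTF layout with 1-based inclusive coordinates, and it computes plain per-transcript introns; it does not attempt to identify independent introns.

```python
import re
from collections import defaultdict


def parse_exons(gtf_lines):
    """Group exon intervals (1-based, inclusive) by transcript_id."""
    exons = defaultdict(list)
    for line in gtf_lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 3 is the feature type; column 9 holds the attributes.
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            exons[match.group(1)].append((int(fields[3]), int(fields[4])))
    return exons


def introns_from_exons(exon_list):
    """Return the gaps between consecutive exons of one transcript."""
    ordered = sorted(exon_list)
    return [
        (left_end + 1, right_start - 1)
        for (_, left_end), (right_start, _) in zip(ordered, ordered[1:])
        if right_start - left_end > 1  # skip exons that abut or overlap
    ]


# Hypothetical toy records, not taken from any real annotation.
TOY_GTF = [
    'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttoy\texon\t300\t400\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttoy\texon\t500\t600\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
]

toy_exons = parse_exons(TOY_GTF)
toy_introns = introns_from_exons(toy_exons["T1"])
```

Because each intron is simply the gap between two neighbouring exons, the intron boundaries also give the splice-site positions of the transcript. A real annotation additionally requires handling of strand, chromosome, and multiple isoforms per gene, which this sketch omits.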
               