AGILE: an assembled genome mining pipeline

SUMMARY: Traditional genome annotation tools either fail or perform sub-optimally when trying to detect coding sequences in poor-quality genome assemblies and genome reports. As a result, potentially useful data are accessible only to those with specific skills and expertise in assembly and annotation. We present an Assembled-Genome mIning pipeLinE (AGILE), written in Perl, that combines bioinformatics tools with a number of steps designed to overcome the limitations that highly fragmented genome assemblies impose. Our methodology uses user-specified query genes from a closely related species to mine and annotate coding sequences that standard annotation packages would traditionally miss. Although the focus is on mammalian genomes, the generalized implementation means that the pipeline may be applied to any genome assembly, giving non-specialists a means to gather gene sequences for downstream analyses.

AVAILABILITY AND IMPLEMENTATION: Source code and associated files are available at https://github.com/batlabucd/GenomeMining and https://bitbucket.org/BatlabUCD/genomemining/src. Singularity and VirtualBox images are available at https://figshare.com/s/a0004bf93dc43484b0c0.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Keywords: mining pipeline; pipeline agile; assembled genome; genome mining

Journal Title: Bioinformatics
Year Published: 2019
