Motivation: The bulk of the space taken up by CRAM files of NGS data consists of per‐base quality values. Most of these are unnecessary for variant calling, offering an opportunity to save space. Results: On the Syndip test set, a 17‐fold reduction in the quality storage portion of a CRAM file can be achieved while maintaining variant calling accuracy. The size reduction of an entire CRAM file varied from 2.2‐ to 7.4‐fold, depending on the non‐quality content of the original file (see Supplementary Material S6 for details). Availability and implementation: Crumble is open source and can be obtained from https://github.com/jkbonfield/crumble. Supplementary information: Supplementary data are available at Bioinformatics online.
               