Summary Alternative splicing (AS) is a well-established mechanism for increasing transcriptome and proteome diversity, however, detecting AS events and distinguishing among AS types in organisms without available reference genomes remains… Click to show full abstract
Summary Alternative splicing (AS) is a well-established mechanism for increasing transcriptome and proteome diversity, however, detecting AS events and distinguishing among AS types in organisms without available reference genomes remains challenging. We developed a de novo approach called AStrap for AS analysis without using a reference genome. AStrap identifies AS events by extensive pair-wise alignments of transcript sequences and predicts AS types by a machine-learning model integrating more than 500 assembled features. We evaluated AStrap using collected AS events from reference genomes of rice and human as well as single-molecule real-time sequencing data from Amborella trichopoda. Results show that AStrap can identify much more AS events with comparable or higher accuracy than the competing method. AStrap also possesses a unique feature of predicting AS types, which achieves an overall accuracy of ∼0.87 for different species. Extensive evaluation of AStrap using different parameters, sample sizes and machine-learning models on different species also demonstrates the robustness and flexibility of AStrap. AStrap could be a valuable addition to the community for the study of AS in nonmodel organisms with limited genetic resources. Availability AStrap is available for download at https://github.com/BMILAB/AStrap. Supp. information Supplementary data are available at Bioinformatics online.
               
Click one of the above tabs to view related content.