Recent advances in next-generation sequencing (NGS) technology have produced large volumes of data in the form of short reads. Sequence assembly uses these short reads to build progressively longer contigs, and then uses scaffolds to produce the final sequence. Each of these steps requires evaluating the extent of homology between sequences. However, because the NGS platforms under development are diverse, and the data they produce differ in size and read length, numerous algorithms, each with its own methodology, are being developed to process these complex data. Biologists find it difficult to work with the many different features of these algorithms. Therefore, to reduce experimental trial and error, the choice of algorithm should be guided by its performance and by the purpose of the analysis, which in turn requires understanding the algorithms' methodologies and using their features effectively. This study reviews the short read alignment algorithms and NGS platforms developed to date, in order to aid efficient selection of algorithms for mapping DNA data to reference sequences.
               