The recent extensive application of next-generation sequencing has led to the rapid accumulation of multiple types of data for functional DNA elements. With the advent of precision medicine, fine-mapping of risk loci based on these elements has become critically important. In this study, we obtained the human reference genome (GRCh38) and the main DNA sequence elements, including protein-coding genes, miRNAs, lncRNAs and single-nucleotide polymorphism (SNP) flanking sequences, from different repositories. We then realigned these elements to the reference genome to identify their exact locations. Overall, the locations of 5%-20% of all sequence elements deviated among databases, by distances ranging from kilobase pairs to megabase pairs. These deviations even affected the selection of risk-associated genes in genome-wide association studies (GWAS). Our results imply that the location information for functional DNA elements may deviate among public databases. Researchers should take care when combining cross-database sources and should perform pilot sequence alignments before undertaking element location-based studies.
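As an illustration of the kind of cross-database check recommended above, the following minimal Python sketch compares the coordinates reported for the same elements in two sources and flags those that differ by at least a kilobase. It is not the realignment procedure used in the study: it only compares already-annotated coordinates, and the function names, the 1 kb threshold, the element identifiers and every coordinate are hypothetical placeholders chosen for the example.

```python
from typing import Dict, Tuple

# (chromosome, start, end); coordinates are assumed to use the same convention in both sources
Location = Tuple[str, int, int]


def location_deviation(a: Location, b: Location) -> float:
    """Largest start/end difference in bp; infinity if the chromosomes differ."""
    if a[0] != b[0]:
        return float("inf")
    return float(max(abs(a[1] - b[1]), abs(a[2] - b[2])))


def flag_deviations(
    db_a: Dict[str, Location],
    db_b: Dict[str, Location],
    threshold_bp: int = 1000,  # assumed cutoff: kilobase-scale deviations
) -> Dict[str, float]:
    """Return the elements present in both sources whose locations differ by >= threshold_bp."""
    flagged = {}
    for element_id in sorted(db_a.keys() & db_b.keys()):
        deviation = location_deviation(db_a[element_id], db_b[element_id])
        if deviation >= threshold_bp:
            flagged[element_id] = deviation
    return flagged


# Made-up identifiers and coordinates, for illustration only.
db_a = {
    "GENE_X": ("chr1", 10_000, 12_000),
    "GENE_Y": ("chr2", 500_000, 510_000),
    "GENE_Z": ("chr3", 100, 900),
}
db_b = {
    "GENE_X": ("chr1", 10_005, 12_005),        # small shift, below threshold
    "GENE_Y": ("chr2", 1_700_000, 1_710_000),  # megabase-scale shift
    "GENE_Z": ("chrX", 100, 900),              # different chromosome
}

flagged = flag_deviations(db_a, db_b)
shared = db_a.keys() & db_b.keys()
fraction_deviating = len(flagged) / len(shared)
print(flagged, fraction_deviating)
```

A coordinate comparison like this can only reveal that two sources disagree. Deciding which location is correct still requires aligning the element sequence to the reference genome, as the abstract advises.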
               