DNA-based ancestry inference has long been a research hotspot in forensic science. Differentiating substructure within the Han Chinese population, such as the north-to-south substructure, would benefit forensic practice. In the present study, we enrolled participants from northern and southern China; each participant was genotyped at ∼400 K single-nucleotide polymorphisms (SNPs), and CHB and CHS data from the 1000 Genomes Project were used to perform genome-wide association analyses. A new method combining genome-wide association study (GWAS) analysis with k-fold cross-validation for small sample sizes was also introduced. As a result, one SNP, rs17822931, emerged with a p-value of 7.51E-6. We also simulated a large dataset to verify whether k-fold cross-validation could reduce the false-negative rate of GWAS. The identified SNP, ABCC11 rs17822931, has been reported to show allele frequencies that vary along a geographical gradient in humans. We also found a large difference in the allele frequency distribution of rs17822931 among five different cohorts of the Chinese population. In conclusion, our study demonstrates that even a small-scale GWAS has the potential to identify effective loci when k-fold cross-validation is applied, and it sheds light on rs17822931 as a potential marker for differentiating the north-to-south substructure of the Han Chinese population.
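The abstract does not state how the GWAS and the k-fold cross-validation are combined, so the following Python sketch shows only one plausible reading, not the authors' procedure. It assumes a per-SNP allelic chi-square test (1 degree of freedom) run on each set of k-1 folds with a relaxed threshold, and it keeps SNPs that pass in at least `min_hits` of the k runs. The function names, the threshold `alpha`, the `min_hits` rule, and the simulated allele frequencies are all illustrative assumptions and do not come from the study.

```python
import math
import random


def allelic_p_value(genotypes, labels):
    """Allelic 2x2 chi-square test (1 df) for one SNP.

    genotypes: alternate-allele counts (0, 1 or 2) per sample.
    labels: 1 for one group (e.g. north), 0 for the other (e.g. south).
    """
    alt1 = sum(g for g, y in zip(genotypes, labels) if y == 1)
    alt0 = sum(g for g, y in zip(genotypes, labels) if y == 0)
    n1 = 2 * sum(1 for y in labels if y == 1)  # alleles in group 1
    n0 = 2 * sum(1 for y in labels if y == 0)  # alleles in group 0
    total = n1 + n0
    col_alt = alt1 + alt0
    col_ref = total - col_alt
    if min(n1, n0, col_alt, col_ref) == 0:
        return 1.0  # monomorphic SNP or empty group: no evidence
    chi2 = 0.0
    for row, row_total in (((alt1, n1 - alt1), n1), ((alt0, n0 - alt0), n0)):
        for obs, col_total in zip(row, (col_alt, col_ref)):
            expected = row_total * col_total / total
            chi2 += (obs - expected) ** 2 / expected
    # Upper tail of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def kfold_stable_snps(matrix, labels, k=5, alpha=1e-4, min_hits=None, seed=0):
    """Return SNP indices passing `alpha` in at least `min_hits` of k runs.

    Each run leaves one fold out and tests on the remaining k-1 folds.
    This replication rule is an assumption made for illustration.
    """
    min_hits = k if min_hits is None else min_hits
    n = len(labels)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    folds = [set(order[i::k]) for i in range(k)]
    n_snps = len(matrix[0])
    hits = [0] * n_snps
    for held_out in folds:
        keep = [i for i in range(n) if i not in held_out]
        sub_labels = [labels[i] for i in keep]
        for j in range(n_snps):
            column = [matrix[i][j] for i in keep]
            if allelic_p_value(column, sub_labels) < alpha:
                hits[j] += 1
    return {j for j, h in enumerate(hits) if h >= min_hits}


def simulate(n_per_group=100, n_snps=200, seed=1):
    """Toy data: SNP 0 differs between groups, all others are null.

    The allele frequencies (0.8 vs 0.3, and 0.5) are arbitrary choices.
    """
    rng = random.Random(seed)

    def draw(freq):
        # Sum of two independent allele draws gives a 0/1/2 genotype.
        return int(rng.random() < freq) + int(rng.random() < freq)

    rows, labels = [], []
    for label in (1, 0):
        for _ in range(n_per_group):
            row = [draw(0.8 if label == 1 else 0.3)]
            row += [draw(0.5) for _ in range(n_snps - 1)]
            rows.append(row)
            labels.append(label)
    return rows, labels


genotype_matrix, group_labels = simulate()
stable = kfold_stable_snps(genotype_matrix, group_labels)
print("SNP indices replicated across folds:", sorted(stable))
```

Lowering `min_hits` or raising `alpha` makes the filter more permissive, which is the trade-off between false negatives and false positives that a simulation like the one described in the abstract would explore.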
               