In the past two decades, Y chromosome data have been generated for human population genetic studies. These Y chromosome datasets were produced with various testing methods and markers and are therefore difficult to combine for a comprehensive analysis. In this study, we combine four human Y chromosomal datasets of the Han, Tibetan, Hui, and Li ethnic groups. The combined dataset contains the 27 microsatellites and 137 single nucleotide polymorphisms that these populations share. We assembled a single dataset containing 2439 individuals from 25 nationwide populations in China. A systematic analysis of genetic distance and clustering was performed. To assess gene flow between the studied populations and worldwide populations, we modeled the ancestry informative markers. The reference panel was regarded as a mixture of South Asian (SAS), East Asian (EAS), European (EUR), African (AFR), and American (AMR) populations from the 1000 Genomes Y chromosome data, fitted by nonlinear data-fitting. We then calculated the admixture proportions of the four studied populations with respect to 26 worldwide populations. The results showed that the Han and Hui have great genetic affinity, and that the Hui is the most admixed ethnic group, with 61.53% EAS, 34.65% SAS, 1.91% AFR, 1.56% AMR, and 0.04% EUR ancestry components (the AMR reference is itself highly admixed, so this component should be ignored). Each of the other three ethnic groups contained more than 97% EAS ancestry component. The Li is the least admixed population in this study. The combined dataset in this study is the largest of its kind reported to date and provides reference population data for use in future paternal genetic studies and forensic genealogical identification.
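The abstract does not specify the objective function or optimizer behind the nonlinear data-fitting, so the following is only a minimal sketch of one common way to estimate admixture proportions. It assumes that a studied population's haplogroup frequencies are approximated by a weighted mixture of reference-population frequencies, with weights that are non-negative and sum to 1, and it minimizes the squared error with an exponentiated-gradient update. The function name `estimate_admixture`, the three-population panel, and all frequency values are hypothetical illustrations, not data or methods from the study.

```python
import math

def estimate_admixture(reference, target, steps=20000, eta=0.5):
    """Fit mixture weights so that sum_j w_j * reference_j approximates target.

    reference: dict mapping population label -> list of haplogroup frequencies
    target:    list of haplogroup frequencies for the studied population
    Returns a dict of weights that are non-negative and sum to 1.
    Assumption: squared-error loss with exponentiated-gradient updates,
    which keep the weights on the probability simplex.
    """
    names = list(reference)
    k = len(names)
    n = len(target)
    w = [1.0 / k] * k  # start from equal proportions
    for _ in range(steps):
        # Mixture frequencies predicted by the current weights.
        pred = [sum(w[j] * reference[names[j]][i] for j in range(k))
                for i in range(n)]
        resid = [p - t for p, t in zip(pred, target)]
        # Gradient of the squared error with respect to each weight.
        grad = [2.0 * sum(r * f for r, f in zip(resid, reference[name]))
                for name in names]
        # Multiplicative update followed by renormalisation.
        w = [wj * math.exp(-eta * g) for wj, g in zip(w, grad)]
        total = sum(w)
        w = [wj / total for wj in w]
    return dict(zip(names, w))

# Hypothetical frequencies over four haplogroup classes (made-up numbers).
reference_profiles = {
    "EAS": [0.70, 0.20, 0.05, 0.05],
    "SAS": [0.10, 0.60, 0.20, 0.10],
    "EUR": [0.05, 0.10, 0.70, 0.15],
}
# Synthetic target built from known proportions, to check that they are recovered.
true_weights = {"EAS": 0.6, "SAS": 0.3, "EUR": 0.1}
target_profile = [
    sum(true_weights[name] * reference_profiles[name][i] for name in reference_profiles)
    for i in range(4)
]

estimated = estimate_admixture(reference_profiles, target_profile)
for name, weight in estimated.items():
    print(f"{name}: {100 * weight:.2f}%")
```

With real data, the target and reference vectors would come from observed haplogroup or haplotype frequencies, and a highly admixed reference such as AMR can absorb signal that belongs to other components, which is consistent with the abstract's advice to ignore that component.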
               