Key message
We propose a novel computational method for genomic selection, called HE-BLP, that combines identical-by-state (IBS)-based Haseman–Elston (HE) regression with best linear prediction (BLP).

Abstract
Genomic best linear unbiased prediction (GBLUP) has been widely used in whole-genome prediction for breeding programs. To determine the total genetic variance of a training population, a linear mixed model (LMM) has to be solved via restricted maximum likelihood (REML), whose computational complexity is cubic in the sample size. We propose a novel computational method, HE-BLP, that combines identical-by-state (IBS)-based Haseman–Elston (HE) regression with best linear prediction (BLP). With this method, the total genetic variance is estimated by solving a simple HE linear regression whose computational complexity is quadratic in the sample size, which makes it suitable for large-scale genomic data. It does not, however, allow environmental effects to be estimated simultaneously, so it is not suited to data sets that require this. In Monte Carlo simulation studies, the estimated heritability based on HE was identical to that based on REML, and the prediction accuracy of HE-BLP and traditional GBLUP was also quite similar when quantitative trait loci (QTLs) were randomly distributed along the genome and their effects followed a normal distribution. In addition, the kernel row number (KRN) trait in a maize IBM population was used to evaluate the performance of the two methods. The results showed similar prediction accuracy of breeding values despite slightly different heritability estimates from HE and REML, probably due to the underlying genetic architecture. HE-BLP can be a future choice of genomic selection method for even larger sets of genomic data in the special cases where environmental effects can be ignored.
The software for HE regression and the simulation program are available online in the Genetic Analysis Repository (GEAR; https://github.com/gc5k/GEAR/wiki).
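
The two steps described in the abstract, estimating the genetic variance by HE regression and then applying best linear prediction, can be illustrated with a minimal sketch. It is not the GEAR implementation. It assumes a standardized genomic relationship matrix as a stand-in for the IBS-based similarity, independent simulated SNPs with normally distributed effects, and a phenotype scaled to unit variance so that the regression slope can be read as heritability. All function names (`simulate`, `he_heritability`, `blp_predict`) and parameter values are hypothetical.

```python
import numpy as np

def simulate(n=800, m=400, h2=0.5, seed=1):
    """Simulate independent SNPs, a polygenic trait, and a relationship matrix."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(0.1, 0.9, size=m)                  # allele frequencies
    G = rng.binomial(2, p, size=(n, m)).astype(float)  # genotypes coded 0/1/2
    Z = (G - G.mean(axis=0)) / G.std(axis=0)           # standardize each SNP
    beta = rng.normal(0.0, np.sqrt(h2 / m), size=m)    # normally distributed effects
    g = Z @ beta                                       # true genetic values
    y = g + rng.normal(0.0, np.sqrt(1.0 - h2), size=n)
    K = Z @ Z.T / m                                    # assumed relationship matrix
    return y, K, g

def he_heritability(y, K):
    """HE regression: slope of y_i * y_j on K_ij over all pairs i < j.

    Assumes y is standardized, so the slope estimates heritability.
    The cost is quadratic in the number of individuals.
    """
    iu = np.triu_indices(len(y), k=1)
    x = K[iu]
    z = np.outer(y, y)[iu]
    xc = x - x.mean()
    return float(xc @ (z - z.mean()) / (xc @ xc))

def blp_predict(y_train, K, train, test, h2):
    """Best linear prediction of genetic values for the test individuals."""
    h2 = min(max(h2, 1e-6), 1.0 - 1e-6)                # keep V positive definite
    V = h2 * K[np.ix_(train, train)] + (1.0 - h2) * np.eye(len(train))
    return h2 * K[np.ix_(test, train)] @ np.linalg.solve(V, y_train)

y, K, g = simulate()
train, test = np.arange(600), np.arange(600, 800)
y_tr = (y[train] - y[train].mean()) / y[train].std()   # standardize training phenotype

h2_hat = he_heritability(y_tr, K[np.ix_(train, train)])
g_hat = blp_predict(y_tr, K, train, test, h2_hat)
accuracy = np.corrcoef(g_hat, g[test])[0, 1]           # correlation with true values
print(f"HE heritability estimate: {h2_hat:.3f}, prediction accuracy: {accuracy:.3f}")
```

The HE step touches each pair of individuals once, which is where the quadratic cost mentioned in the abstract comes from. The prediction step in this sketch still solves a dense linear system on the training set and is included only to show how the HE estimate feeds into BLP.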
               