Structural variants and presence/absence polymorphisms are common in plant genomes, yet they are routinely overlooked in genome-wide association studies (GWAS). Here, we expand the type of genetic variants detected in GWAS to include major deletions, insertions and rearrangements. We first use raw sequencing data directly to derive short sequences, k-mers, that mark a broad range of polymorphisms independently of a reference genome. We then link k-mers associated with phenotypes to specific genomic regions. Using this approach, we reanalyzed 2,000 traits in Arabidopsis thaliana, tomato and maize populations. Associations identified with k-mers recapitulate those found with SNPs, but with stronger statistical support. Importantly, we discovered new associations with structural variants and with regions missing from reference genomes. Our results demonstrate the power of performing GWAS before linking sequence reads to specific genomic regions, which allows the detection of a wider range of genetic variants responsible for phenotypic variation. Application of a new k-mer-based genome-wide association approach to 2,000 phenotypes in Arabidopsis thaliana, tomato and maize detects new associations with structural variants and with regions missing from reference genomes.
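The idea of deriving k-mers from raw reads and using their presence or absence as reference-free markers can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy reads, the sample labels and the choice of k = 5 are all assumptions made for illustration, and the exact-match filter at the end merely stands in for a statistical association test, which a real analysis would need.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmers_in_reads(reads, k):
    """Collect the canonical k-mers seen in a sample's reads (no reference needed)."""
    found = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):  # skip windows containing N or other symbols
                found.add(canonical(kmer))
    return found


def presence_absence(samples, k):
    """Map each k-mer to the set of samples in which it is present."""
    table = defaultdict(set)
    for name, reads in samples.items():
        for kmer in kmers_in_reads(reads, k):
            table[kmer].add(name)
    return table


def kmers_matching_phenotype(table, cases):
    """Toy stand-in for an association test: k-mers present in exactly the case samples."""
    cases = set(cases)
    return sorted(kmer for kmer, present in table.items() if present == cases)


# Hypothetical toy data: two samples carry the full sequence, two carry a
# 5-bp deletion ("GGCTT" removed), which creates new k-mers across the junction.
full_read = "ACGTTGCAAGGCTTACCGATGA"
deletion_read = "ACGTTGCAAACCGATGA"
samples = {
    "s1": [full_read],
    "s2": [full_read],
    "s3": [deletion_read],
    "s4": [deletion_read],
}

table = presence_absence(samples, k=5)
marker_kmers = kmers_matching_phenotype(table, cases=["s3", "s4"])
print(marker_kmers)
```

In this toy case the k-mers that span the deletion junction appear only in the samples carrying the deletion, so they tag the structural variant without any read having been aligned to a reference.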