Prioritizing tests of epistasis through hierarchical representation of genomic redundancies

Abstract: Epistasis is defined as a statistical interaction between two or more genomic loci in terms of their association with a phenotype of interest. Epistatic loci that are identified using data from Genome-Wide Association Studies (GWAS) provide insights into the interplay among multiple genetic factors, with applications including assessment of susceptibility to complex diseases, decision making in precision medicine, and gaining insights into disease mechanisms. Since the number of genomic loci assayed by GWAS is extremely large (usually on the order of millions), identification of epistatic loci is a statistically difficult and computationally intensive problem. Even when only pairwise interactions are considered, the size of the search space ranges from hundreds of millions to billions of locus pairs. The large number of statistical tests performed also makes sufficient type I error correction imperative. Consequently, efficient algorithms are required to filter the tests that are performed and to evaluate large GWAS data sets in a reasonable amount of computation time. It has been observed that many pairwise tests are redundant due to correlations in genotype values across samples, known as linkage disequilibrium. However, algorithms that have been developed for efficient identification of epistatic loci do not systematically exploit linkage disequilibrium. Here, we propose a new algorithm for fast epistasis detection based on hierarchical representation of linkage disequilibrium (LinDen). We utilize redundancies in genotype patterns between neighboring loci to generate a hierarchical structure and execute a branch-and-bound search to prioritize locus testing based on approximations of a test statistic for pairs of locus groups. The hierarchical organization of tests performed by LinDen allows for efficient scaling based on the screened loci. We test LinDen comprehensively on three data sets obtained from the Wellcome Trust Case Control Consortium: type 2 diabetes, psoriasis, and hypertension. Our results show that, compared to other state-of-the-art tools for fast epistasis detection, LinDen drastically reduces the number of tests performed while discovering statistically significant locus pairs. LinDen is implemented in C++ and is available as open source at http://compbio.case.edu/linden/.
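
To make the scale argument in the abstract concrete, the short Python sketch below counts the unordered locus pairs for a few locus counts and shows how a multiple-testing correction shrinks the per-test significance threshold. It is an illustration only and not part of LinDen: the locus counts, the significance level of 0.05, and the choice of a Bonferroni correction are all assumptions made for this example (the abstract does not say which type I error correction is used), and the function names are hypothetical.

```python
from math import comb


def pairwise_tests(num_loci: int) -> int:
    """Number of unordered locus pairs, i.e. num_loci choose 2."""
    return comb(num_loci, 2)


def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test threshold under a Bonferroni correction (assumed here)."""
    return alpha / num_tests


if __name__ == "__main__":
    # Illustrative locus counts; these are assumptions, not values from the paper.
    for num_loci in (20_000, 100_000, 1_000_000):
        tests = pairwise_tests(num_loci)
        threshold = bonferroni_threshold(0.05, tests)
        print(f"{num_loci:>9,} loci -> {tests:>15,} pairs, "
              f"per-test threshold {threshold:.2e}")
```

Because the pair count grows quadratically with the number of loci, every pair that can be skipped, for example because a linked neighbor has already been screened, saves both computation and correction burden.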

Keywords: loci; tests performed; hierarchical representation; epistatic loci; linkage disequilibrium

Journal Title: Nucleic Acids Research
Year Published: 2017
