The analysis of target enrichment data in phylogenetics lacks optimization toward using paralogues for phylogenetic reconstruction. We developed a novel approach of detecting paralogues and utilizing them for phylogenetic tree inference, by retrieving both ortho‐ and paralogous copies and creating orthologous alignments, from which the gene trees are built. We implemented this approach in ParalogWizard and demonstrate its performance in plant groups that underwent a whole genome duplication relatively recently: the subtribe Malinae (family Rosaceae), using Angiosperms353 as well as Malinae481 probes, the genus Oritrophium (family Asteraceae), using Compositae1061 probes, and the genus Amomum (family Zingiberaceae), using Zingiberaceae1180 probes. Discriminating between orthologues and paralogues reduced gene tree discordance and increased the species tree support in the case of the Malinae, but not for Oritrophium and Amomum. This may relate to the difference in the proportion of paralogous loci between the data sets, which was highest for the Malinae. Overall, retrieving paralogues for phylogenetic reconstruction following ParalogWizard has the potential to increase the species tree support and reduce gene tree discordance in target enrichment data, particularly if the proportion of paralogous loci is high.
               