Significance: Rooted binary trees inferred from molecular sequence data provide information about the evolutionary history of populations and species. We introduce metrics on ranked tree shapes and ranked genealogies, in which the shape and temporal branching order in a tree are considered, but not the taxon labels. Our metrics enable quantification of evolutionary differences, assessment of tree uncertainty, and construction of statistical summaries of a tree distribution. They are computationally efficient and particularly useful for comparing phylodynamics of infectious diseases involving heterochronous samples and for comparative analyses of organisms that live in different geographic regions.

Genealogical tree modeling is essential for estimating evolutionary parameters in population genetics and phylogenetics. Recent mathematical results concerning ranked genealogies without leaf labels unlock opportunities in the analysis of evolutionary trees. In particular, comparisons between ranked genealogies facilitate the study of evolutionary processes of different organisms sampled at multiple time periods. We propose metrics on ranked tree shapes and ranked genealogies for isochronously and heterochronously sampled lineages. These metrics make it possible to conduct statistical analyses of ranked tree shapes and of timed ranked tree shapes, or ranked genealogies. Such analyses allow us to assess differences between tree distributions, quantify estimation uncertainty, and summarize tree distributions. We show the utility of our metrics via simulations and an application to infectious diseases.
               