The minimizing-deep-coalescence (MDC) approach infers a median (species) tree for a given set of gene trees under the deep coalescence cost. This cost accounts for the minimum number of deep… Click to show full abstract
The minimizing-deep-coalescence (MDC) approach infers a median (species) tree for a given set of gene trees under the deep coalescence cost. This cost accounts for the minimum number of deep coalescences needed to reconcile a gene tree with a species tree where the leaf-genes are mapped to the leaf-species through a function called leaf labeling. In order to better understand the MDC approach we investigate here the diameter of a gene tree, which is an important property of the deep coalescence cost. This diameter is the maximal deep coalescence costs for a given gene tree under all leaf labelings for each possible species tree topology. While we prove that this diameter is generally infinite, this result relies on the diameter’s unrealistic assumption that species trees can be of infinite size. Providing a more practical definition, we introduce a natural extension of the gene tree diameter that constrains the species tree size by a given constant. For this new diameter, we describe an exact formula, present a complete classification of the trees yielding this diameter, derive formulas for its mean and variance, and demonstrate its ability using comparative studies.
               
Click one of the above tabs to view related content.