In bioinformatics applications, it is currently customary to permute the outcome variable to draw inference on covariates when testing novel methods or statistics whose distributions are poorly known. The seminal publication of Altmann et al. (2010) in Bioinformatics uses this permutation scheme to obtain p-values that can be treated as a corrected measure of feature importance, rectifying the bias of the Gini variable importance in Random Forests. Since then, this method has also been used in applied work to draw statistical conclusions on variable importance measures (VIMs) from the resulting p-values. In this paper, we show that permuting the outcome may produce unexpected results, including p-values with undesirable properties, and we illustrate how more refined permutation schemes can be appropriate to obtain desirable results, including high power in discovering relevant variables.

Supplementary Information: Supplementary data are available at Bioinformatics online.
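
The outcome-permutation scheme described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of Altmann et al. (2010) or of this paper: the absolute Pearson correlation stands in for the Random Forest Gini importance so that the example needs only the standard library, the function names `abs_correlation` and `outcome_permutation_pvalues` are hypothetical, and the `(1 + count) / (1 + n_perm)` p-value is a common convention assumed here rather than taken from the source.

```python
import random
import statistics


def abs_correlation(x, y):
    """Stand-in importance score (assumption): |Pearson correlation| of x with y."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def outcome_permutation_pvalues(columns, y, n_perm=200, seed=0):
    """Permute the outcome n_perm times and compare each covariate's
    observed importance with its importances under the permuted outcome."""
    rng = random.Random(seed)
    observed = [abs_correlation(col, y) for col in columns]
    exceed = [0] * len(columns)
    y_perm = list(y)
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # breaks the link between outcome and all covariates
        for j, col in enumerate(columns):
            if abs_correlation(col, y_perm) >= observed[j]:
                exceed[j] += 1
    # Add-one correction keeps every p-value strictly positive.
    return [(1 + e) / (1 + n_perm) for e in exceed]


# Toy data: one covariate drives the outcome, the other is pure noise.
data_rng = random.Random(1)
n = 100
signal = [data_rng.gauss(0, 1) for _ in range(n)]
noise = [data_rng.gauss(0, 1) for _ in range(n)]
y = [2.0 * s + data_rng.gauss(0, 1) for s in signal]

p_signal, p_noise = outcome_permutation_pvalues([signal, noise], y)
print(p_signal, p_noise)
```

Because a single shuffle of the outcome is shared by all covariates, every null importance is computed with the outcome detached from the whole covariate set at once. That is the property of the scheme whose consequences the paper examines.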
               