Motivation: Random forests (RF) are fast, flexible and have become a standard tool in bioinformatics, particularly because they provide variable importance measures (VIM), which can be used to identify relevant features or perform variable selection. A recent study uses the original RF implementation to propose a new VIM for identifying relevant Gene Ontology terms and compares it to another recently proposed VIM, the intervention in prediction measure (IPM), as a strong baseline. However, there is still little knowledge on how the IPM performs, especially in high-dimensional scenarios. Simulations show that the IPM can be as biased as the Gini impurity VIM and can produce spurious results. The bias can be reduced if RFs are built using more recent, unbiased splitting criteria. Supplementary information: Supplementary data are available at Bioinformatics online.
               