Bias in the intervention in prediction measure in random forests: illustrations and recommendations

Motivation: Random forests (RF) are fast and flexible and have become a standard tool in bioinformatics, particularly because they provide variable importance measures (VIM), which can be used to identify relevant features or perform variable selection. A recent study uses the original RF implementation to propose a new VIM for identifying relevant Gene Ontology terms and compares it with another recently proposed VIM, the intervention in prediction measure (IPM), as a strong baseline. However, there is still little knowledge of how the IPM performs, especially in high-dimensional scenarios. Simulations show that the IPM can be as biased as the Gini impurity VIM and can produce spurious results. The bias can be reduced if RFs are built using more recent, unbiased splitting criteria.

Supplementary information: Supplementary data are available at Bioinformatics online.
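The abstract does not say why the Gini impurity VIM is biased. One well-known contributor is that predictors with more candidate cut points get more chances to reduce impurity by luck. The sketch below illustrates that effect only. It is not the IPM and does not reproduce the paper's simulations. All function names and the simulation settings (sample size, number of repetitions, seed) are assumptions chosen for illustration. The response and both predictors are pure noise, so any impurity decrease is spurious.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_gini_gain(x, y):
    """Largest Gini impurity decrease over all threshold splits x <= c."""
    n = len(y)
    parent = gini(y)
    best = 0.0
    # Every distinct value except the largest is a candidate cut point.
    for c in sorted(set(x))[:-1]:
        left = [yi for xi, yi in zip(x, y) if xi <= c]
        right = [yi for xi, yi in zip(x, y) if xi > c]
        children = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - children)
    return best


def null_simulation(n_samples=100, n_reps=200, seed=0):
    """Compare a binary and a continuous predictor, both unrelated to y.

    Returns (mean gain of binary predictor, mean gain of continuous
    predictor, fraction of repetitions where the continuous one wins).
    """
    rng = random.Random(seed)
    sum_bin = sum_cont = 0.0
    wins = 0
    for _ in range(n_reps):
        y = [rng.randint(0, 1) for _ in range(n_samples)]
        x_bin = [rng.randint(0, 1) for _ in range(n_samples)]   # 1 cut point
        x_cont = [rng.random() for _ in range(n_samples)]       # ~n-1 cut points
        g_bin = best_gini_gain(x_bin, y)
        g_cont = best_gini_gain(x_cont, y)
        sum_bin += g_bin
        sum_cont += g_cont
        wins += g_cont > g_bin
    return sum_bin / n_reps, sum_cont / n_reps, wins / n_reps


if __name__ == "__main__":
    mean_bin, mean_cont, win_rate = null_simulation()
    print(f"mean best gain, binary predictor:     {mean_bin:.4f}")
    print(f"mean best gain, continuous predictor: {mean_cont:.4f}")
    print(f"continuous predictor preferred in {win_rate:.0%} of runs")
```

Although neither predictor carries any signal, the continuous one should show the larger average impurity decrease, so a tree would tend to select it. Impurity-based importance then tends to inherit this preference, which is the kind of bias the abstract says can be reduced with unbiased splitting criteria.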

Keywords: random forests; prediction measure; intervention prediction

Journal Title: Bioinformatics
Year Published: 2019
