Trustworthiness, the Key to Grid-Based Map-Driven Predictive Model Enhancement and Applicability Domain Control

In chemography, grid-based maps sample molecular descriptor space by injecting a set of nodes and then linking them to a regular 2D grid representing the map. They include self-organizing maps (SOMs) and generative topographic maps (GTMs). Grid-based maps are predictive because any compound projected onto them can "inherit" the properties of its residence node(s); node properties are themselves "inherited" from the training set compounds neighboring each node. This Article proposes a formalism to define the trustworthiness of these nodes as "providers" of structure-activity information captured from training compounds. An empirical four-parameter node trustworthiness (NT) function of density (sparsely populated nodes are less trustworthy) and coherence (nodes whose training set residents have divergent properties are less trustworthy) is proposed. Based upon it, a trustworthiness score T is used to delimit the applicability domain (AD) by means of a trustworthiness threshold TT. For each parameter setup, the success of the ensuing inside-AD predictions is monitored. Setup-specific success levels (averaged over large pools of prediction challenges) are seen to be highly covariant, irrespective of the targets of the prediction challenges, of the type of problem (classification or regression), of the specific parametrization, and even of the nature (GTM or SOM) of the underlying maps. Thus, success levels determined on the basis of regression problems (445 target-specific affinity QSAR sets) on GTMs and levels returned by completely unrelated classification problems (319 target-specific active-/inactive-labeled sets) on SOMs were seen to correlate to a degree of 70%. Therefore, a common, general-purpose setup of the parametric AD definition proposed herein was shown to apply generally to grid-based map-driven property prediction problems.
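The abstract does not give the functional form of the NT function, so the following is only a minimal sketch of the idea, not the authors' method. It assumes a hypothetical four-parameter form: a saturating density term multiplied by a coherence term that decays with the spread of resident property values. The names `node_trust` and `predict_with_ad`, the parameters `d_half`, `d_power`, `s_half`, `s_power`, and their default values are all illustrative assumptions. A projected compound is described by its node responsibilities; for an SOM this would be a single node with weight 1.0, for a GTM a distribution over nodes.

```python
from statistics import mean, pstdev


def node_trust(values, d_half=2.0, d_power=1.0, s_half=1.0, s_power=2.0):
    """Hypothetical node trustworthiness in [0, 1] (assumed form, not from the paper).

    values: property values of the training compounds residing in the node.
    """
    n = len(values)
    if n == 0:
        return 0.0  # an empty node provides no structure-activity information
    # Density term: grows with the number of residents, reaches 0.5 at n == d_half.
    density_term = n ** d_power / (n ** d_power + d_half ** d_power)
    # Coherence term: 1 when all residents agree, decays as their values diverge.
    spread = pstdev(values)
    coherence_term = 1.0 / (1.0 + (spread / s_half) ** s_power)
    return density_term * coherence_term


def predict_with_ad(responsibilities, node_values, threshold):
    """Return (prediction, trust score T, inside-AD flag) for one projected compound.

    responsibilities: dict mapping node id to the compound's weight on that node.
    node_values: dict mapping node id to the list of resident property values.
    threshold: the trustworthiness threshold TT delimiting the AD.
    """
    # T is the responsibility-weighted trust of the residence nodes.
    score = sum(r * node_trust(node_values.get(k, []))
                for k, r in responsibilities.items())
    # Only populated nodes can pass a property on to the compound.
    populated = {k: r for k, r in responsibilities.items() if node_values.get(k)}
    weight = sum(populated.values())
    if weight > 0:
        prediction = sum(r * mean(node_values[k]) for k, r in populated.items()) / weight
    else:
        prediction = None
    return prediction, score, score >= threshold


# Toy map: a dense coherent node, a sparse incoherent node, and an empty node.
toy_nodes = {"a": [6.0, 6.0, 6.0, 6.0], "b": [4.0, 9.0], "c": []}
for compound in ({"a": 1.0}, {"b": 1.0}, {"a": 0.5, "c": 0.5}):
    print(compound, predict_with_ad(compound, toy_nodes, threshold=0.5))
```

In this toy setup, raising the threshold shrinks the AD to compounds residing in dense, coherent nodes, which mirrors the trade-off between coverage and prediction success that the abstract describes monitoring for each parameter setup.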

Keywords: applicability domain; trustworthiness; based map; map driven; grid based

Journal Title: Journal of Chemical Information and Modeling
Year Published: 2020
