Computational models for predicting the activity of small molecules against targets are now routinely developed and used in academia and industry, in part because of public bioactivity databases. While the trend is toward models built on ever larger datasets, recent studies such as chemogenomic active learning have shown that in many cases only a fraction of the data is needed for effective models. In this article, the chemogenomic active learning method is discussed and applied in a new analysis of public databases containing nuclear hormone receptor and cytochrome P450 enzyme family bioactivity. Complementing existing results on kinases and G‐protein coupled receptors, the results here demonstrate the active learning methodology's effectiveness at extracting informative ligand–target pairs in sparse data scenarios. Experiments assessing the domain of applicability demonstrate the influence of ligand profiles of similar targets within the family.
               