"Can We Quickly Learn to "Translate" Bioactive Molecules with Transformer Models?"

Meaningful exploration of the chemical space of druglike molecules in drug design is highly challenging because of the combinatorial explosion of possible molecular modifications. In this work, we address this problem with transformer models, a type of machine learning (ML) model originally developed for machine translation. By training transformer models on pairs of similar bioactive molecules from the public ChEMBL data set, we enable them to learn context-dependent transformations of molecules that are meaningful in medicinal chemistry, including transformations absent from the training set. In a retrospective analysis of the performance of transformer models on ChEMBL subsets of ligands binding to COX2, DRD2, or HERG protein targets, we demonstrate that the models can generate structures identical or highly similar to most active ligands, even though the models had not seen any ligands active against the corresponding protein target during training. Our work demonstrates that human experts working on hit expansion in drug design can easily and quickly employ transformer models, originally developed to translate texts from one natural language to another, to "translate" from known molecules active against a given protein target to novel molecules active against the same target.
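As an illustration of the "pairs of similar molecules" idea, the sketch below builds source-to-target training pairs from a list of SMILES strings. It is a toy under stated assumptions, not the authors' pipeline: the abstract does not say how similarity is measured or which threshold is used, so the character-bigram "fingerprint", the Tanimoto cutoff of 0.5, and the function names `char_ngrams`, `tanimoto`, and `make_pairs` are all hypothetical. A real workflow would more plausibly use fingerprints from a cheminformatics toolkit.

```python
from itertools import permutations


def char_ngrams(smiles, n=2):
    """Toy stand-in for a molecular fingerprint (assumption): the set of
    character n-grams of the SMILES string."""
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def make_pairs(smiles_list, threshold=0.5):
    """Return ordered (source, target) pairs whose similarity meets the
    threshold. Both directions are kept, so each similar pair yields two
    'translation' examples. The threshold value is an assumption."""
    fps = {s: char_ngrams(s) for s in smiles_list}
    return [
        (src, tgt)
        for src, tgt in permutations(smiles_list, 2)
        if tanimoto(fps[src], fps[tgt]) >= threshold
    ]


# Ethanol and propanol share all bigrams; benzene shares none with either.
pairs = make_pairs(["CCO", "CCCO", "c1ccccc1"], threshold=0.5)
```

Each resulting pair could then serve as one source/target example for a sequence-to-sequence model, in the same way a sentence pair serves a machine translation model.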

Keywords: learn translate; translate bioactive; transformer models; molecules transformer; bioactive molecules; quickly learn

Journal Title: Journal of chemical information and modeling
Year Published: 2023
