There is significant interest in developing machine learning methods to model protein-ligand interactions, but experimentally resolved protein-ligand structures to learn from are scarce. Protein self-contacts are a much larger source of structural data that could be leveraged, but it is not yet well understood how this data source differs from the target domain. Here, we characterize the 3D geometric patterns of protein self-contacts as probability distributions. We then present a flexible statistical framework to assess the transferability of these patterns to protein-ligand contacts. We observe that the level of transferability from protein self-contacts to protein-ligand contacts depends on contact type, with many contact types exhibiting high transferability. We then demonstrate the potential of using these geometric patterns to aid ligand pose selection in protein-ligand docking. We publicly release our extracted data on geometric interaction patterns to enable further exploration of this problem.