Significance
Homolog detection, the search for proteins similar to an unknown protein, is usually the first step in understanding that protein's role and function. However, when the sequence identity between query and target proteins is low (< 30%), traditional tools struggle to distinguish a correct match from a random one and fail to identify important similarities. We have used protein representations from deep learning language models to solve this problem. Reducing the size of these representations significantly improved homolog detection. Our tool can find putative homologs for more than 93% of the human proteins that had no assigned function as of March 2022.
               