MOTIVATION The T-cell receptor (TCR) is responsible for recognizing epitopes presented on cell surfaces. Linking TCR sequences to their ability to target specific epitopes is currently an unsolved problem, yet one of great interest. Indeed, it is currently unknown how dissimilar TCR sequences can be before they no longer bind the same epitope. This question is confounded by the fact that there are many ways to define the similarity between two TCR sequences. Here we investigate both issues in the context of unsupervised TCR sequence clustering.

RESULTS We provide an overview of the performance of various distance metrics on two large independent datasets with 412 and 2835 TCR sequences, respectively. Our results confirm the presence of structurally distinct TCR groups that target identical epitopes. In addition, we put forward several recommendations for performing unsupervised TCR sequence clustering.

AVAILABILITY AND IMPLEMENTATION Source code, implemented in Python 3, is available at https://github.com/pmeysman/TCRclusteringPaper.

SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.