The inability to accurately and efficiently label large, open-access medical imaging datasets limits the widespread implementation of artificial intelligence models in healthcare. There have been few attempts, however, to automate the annotation of such public databases; one approach, for example, focused on labor-intensive, manual labeling of subsets of these datasets to be used to train new models. In this study, we describe a method for standardized, automated labeling based on similarity to a previously validated, explainable AI (xAI) model-derived atlas, for which the user can specify a quantitative threshold for a desired level of accuracy (the probability-of-similarity, or pSim, metric). We show that our xAI model, by calculating the pSim values for each clinical output label based on comparison to its training-set-derived reference atlas, can automatically label the external datasets to a user-selected, high level of accuracy, equaling or exceeding that of human experts. We additionally show that, by fine-tuning the original model using the automatically labeled exams for retraining, performance can be preserved or improved, resulting in a highly accurate, more generalized model. Here, the authors develop a method for accurate auto-labeling of chest X-ray (CXR) images from large public datasets based on quantitative probability-of-similarity to an explainable AI model. The labels can be used to fine-tune the original model through iterative re-training.
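
The user-selected threshold described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the pSim values for an exam have already been computed (the abstract does not say how they are derived from the atlas), and the function name `auto_label`, the label names, and the numeric values are all hypothetical. It also assumes that an exam falling below the threshold for a given label is simply left unlabeled rather than marked negative.

```python
from typing import Dict, Optional


def auto_label(psim_by_label: Dict[str, float],
               threshold: float) -> Dict[str, Optional[bool]]:
    """Hypothetical helper: keep a label only when its pSim meets the threshold.

    psim_by_label maps each clinical output label to a precomputed pSim
    value in [0, 1]. Labels below the threshold map to None (unlabeled).
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    return {
        label: (True if psim >= threshold else None)
        for label, psim in psim_by_label.items()
    }


# Illustrative values only; these are not taken from the study.
exam_psim = {"cardiomegaly": 0.97, "pleural_effusion": 0.42}
labels = auto_label(exam_psim, threshold=0.90)
```

Under these assumptions, raising `threshold` trades coverage (fewer exams receive a label) for a higher expected labeling accuracy, which is the trade-off the abstract attributes to the pSim metric.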
               