Adequate clinical evaluation of artificial intelligence (AI) algorithms before adoption in practice is critical. Clinical evaluation aims to confirm acceptable AI performance through adequate external testing and to confirm the benefits of AI-assisted care compared with conventional care through appropriately designed and conducted studies, for which prospective studies are desirable. This article explains some of the fundamental methodological points that should be considered when designing and appraising the clinical evaluation of AI algorithms for medical diagnosis. The specific topics addressed include the following: (a) the importance of external testing of AI algorithms and strategies for conducting external testing effectively, (b) the various metrics and graphical methods for evaluating AI performance, as well as essential methodological points to note in using and interpreting them, (c) paired study designs, primarily for comparative performance evaluation of conventional and AI-assisted diagnoses, (d) parallel study designs, primarily for evaluating the effect of AI intervention, with an emphasis on randomized clinical trials, and (e) up-to-date guidelines for reporting clinical studies on AI, with an emphasis on guidelines registered in the EQUATOR Network library. Sound methodological knowledge of these topics will aid the design, execution, reporting, and appraisal of clinical evaluation of AI.
               