INTRODUCTION RadExam is a question item and exam database jointly developed by the Association of Program Directors in Radiology and the American College of Radiology to provide formative resident assessment, offering performance metrics benchmarked against institutional and national resident performance. Beyond resident performance, data are also available on question and exam performance. Despite considerable investment in the education and training of its question writers and editors and meticulous attention to current psychometrically validated methods, it was anticipated that a minority of exam questions would still perform poorly. Audits were performed to identify these questions, determine the reasons for their poor performance, and modify or replace them. Exam performance was also assessed.

METHODS Two audits were performed: the first after the February-May 2018 RadExam pilot phase, and the second nearly 1 year after the full implementation of RadExam. In each audit, RadExam subspecialty editors evaluated all questions and exams using statistical data: the number of administrations of each question and exam, question P-value, question Discrimination Index (DI), question Bloom's taxonomy learning level, exam P-value, and the number of image-based questions in each exam. Identified questions were modified or removed and replaced.

RESULTS Audit 1 was performed after the administration of 3,114 exams comprising 2,520 questions across 100 residency programs. Audit 1 identified 617 questions with DI <0.1 and 565 questions with unacceptable P-values, all of which were modified or replaced. Audit 2 was performed after the administration of 16,416 exams comprising 2,507 questions. Audit 2 identified 229 questions with DI <0.1 and 290 questions with unacceptable P-values, representing a 49.1% decrease in total flagged questions compared to Audit 1. Statistically significant decreases were seen in questions with both DI and P-values outside the desired range across nearly all subspecialties.

CONCLUSION The positive impact of our audit system on question and exam performance was reflected in a significant decrease in the number of flagged questions and improved overall exam performance in Audit 2, illustrating the positive impact of Audit 1.
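
The two item statistics used to flag questions in the audits can be illustrated with a minimal sketch. The abstract does not state how RadExam computes them, so this assumes the classical definitions: the P-value as the proportion of examinees answering a question correctly, and the DI as the difference in that proportion between the top and bottom 27% of examinees ranked by total exam score. The DI cutoff of 0.1 comes from the abstract; the acceptable P-value range (`p_min`, `p_max`) and all function names are hypothetical placeholders, since the abstract does not define what counts as an unacceptable P-value.

```python
from typing import List, Sequence


def item_p_value(item: Sequence[int]) -> float:
    """Proportion of examinees who answered the item correctly (1 = correct, 0 = incorrect)."""
    return sum(item) / len(item)


def discrimination_index(
    item: Sequence[int], totals: Sequence[float], frac: float = 0.27
) -> float:
    """Upper-lower DI: proportion correct in the top group minus the bottom group.

    Groups are the top and bottom `frac` of examinees by total exam score.
    The 27% split is a common convention, assumed here rather than taken from the source.
    """
    n = len(item)
    k = max(1, round(n * frac))
    order = sorted(range(n), key=lambda i: totals[i])  # ascending by total score
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item[i] for i in upper) / k
    p_lower = sum(item[i] for i in lower) / k
    return p_upper - p_lower


def flag_item(
    p: float, di: float, p_min: float = 0.3, p_max: float = 0.9, di_min: float = 0.1
) -> List[str]:
    """Return the reasons an item would be flagged.

    di_min = 0.1 matches the abstract; p_min and p_max are hypothetical.
    """
    reasons = []
    if not (p_min <= p <= p_max):
        reasons.append("p_value")
    if di < di_min:
        reasons.append("di")
    return reasons


# Toy data: one item answered by 10 examinees, with their total exam scores.
item = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
totals = [9, 8, 7, 2, 1, 3, 6, 4, 10, 5]
p = item_p_value(item)
di = discrimination_index(item, totals)
flags = flag_item(p, di)
```

In this toy data the strongest examinees all answer correctly and the weakest all answer incorrectly, so the item discriminates well and is not flagged. An item that nearly everyone answers correctly regardless of ability would be flagged on both criteria under these assumed thresholds.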
               