Objective This study aimed to develop effective artificial intelligence (AI) diagnostic models based on CT images of pulmonary nodules alone, on descriptive and quantitative clinical or image features, or on a combination of both to differentiate benign from malignant ground-glass nodules (GGNs) and assist in the determination of surgical intervention.

Methods Our study included a total of 867 nodules (benign nodules: 112; malignant nodules: 755) with postoperative pathological diagnoses from two centers. To discriminate between benign and malignant GGNs, we adopted three AI approaches: a) an image-based deep learning approach that built a deep neural network (DNN); b) a machine learning approach based on the clinical and image features of the nodules; c) a fusion diagnostic model integrating the original images with the clinical and image features. Model performance was evaluated on an internal test dataset (the "Changzheng Dataset") and an independent test dataset collected from an external institute (the "Longyan Dataset"). In addition, the performance of the automatic diagnostic models was compared with manual evaluations by two radiologists on the Longyan Dataset.

Results The image-based deep learning model achieved an appealing diagnostic performance, yielding AUC values of 0.75 (95% confidence interval [CI]: 0.62, 0.89) on the Changzheng Dataset and 0.76 (95% CI: 0.61, 0.90) on the Longyan Dataset. The clinical feature-based machine learning model performed well on the Changzheng Dataset (AUC, 0.80 [95% CI: 0.64, 0.96]) but poorly on the Longyan Dataset (AUC, 0.62 [95% CI: 0.42, 0.83]). The fusion diagnostic model achieved the best performance on both the Changzheng Dataset (AUC, 0.82 [95% CI: 0.71, 0.93]) and the Longyan Dataset (AUC, 0.83 [95% CI: 0.70, 0.96]), and it achieved better specificity (0.69) than the radiologists (0.33-0.44) on the Longyan Dataset.

Conclusion The deep learning models, including both the image-based deep learning model and the fusion model, can assist radiologists in differentiating between benign and malignant nodules for the precise management of patients with GGNs.
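The abstract reports each model's AUC together with a 95% confidence interval, but does not state how the intervals were computed. A common choice for AUC confidence intervals on a fixed test set is the bootstrap percentile method; the sketch below (function names `auc_score` and `auc_with_ci` are illustrative, not from the paper) shows one minimal NumPy-only way to obtain such an interval.

```python
import numpy as np

def auc_score(y_true, y_score):
    """AUC via the Mann-Whitney U statistic; tied scores count half."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    pos = y_score[y_true == 1]   # scores of malignant (positive) cases
    neg = y_score[y_true == 0]   # scores of benign (negative) cases
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

def auc_with_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Point-estimate AUC with a bootstrap percentile (1 - alpha) CI."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    point = auc_score(y_true, y_score)
    boots = []
    n = len(y_true)
    while len(boots) < n_boot:
        idx = rng.integers(0, n, n)               # resample cases with replacement
        if y_true[idx].min() == y_true[idx].max():
            continue                              # AUC needs both classes present
        boots.append(auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(boots, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return point, lo, hi
```

With the heavy class imbalance reported here (112 benign vs. 755 malignant), resamples that drop the minority class entirely are skipped so every bootstrap replicate yields a defined AUC.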