Background Theoretically, artificial intelligence can provide an accurate, automated way to measure right ventricular (RV) ejection fraction (RVEF) from cardiovascular magnetic resonance (CMR) images, despite the complex RV geometry. However, in our recent study, commercially available deep learning (DL) algorithms for RVEF quantification performed poorly in some patients. The current study was designed to test the hypothesis that quantification of RV function could be improved in these patients by using more diverse CMR datasets, together with domain-specific quantitative performance evaluation metrics, during the cross-validation phase of DL algorithm development.
Methods We identified 100 patients from our prior study who had the largest differences between manually measured and automated RVEF values. RVEF was measured in three ways: automatically using the original version of the algorithm (DL1); automatically using an updated version (DL2), developed from a dataset that included a wider range of RV pathology and validated using multiple domain-specific quantitative performance evaluation metrics; and with conventional methodology performed by a core laboratory (CORE). Each of the DL-RVEF approaches was compared against CORE-RVEF reference values using linear regression and Bland–Altman analyses. Additionally, RVEF values were classified into 3 categories: ≤ 35%, 35–50%, and ≥ 50%. Agreement between the RVEF classifications made by the DL approaches and those based on the CORE measurements was tested.
Results CORE-RVEF and DL-RVEFs were obtained in all patients (feasibility of 100%). DL2-RVEF correlated with CORE-RVEF better than DL1-RVEF did (r = 0.87 vs. r = 0.42), with narrower limits of agreement. The DL2 algorithm also categorized RV function more accurately, with accuracy increasing from 0.53 to 0.80.
Conclusions The use of a new DL algorithm cross-validated on a dataset with a wide range of RV pathology using multiple domain-specific metrics resulted in a considerable improvement in the accuracy of automated RVEF measurements. This improvement was demonstrated in patients whose images were the most challenging and resulted in the largest RVEF errors. These findings underscore the critical importance of this strategy in the development of DL approaches for automated CMR measurements.
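The comparison against the CORE reference relies on three standard quantities: a correlation coefficient, Bland–Altman bias with limits of agreement, and agreement between three-class RVEF categorizations. The sketch below shows one way these could be computed. It is not the study's code. It assumes limits of agreement defined as bias ± 1.96 SD of the paired differences and accuracy defined as the fraction of patients assigned to the same class by both methods; the function names, the handling of values exactly at 35% and 50%, and the RVEF values are all hypothetical.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def bland_altman(test, reference):
    """Return (bias, lower limit, upper limit) of the paired differences.

    Assumption: limits of agreement are bias +/- 1.96 sample SDs.
    """
    diffs = [t - r for t, r in zip(test, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def rvef_category(rvef):
    """Map RVEF (%) to class 0 (<= 35), 1 (between 35 and 50), or 2 (>= 50).

    Assumption: a value of exactly 35 goes to class 0 and exactly 50 to class 2.
    """
    if rvef <= 35:
        return 0
    if rvef < 50:
        return 1
    return 2


def classification_accuracy(test, reference):
    """Fraction of cases where both methods assign the same RVEF class."""
    matches = sum(rvef_category(t) == rvef_category(r) for t, r in zip(test, reference))
    return matches / len(reference)


# Hypothetical RVEF values (%), invented for illustration only.
core = [30.0, 42.0, 55.0, 48.0, 60.0, 25.0]
dl = [33.0, 40.0, 52.0, 51.0, 58.0, 37.0]

r = pearson_r(dl, core)
bias, lower, upper = bland_altman(dl, core)
accuracy = classification_accuracy(dl, core)
print(f"r = {r:.2f}, bias = {bias:.1f}, LoA = [{lower:.1f}, {upper:.1f}], accuracy = {accuracy:.2f}")
```

Narrower limits of agreement correspond to a smaller spread of the paired differences, so an algorithm can improve on this measure even when its mean bias is unchanged.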
               