PURPOSE State-of-the-art automated segmentation methods achieve exceptionally high performance on the Brain Tumor Segmentation (BraTS) challenge, a dataset of uniformly processed and standardized magnetic resonance images (MRIs) of gliomas. However, a reasonable concern is that these models may not fare well on clinical MRIs that do not belong to the specially curated BraTS dataset. Research using the previous generation of deep learning models indicates significant performance loss on cross-institutional predictions. Here, we evaluate the cross-institutional applicability and generalizability of state-of-the-art deep learning models on new clinical data.

METHODS We train a state-of-the-art 3D U-Net model on the conventional BraTS dataset comprising low- and high-grade gliomas. We then evaluate the performance of this model for automatic segmentation of brain tumors on in-house clinical data. This dataset contains MRIs whose tumor types, resolutions, and standardization differ from those in the BraTS dataset. Ground truth segmentations used to validate the automated segmentation of the in-house clinical data were obtained from expert radiation oncologists.

RESULTS We report average Dice scores of 0.764, 0.648, and 0.61 for the whole tumor, tumor core, and enhancing tumor, respectively, in the clinical MRIs. These means are higher than those previously reported for same-institution and cross-institution datasets of different origin using different methods. There is no statistically significant difference (p > 0.1) between these Dice scores and the inter-annotator variability between two expert clinical radiation oncologists. Although performance on the clinical data is lower than on the BraTS dataset, these numbers indicate that models trained on the BraTS dataset have impressive segmentation performance on previously unseen images obtained at a separate clinical institution. These images differ from the BraTS data in imaging resolution, standardization pipeline, and tumor type.

CONCLUSIONS State-of-the-art deep learning models demonstrate promising performance on cross-institutional predictions. They significantly improve on previous models and can transfer knowledge to new types of brain tumors without additional modeling. This article is protected by copyright. All rights reserved.
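For readers unfamiliar with the metric, the per-region Dice scores reported above can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes the label convention used in several BraTS releases (1 = necrotic/non-enhancing core, 2 = edema, 4 = enhancing tumor), which the abstract does not state, and the names `REGIONS`, `dice`, and `region_dice` are hypothetical.

```python
import numpy as np

# Assumed BraTS-style label convention (not stated in the abstract):
# 1 = necrotic / non-enhancing core, 2 = edema, 4 = enhancing tumor.
REGIONS = {
    "whole_tumor": (1, 2, 4),
    "tumor_core": (1, 4),
    "enhancing_tumor": (4,),
}


def dice(pred_mask, true_mask):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    true_mask = np.asarray(true_mask, dtype=bool)
    denom = pred_mask.sum() + true_mask.sum()
    if denom == 0:
        # Both masks empty: scored as perfect agreement here.
        # This is one common convention, chosen as an assumption.
        return 1.0
    return float(2.0 * np.logical_and(pred_mask, true_mask).sum() / denom)


def region_dice(pred_labels, true_labels):
    """Dice score per nested tumor region for two integer label maps."""
    pred_labels = np.asarray(pred_labels)
    true_labels = np.asarray(true_labels)
    return {
        name: dice(np.isin(pred_labels, labels), np.isin(true_labels, labels))
        for name, labels in REGIONS.items()
    }


if __name__ == "__main__":
    # Toy 2D label maps standing in for 3D volumes.
    pred = np.array([[0, 1, 2], [4, 4, 0]])
    true = np.array([[0, 1, 2], [4, 0, 0]])
    print(region_dice(pred, true))
```

The same functions apply unchanged to 3D volumes, since the masks are compared voxel by voxel. Averaging the per-case scores over a dataset would give summary figures of the kind quoted in the RESULTS.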
               