Brain tumors pose a significant social and economic burden worldwide. A key to effective treatment planning and monitoring is the accurate delineation of the tumor and its sub-regions in magnetic resonance images, a task that is tedious and error-prone when performed manually.
The data were sourced from the BraTS 2018 dataset, consisting of 285 pre-operative scans with T1, T1-Gd, T2, and T2-FLAIR modalities and manual segmentations of the enhancing tumor, the peritumoral edema, and the necrotic and non-enhancing tumor5.
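For concreteness, the sketch below shows one way the four modalities of a case could be stacked into a single multi-channel volume; the use of nibabel, the file naming, and the label convention (1 = necrotic/non-enhancing core, 2 = peritumoral edema, 4 = enhancing tumor) are assumptions made for illustration and are not part of the original pipeline.

```python
# Minimal sketch (not the authors' pipeline): stack the four BraTS modalities
# into one multi-channel volume and load the manual segmentation.
import nibabel as nib
import numpy as np

def load_case(case_dir: str, case_id: str):
    modalities = ["t1", "t1ce", "t2", "flair"]  # hypothetical file suffixes
    volumes = [
        nib.load(f"{case_dir}/{case_id}_{m}.nii.gz").get_fdata(dtype=np.float32)
        for m in modalities
    ]
    image = np.stack(volumes, axis=0)  # shape: (4, X, Y, Z)
    seg = nib.load(f"{case_dir}/{case_id}_seg.nii.gz").get_fdata().astype(np.uint8)
    return image, seg
```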
The proposed architecture, named ContextNet (fig. 1), is based on the popular U-Net6 and includes the residual units introduced by He et al.7, which enable deeper networks and allow low-level representations to be propagated throughout the network. The architecture includes a novel building block, the Global Planar Convolution module (GPCm), inspired by the work of Peng et al.8. The GPCm computes the sum of three 2D convolutions, one performed in each of the orthogonal planes of the input volume (fig. 2). By constraining each convolution to a single plane, we can increase the kernel size and thereby enlarge the effective receptive field of the network. This increased connectivity aids the identification aspect of segmentation (what), while the residual elements and skip connections contribute to the localization aspect (where).
In our experiments, we used GPC modules with kernels of size 15, chosen heuristically as the largest kernel size that did not substantially increase the computational and memory requirements.
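As an illustration of the idea, the following PyTorch sketch implements a planar-convolution module consistent with the description above: three convolutions, each restricted to one orthogonal plane of the volume, whose outputs are summed. The channel counts, the use of 'same' padding, and other details are assumptions; this is not the authors' implementation.

```python
# A minimal sketch of a Global Planar Convolution module: one 2D (planar)
# convolution per orthogonal plane of the 3D input, with the outputs summed.
import torch
import torch.nn as nn

class GPCModule(nn.Module):
    def __init__(self, in_channels: int, out_channels: int, kernel_size: int = 15):
        super().__init__()
        k, p = kernel_size, kernel_size // 2
        self.conv_yz = nn.Conv3d(in_channels, out_channels, (1, k, k), padding=(0, p, p))
        self.conv_xz = nn.Conv3d(in_channels, out_channels, (k, 1, k), padding=(p, 0, p))
        self.conv_xy = nn.Conv3d(in_channels, out_channels, (k, k, 1), padding=(p, p, 0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, X, Y, Z); the planar responses are summed.
        return self.conv_yz(x) + self.conv_xz(x) + self.conv_xy(x)

# Example: a 15x15 planar kernel over a 4-channel patch.
x = torch.randn(1, 4, 64, 64, 64)
gpc = GPCModule(in_channels=4, out_channels=16, kernel_size=15)
print(gpc(x).shape)  # torch.Size([1, 16, 64, 64, 64])
```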
Figure 3 shows a comparison between a reference model without the GPCm, denoted ResUNet, and variations of ContextNet with different numbers of representation levels (i.e., a level consists of the operations performed at the same spatial resolution). Both ContextNet models with a reduced number of representation levels match or even surpass the performance of the ResUNet model when segmenting the whole tumor and the enhancing tumor. We hypothesize that the GPCm enables the aggregation of contextual information without the need to obtain a deep representation through several pooling operations, which addresses the identification aspect of the segmentation task and reduces the complexity of the network. However, this reduction in complexity makes more complex structures, such as the necrotic and non-enhancing tissue, more challenging to segment due to the lack of model capacity.
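A rough receptive-field calculation makes the hypothesis concrete; the layer configurations below are illustrative and do not reproduce the exact ContextNet layout.

```python
# Back-of-the-envelope receptive-field arithmetic: stacked small kernels grow
# the receptive field slowly at full resolution, while a single 15x15 planar
# kernel covers a wide in-plane neighbourhood in one step.
def receptive_field(layers):
    """layers: list of (kernel_size, stride) applied in order."""
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

print(receptive_field([(3, 1)] * 3))             # three 3x3 convolutions: 7
print(receptive_field([(3, 1)] * 3 + [(15, 1)])) # same stack plus a 15x15 GPC kernel: 21
```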
Figure 4 depicts the feature maps extracted from the residual layers and GPC modules of the ContextNet models with reduced representation levels. As we move from the pre-GPC residual layer to the GPC module and then to the post-GPC residual layer, the features extracted by the network become increasingly abstract.
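For readers who wish to reproduce this kind of visualization, intermediate feature maps can be captured, for example, with PyTorch forward hooks; the module names below ('res_pre', 'gpc', 'res_post') are hypothetical placeholders for the corresponding ContextNet layers.

```python
# A hedged sketch of capturing intermediate activations with forward hooks.
import torch

def capture_activations(model, layer_names, x):
    activations, handles = {}, []
    for name, module in model.named_modules():
        if name in layer_names:
            handles.append(module.register_forward_hook(
                lambda _m, _inp, out, name=name: activations.update({name: out.detach()})
            ))
    with torch.no_grad():
        model(x)
    for h in handles:
        h.remove()
    return activations

# feats = capture_activations(contextnet, ["res_pre", "gpc", "res_post"], patch)
```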
Predictions obtained from ResUNet, ContextNet constrained to three representation levels, and the full ContextNet are shown in fig. 5.
In this work, we introduced the GPCm to enhance the context perception capabilities of CNNs for brain tumor segmentation. We investigated the behavior of the GPC modules by training networks with a limited number of representation levels and visualizing their intermediate representations. Finally, we showed that equivalent performance can be achieved with the GPCm even when the number of representation levels of the network is considerably reduced, and that performance gains are obtained when the complexity of the network is maintained.
Integrating ContextNet and related brain tumor segmentation models into the clinical workflow would reduce the time needed for treatment planning and monitoring, freeing radiologists from tedious and error-prone tasks and allowing more time to be invested in interpretation and diagnosis.
Future work includes uncertainty estimation via Monte Carlo Dropout or related techniques, an in-depth investigation of intermediate representations, and the use of other deep learning interpretability methods to better understand the behavior of the proposed GPCm.
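As a pointer, Monte Carlo Dropout can be realized at inference time by keeping dropout layers active and averaging several stochastic forward passes; the following is a minimal sketch of one such realization, not part of the present work.

```python
# Monte Carlo Dropout sketch: dropout layers stay active while the rest of the
# network is in evaluation mode; the per-voxel mean and variance over several
# stochastic passes serve as prediction and uncertainty estimate.
import torch
import torch.nn as nn

def mc_dropout_predict(model: nn.Module, x: torch.Tensor, passes: int = 20):
    model.eval()
    for m in model.modules():  # re-enable dropout only
        if isinstance(m, (nn.Dropout, nn.Dropout3d)):
            m.train()
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1) for _ in range(passes)])
    return probs.mean(dim=0), probs.var(dim=0)  # prediction, voxel-wise uncertainty
```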
1. Louis, D. N., Perry, A., Reifenberger, G., Von Deimling, A., Figarella-Branger, D., Cavenee, W. K., ... & Ellison, D. W. (2016). The 2016 World Health Organization classification of tumors of the central nervous system: a summary. Acta neuropathologica, 131(6), 803-820.
2. Ostrom, Q. T., Bauchet, L., Davis, F. G., Deltour, I., Fisher, J. L., Langer, C. E., ... & Wrensch, M. R. (2014). The epidemiology of glioma in adults: a “state of the science” review. Neuro-oncology, 16(7), 896-913.
3. Weiss, E., & Hess, C. F. (2003). The impact of gross tumor volume (GTV) and clinical target volume (CTV) definition on the total accuracy in radiotherapy. Strahlentherapie und Onkologie, 179, 21-30.
4. Crimi, A., Bakas, S., Kuijf, H., Menze, B., & Reyes, M. (Eds.). (2018). Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries: Third International Workshop, BrainLes 2017, Held in Conjunction with MICCAI 2017, Quebec City, QC, Canada, September 14, 2017, Revised Selected Papers (Vol. 10670). Springer.
5. Menze, B. H., Jakab, A., Bauer, S., Kalpathy-Cramer, J., Farahani, K., Kirby, J., ... & Lanczi, L. (2015). The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Transactions on Medical Imaging, 34(10), 1993-2024.
6. Ronneberger, O., Fischer, P., & Brox, T. (2015, October). U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention (pp. 234-241). Springer, Cham.
7. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).
8. Peng, C., Zhang, X., Yu, G., Luo, G., & Sun, J. (2017, July). Large kernel matters -- Improve semantic segmentation by global convolutional network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1743-1751).