1756

Focused MRI Segmentation: Leveraging Diffusion Models for Brain Tumor Segmentation in Low-Resolution Areas
Luis Carlos Rivera Monroy1, Tianqi Wang1,2, Vasileios Belagiannis2, and Andreas Maier1
1Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany, 2Electrical-Electronic-Communication Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany

Synopsis

Keywords: AI Diffusion Models, Segmentation

Motivation: Glioblastoma, the most prevalent and aggressive adult brain tumor, presents a therapeutic and monitoring challenge due to its diverse morphology and composition.

Goal(s): This study explores the efficacy of utilizing denoising diffusion models with the widely adopted U-Net architecture for enhanced segmentation performance.

Approach: The proposed framework notably improves the segmentation of the tumor, especially the core. This enhancement facilitates an advanced understanding of complex cases and potentially impacts specialist interventions.

Results: Our findings present promising results for further research into more intricate glioblastoma cases, thereby aiding in developing sophisticated, targeted treatment strategies for this disease.

Impact: This study advances U-Net architecture by integrating denoising diffusion models and specialized loss functions, elevating the precision of low-resolution brain tumor segmentation, with an emphasis on balancing improved accuracy against uncertain predictions.

Introduction

The human brain's complexity and sensitivity to change underscore the importance of accurate brain tumor detection, where even minor cell growth anomalies can be life-altering. Magnetic Resonance Imaging (MRI) stands out for its detailed visualization of brain structures. We propose an advanced U-Net architecture for refining MRI-based tumor segmentation in particularly challenging, low-contrast regions by combining a diffusion model with uncertainty metrics. Our work not only improves upon traditional methods but also exceeds recent 3D and attention-enhanced U-Net models in detecting intricate cerebral abnormalities.

Methods

The Brain Tumor Segmentation (BraTS) challenge provides a multi-institutional dataset of multi-parametric magnetic resonance imaging (mpMRI) scans for brain glioblastoma analysis [1]. In this study, we used the 2021 release of the BraTS dataset, which contains imaging data from 1,250 patients. Each patient's data include four imaging modalities: Fluid Attenuated Inversion Recovery (FLAIR), T1-weighted, T1-weighted post-contrast, and T2-weighted scans. Each sample contains labeled annotations of the tumor, namely the peritumoral edema and the necrotic and non-necrotic tumor core, as exemplified in Figure 1.
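The annotations described above are usually evaluated as three nested regions, Whole Tumor (WT), Tumor Core (TC), and Enhancing Tumor (ET), which are the regions reported in Table 1. The following is a minimal sketch of that conversion. The integer label values and the helper name `labels_to_regions` are assumptions for illustration and are not taken from this abstract.

```python
import numpy as np

# Assumed BraTS label values (not stated in this abstract):
# 0 = background, 1 = necrotic / non-enhancing core,
# 2 = peritumoral edema, 4 = enhancing tumor.
NCR, ED, ET = 1, 2, 4


def labels_to_regions(label_map: np.ndarray) -> np.ndarray:
    """Convert an integer label map into three binary channels: WT, TC, ET."""
    whole_tumor = np.isin(label_map, (NCR, ED, ET))  # every tumor label
    tumor_core = np.isin(label_map, (NCR, ET))       # core without edema
    enhancing = label_map == ET                      # enhancing tumor only
    return np.stack([whole_tumor, tumor_core, enhancing]).astype(np.uint8)
```

The output has a leading channel axis of size three, so the same function applies to a 2D slice or a full 3D volume.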
We demonstrate our method's effectiveness by comparing it with three leading techniques. The first baseline is the U-Net, an established segmentation architecture that combines context capture with precise localization [2]. The second is the Attention U-Net, an advanced U-Net that incorporates attention gates. These gates allow the network to focus on key image areas, leading to enhanced accuracy without relying on separate models [3]. The last baseline is an extended version of the original U-Net with 3D data processing and volumetric training capabilities [4].
Building on these established methods, our Diffusion U-Net, or Diff-UNet, integrates a diffusion model within the U-Net architecture. This integration facilitates the extraction of semantic content from volumetric data, yielding the enhanced pixel-level detail that is crucial for medical segmentation tasks. Unlike its U-Net predecessor, the Diff-UNet embeds a noise process within the input pipeline and predicts the label map iteratively, which improves the model's predictive stability. An additional feature of Diff-UNet is the uncertainty-based fusion module used during inference, which increases the robustness of the segmentation outcomes [5].
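To illustrate the noise process mentioned above, the following minimal sketch applies the standard closed-form forward diffusion step to a one-hot label map. The linear beta schedule, the number of steps, and the function names are assumptions chosen for illustration. The abstract does not specify the schedule that was actually used.

```python
import numpy as np


def linear_alpha_bar(num_steps: int = 1000,
                     beta_start: float = 1e-4,
                     beta_end: float = 2e-2) -> np.ndarray:
    """Cumulative product of (1 - beta_t) for an assumed linear schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def one_hot(label_map: np.ndarray, num_classes: int) -> np.ndarray:
    """Integer labels (...) -> one-hot array (num_classes, ...)."""
    return np.moveaxis(np.eye(num_classes, dtype=np.float32)[label_map], -1, 0)


def q_sample(x0: np.ndarray, t: int, alpha_bar: np.ndarray,
             rng: np.random.Generator) -> np.ndarray:
    """Noisy labels: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
```

At small `t` the noisy map stays close to the clean one-hot labels, while at large `t` it approaches pure Gaussian noise.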
In this study, we used the Diff-UNet pipeline. This architecture was initially developed and validated for liver segmentation. We adapted its core principles to multi-class brain tumor segmentation, which is characterized by high variability and complex volumetric structures. The low resolution and intricacy of these structures make the task challenging. During training, a denoising module takes multi-modal MRI data and systematically noise-altered labels as input. By combining a feature encoding stream with a denoising U-Net, we obtain denoised predictions of the segmented regions of interest, which are aligned with their corresponding labels during training.
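The training step described above can be sketched as follows. The `denoiser` argument is a hypothetical callable standing in for the denoising module, and the combination of a soft Dice term with a mean squared error term is an assumption. The abstract does not state which loss functions were used.

```python
import numpy as np


def soft_dice_loss(pred: np.ndarray, target: np.ndarray,
                   eps: float = 1e-6) -> float:
    """1 - mean per-channel soft Dice; inputs have shape (C, ...)."""
    axes = tuple(range(1, pred.ndim))
    inter = (pred * target).sum(axis=axes)
    denom = pred.sum(axis=axes) + target.sum(axis=axes)
    return float(1.0 - ((2.0 * inter + eps) / (denom + eps)).mean())


def training_loss(denoiser, mri: np.ndarray, x0: np.ndarray, t: int,
                  alpha_bar: np.ndarray, rng: np.random.Generator) -> float:
    """Noise the clean labels, denoise them given the MRI, compare to labels."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    # Hypothetical interface: MRI modalities and noisy labels are
    # concatenated along the channel axis, together with the time step.
    pred = denoiser(np.concatenate([mri, x_t], axis=0), t)
    # Assumed objective: soft Dice plus mean squared error against x0.
    return soft_dice_loss(pred, x0) + float(np.mean((pred - x0) ** 2))
```

In a real pipeline the loss would be computed in an automatic differentiation framework so that gradients can reach the network weights. NumPy is used here only to keep the sketch short.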
In the testing phase, the trained denoising module is applied iteratively to generate denoised segmentations. This step is repeated for a predetermined number of iterations, resulting in a series of outputs that progressively exhibit less noise, culminating in the desired multi-class segmentation. For the final inference, each output in the sequence is combined with an uncertainty map to enhance the method's robustness, preserving critical information obtained at each step. This information is then integrated into the final output of the architecture, producing a multi-class segmentation map of the tumor. Figure 2 presents an overview of this process.
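The fusion of step outputs with uncertainty maps can be sketched as follows. This is an illustrative weighting only and not the exact formula of the fusion module. The entropy-based uncertainty, the linear preference for later steps, and the name `fuse_steps` are all assumptions.

```python
import numpy as np


def binary_entropy(p: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Per-voxel entropy of a probability map, in nats."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def fuse_steps(step_probs: np.ndarray) -> np.ndarray:
    """Fuse probabilities of shape (T, S, C, ...) into one (C, ...) map.

    T = denoising steps, S = stochastic samples per step, C = classes.
    """
    mean_p = step_probs.mean(axis=1)                 # (T, C, ...)
    unc = binary_entropy(mean_p) / np.log(2.0)       # scaled to [0, 1]
    steps = np.arange(1, mean_p.shape[0] + 1, dtype=np.float64)
    step_w = (steps / steps.sum()).reshape((-1,) + (1,) * (mean_p.ndim - 1))
    # Assumed rule: later steps and confident voxels receive more weight.
    w = step_w * (1.0 - unc) + 1e-8
    w = w / w.sum(axis=0, keepdims=True)
    return (w * mean_p).sum(axis=0)
```

Because the weights are normalized over the step axis, the fused value at each voxel is a convex combination of the per-step predictions and remains a valid probability.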

Results and Discussion

Table 1 compares the performance of the baseline models and the proposed approach. The 3D U-Net achieves a higher mean Dice score than the 2D U-Net, although it performs worse on the HD95 measure. The 3D U-Net excels in volumetric data processing, which is crucial for complex brain tumor segmentation. In contrast, the 2D U-Net is limited to two-dimensional slices and may overlook some 3D features. The Attention 3D U-Net slightly outperforms the standard 3D U-Net by employing attention blocks that focus on key regions, enhancing segmentation precision. The diffusion-embedded U-Net shows the best overall results by integrating a denoising diffusion model, improving robustness through noise addition during training, and employing a step-uncertainty-based fusion module for testing.
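For reference, the two metrics discussed above can be sketched as follows. This is a simplified illustration and not the evaluation code used in the study. HD95 definitions vary between toolkits, and this version takes the larger of the two directed 95th-percentile surface distances. The brute-force distance matrix is only practical for small masks.

```python
import numpy as np


def dice_score(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice overlap of two binary masks (assumed 1.0 when both are empty)."""
    pred, target = pred.astype(bool), target.astype(bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0
    return float(2.0 * np.logical_and(pred, target).sum() / total)


def surface(mask: np.ndarray) -> np.ndarray:
    """Foreground voxels with at least one background axis neighbor."""
    m = np.pad(mask.astype(bool), 1)  # zero border: outside is background
    interior = m.copy()
    for ax in range(m.ndim):
        interior &= np.roll(m, 1, axis=ax) & np.roll(m, -1, axis=ax)
    return (m & ~interior)[(slice(1, -1),) * m.ndim]


def hd95(pred: np.ndarray, target: np.ndarray, spacing: float = 1.0) -> float:
    """95th-percentile symmetric surface distance (brute force)."""
    a = np.argwhere(surface(pred)) * spacing
    b = np.argwhere(surface(target)) * spacing
    if len(a) == 0 or len(b) == 0:
        return float("nan")  # undefined when either mask is empty
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return float(max(np.percentile(d.min(axis=1), 95),
                     np.percentile(d.min(axis=0), 95)))
```

A higher Dice score indicates better volumetric overlap, whereas a lower HD95 indicates better boundary agreement, so the two metrics can rank models differently.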

Conclusion

In this study, various models were evaluated, with three traditional U-Net architectures serving as benchmarks. The 3D U-Net demonstrated superior Dice coefficient metrics compared to the 2D U-Net and required the least computational time. The 2D U-Net, however, excelled in delineating boundaries more effectively than its 3D counterpart. The Attention 3D U-Net offered marginal improvements over the conventional 3D U-Net but at the cost of increased computational time. The Diffusion U-Net, when used in the brain tumor segmentation context, utilizes the strengths of the diffusion model and uncertainty to improve robustness, yielding impressive results and showcasing promise for tasks involving multi-modal 3D medical image segmentation.

References

1. Menze, B. H., Jakab, A., Bauer, S., Kalpathy-Cramer, J., Farahani, K., Kirby, J., ... & Van Leemput, K. (2014). The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Transactions on Medical Imaging, 34(10), 1993-2024.

2. Ronneberger, O., Fischer, P., & Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18 (pp. 234-241). Springer International Publishing.

3. Oktay, O., Schlemper, J., Folgoc, L. L., Lee, M., Heinrich, M., Misawa, K., ... & Rueckert, D. (2018). Attention u-net: Learning where to look for the pancreas. arXiv preprint arXiv:1804.03999.

4. Çiçek, Ö., Abdulkadir, A., Lienkamp, S. S., Brox, T., & Ronneberger, O. (2016). 3D U-Net: learning dense volumetric segmentation from sparse annotation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2016: 19th International Conference, Athens, Greece, October 17-21, 2016, Proceedings, Part II 19 (pp. 424-432). Springer International Publishing.

5. Xing, Z., Wan, L., Fu, H., Yang, G., & Zhu, L. (2023). Diff-UNet: A Diffusion Embedded Network for Volumetric Segmentation. arXiv preprint arXiv:2303.10326.

Figures

Figure 1: Multimodal brain imaging of a patient sample. The main image displays a 3D brain rendering with the tumor (peritumoral edema and the necrotic and non-necrotic tumor core) segmented. Adjacent smaller images showcase axial, sagittal, and coronal brain views, with the tumor segmentation overlaid on FLAIR MRI images.

Figure 2: Overview of the proposed framework. During (A), the MRI scans are fed into the denoising module, which iteratively produces a denoised output. In (B), the two main streams and how they are connected are illustrated. During (C), the MRI input, along with initial noise, is processed through the denoising module to produce outputs at different time steps. In (D), each output from the testing phase is coupled with an uncertainty map. These fused pairs generate the final segmentation.

Table 1: Comparative results for different U-Net architectures on the BraTS 2021 test dataset. The Diffusion U-Net model demonstrates superior performance compared to the baseline methods in all measured metrics, including Whole Tumor (WT), Tumor Core (TC), and Enhancing Tumor (ET) Dice scores, as well as the average Dice score and Hausdorff distance (HD95). Furthermore, despite a considerably longer training time, the improvements in accuracy and precision metrics indicate that the Diffusion U-Net is a more robust model for the brain tumor segmentation task.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024), 1756
DOI: https://doi.org/10.58530/2024/1756