0880

AtlasSeg: Atlas Prior Guided Dual-UNet for Cortical Segmentation in Fetal Brain MRI

Haoan Xu¹, Tianshu Zheng¹, Xinyi Xu¹, Yao Shen¹, Jiwei Sun¹, Cong Sun², Guangbin Wang³, and Dan Wu¹
¹Department of Biomedical Engineering, Zhejiang University, Hangzhou, China, ²Department of Radiology, Beijing Hospital, Beijing, China, ³Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China

Synopsis

Keywords: Analysis/Processing, Segmentation, Fetal Brain MRI; Artificial Intelligence

Motivation: Automatic segmentation of the fetal brain remains challenging, partly because anatomical structures change dynamically during fetal brain development.

Goal(s): To improve segmentation accuracy by using gestational age-specific information as guidance, we introduce AtlasSeg, a dual-U-shaped network with dense attentive interactions.

Approach: Given the atlas volume and segmentation label at the corresponding gestational age, AtlasSeg extracts contextual features of age-specific patterns and structures that assist segmentation.

Results: AtlasSeg outperformed six other segmentation networks in both in-distribution and out-of-distribution experiments on two fetal MRI datasets. Ablation tests further confirmed the contribution of atlas guidance.

Impact: By using gestational age-specific atlas guidance, AtlasSeg performed best in both in-distribution and out-of-distribution tests. It can therefore serve as an accurate and robust automatic segmentation tool for quantitative analysis in large-scale fetal brain studies.

Introduction

In fetal brain MRI, accurate and reliable segmentation is needed for prenatal diagnosis and quantitative analysis¹. Previous learning-based methods typically train a UNet-like fully convolutional network (FCN) end to end to obtain tissue labels². However, the anatomical structure and morphology of the fetal brain change dynamically during development, which poses a challenge and causes segmentation accuracy to vary across gestational age (GA). In this work, we incorporated prior information from GA-specific atlases and their segmentation labels into the network. To achieve this, we designed a dual-UNet architecture that processes the input fetal brain MRI image and its corresponding atlas in two parallel streams, with dense attention connections between the streams for deep feature fusion.

Methods

Data acquisition: A total of 102 fetal brain MRI scans (GA: 22.4-39.0 weeks) were collected on a 3T Siemens Skyra scanner with an abdominal coil. T2-weighted images were acquired in three orthogonal orientations with the following protocol: TR/TE = 800/97 ms, in-plane resolution = 1.09×1.09 mm, FOV = 256×200 mm, slice thickness = 2 mm, GRAPPA factor = 2.
Data preprocessing: The acquired stacks of slices were reconstructed into 3D volumes with a resolution of 0.8×0.8×0.8 mm and a size of 128×160×128 using the NiftyMIC toolkit³. Ground-truth cortical labels were generated using DrawEM and manually corrected by three raters. The atlases used in this work were generated through pairwise registration followed by iterative optimization based on group-wise registration⁴. The 102 cases were divided into three sets: 60 for training, 10 for validation and 32 for testing.
Network architecture: Figure 2A illustrates the structure of our proposed atlas prior guided dual-UNet. The network takes two inputs: the target fetal brain volume and its corresponding atlas, supplied as two channels (atlas image and atlas label). These inputs are processed simultaneously by two parallel U-shaped backbone networks, each with a basic encoder-decoder architecture and skip connections. The first UNet extracts the anatomical features of the target fetal brain, while the second UNet processes the atlas and learns the age-specific anatomy and segmentation.
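The pairing of a target volume with "its corresponding atlas" can be illustrated with a small lookup. The abstract does not state how the atlas is matched to a subject, so the nearest-week rule, the integer atlas weeks and the names `nearest_atlas_week` and `ATLAS_WEEKS` below are assumptions made only for this sketch.

```python
def nearest_atlas_week(subject_ga_weeks, atlas_weeks):
    """Return the atlas week closest to the subject's gestational age.

    Assumption: one atlas per week; ties go to the younger atlas.
    """
    if not atlas_weeks:
        raise ValueError("atlas_weeks must not be empty")
    # sorted() makes tie-breaking deterministic: min() keeps the first minimum
    return min(sorted(atlas_weeks), key=lambda week: abs(week - subject_ga_weeks))


# Hypothetical set of atlas time points (weeks); not taken from the abstract
ATLAS_WEEKS = list(range(23, 39))

selected_week = nearest_atlas_week(22.4, ATLAS_WEEKS)
```

In this sketch a subject at the edge of the GA range is simply assigned the closest available atlas; a real pipeline might interpolate between atlases instead.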
Central to our framework is the dense attention connection mechanism (orange block in Figure 2A), which passes information between the parallel UNets and lets their features interact at multiple stages. As shown in Figure 2C, the multi-scale attentive atlas fusion (MA2-Fuse) module takes both target and atlas feature maps as input and outputs the attention-enhanced feature. The deep fusion consists of late concatenation and multi-scale convolution operations with a range of kernel sizes, both of which are designed to make full use of the contextual map of expected anatomical features provided by the atlas features.
Experiment detail: We trained our network with a combination of Dice loss and binary cross-entropy loss⁵. Segmentation accuracy was evaluated with the Dice score (Dice), the 95th-percentile Hausdorff distance (95HD), and the average symmetric surface distance (ASD). To evaluate the proposed AtlasSeg, we compared it with six other segmentation FCNs: UNet⁶, SE-FCN⁷, DenseUNet⁸, UNet++⁹, AttentionUNet¹⁰ and the state-of-the-art MixAttNet¹¹. We also performed a set of ablation tests on the role of the atlas-guided attention mechanism.
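A combined Dice and binary cross-entropy loss of the kind mentioned above can be sketched as follows. This is a minimal pure-Python illustration on flattened voxel lists, not the training code used in this work: the equal weighting of the two terms, the `smooth` and `eps` constants and the function name `dice_bce_loss` are assumptions.

```python
import math


def dice_bce_loss(probs, targets, smooth=1.0, eps=1e-7):
    """Soft Dice loss plus mean binary cross-entropy (equal weights assumed).

    probs   -- predicted foreground probabilities in [0, 1]
    targets -- binary ground-truth labels (0 or 1)
    """
    if len(probs) != len(targets) or not probs:
        raise ValueError("probs and targets must be non-empty and equal in length")

    # Soft Dice: overlap between prediction and label, smoothed for stability
    intersection = sum(p * t for p, t in zip(probs, targets))
    dice = (2.0 * intersection + smooth) / (sum(probs) + sum(targets) + smooth)
    dice_loss = 1.0 - dice

    # Binary cross-entropy, with probabilities clamped away from 0 and 1
    bce = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        bce -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    bce /= len(probs)

    return dice_loss + bce


# Toy example: an uninformative prediction on four voxels
loss = dice_bce_loss([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
```

The Dice term responds to overlap on the thin cortical label, while the cross-entropy term gives a per-voxel gradient; combining them is a common choice for imbalanced segmentation targets.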
The network was implemented in PyTorch and trained on an NVIDIA GeForce RTX 3090 GPU with a batch size of 4 and a patch size of 96×96×96. Data augmentation consisted of random rotation, flipping, contrast adjustment and deformation. All compared networks were trained with the same data, hyper-parameters and augmentation.
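The relation between the 96×96×96 patches and the 128×160×128 reconstructed volumes can be made concrete with a short sketch. The abstract does not say how patches are sampled or tiled, so the sliding-window scheme, the stride of 32 voxels and the function names here are assumptions used only for illustration.

```python
from itertools import product


def patch_starts(length, patch, stride):
    """Start indices along one axis so that patches cover the full extent."""
    if patch > length:
        raise ValueError("patch is larger than the volume along this axis")
    starts = list(range(0, length - patch + 1, stride))
    # Add a final patch flush with the far edge if the stride left a gap
    if starts[-1] != length - patch:
        starts.append(length - patch)
    return starts


def patch_origins(volume_shape, patch_shape, stride):
    """All patch corner coordinates for a sliding window over a 3D volume."""
    axes = [patch_starts(n, p, stride) for n, p in zip(volume_shape, patch_shape)]
    return list(product(*axes))


# Volume and patch sizes from the text; the stride is a hypothetical choice
origins = patch_origins((128, 160, 128), (96, 96, 96), stride=32)
```

With these settings every voxel is covered by at least one patch, and overlapping predictions would need to be merged, for example by averaging.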

Results

Figure 3 shows the results of the in-distribution experiment on our ZJU dataset (Figure 3A) and the out-of-distribution experiment on the FeTA dataset¹² (Figure 3B). AtlasSeg achieved the highest Dice of 0.9172/0.7576 (for the ZJU/FeTA datasets), the lowest 95HD of 1.0259/1.6558 and the lowest ASD of 0.2531/0.6831, all significantly better than the other FCN methods.
Figure 4 shows the results of ablation experiments on the positions and manner of adding attention, which confirmed that dense fusion coupled with late concatenation and multi-scale spatial attention delivered the best performance. Figure 5 illustrates the segmented label, feature maps and the corresponding attention maps, showing how the attention highlights the cortical regions and thus leads to an accurate segmentation.
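The significance levels reported with Figure 3 come from paired t-tests between each compared method and AtlasSeg. The statistic behind such a test can be sketched as below; the per-case Dice values are made-up numbers for illustration only, and the function name `paired_t_statistic` is an assumption. Converting the statistic to a p-value requires the t-distribution, which is left out of this sketch.

```python
import math


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test on matched cases."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally sized samples with at least two cases")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Unbiased sample variance of the per-case differences
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("differences have zero variance")
    t = mean_diff / math.sqrt(var_diff / n)
    return t, n - 1


# Hypothetical per-case Dice scores for two methods on the same four cases
method_a = [0.92, 0.91, 0.93, 0.90]
method_b = [0.90, 0.90, 0.91, 0.89]
t_value, dof = paired_t_statistic(method_a, method_b)
```

Pairing is appropriate here because every method is evaluated on the same test cases, so the per-case differences remove between-subject variability.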

Discussion and conclusion

The proposed AtlasSeg outperformed other state-of-the-art FCNs for cortical segmentation in both in-distribution and out-of-distribution experiments, and the contribution of its design was further validated through ablation studies. The performance of AtlasSeg is likely associated with its incorporation of prior knowledge in the form of GA-specific atlases to provide contextual guidance, as well as the attention-driven feature fusion between the two UNet branches. Given its accuracy and robustness, AtlasSeg is a promising tool for reliable fetal brain MRI tissue segmentation.

Acknowledgements

This work was supported by the Ministry of Science and Technology of the People’s Republic of China (2018YFE0114600, 2021ZD0200202), the National Natural Science Foundation of China (81971606, 82122032), and the Science and Technology Department of Zhejiang Province (202006140, 2022C03057).

References

1. Makropoulos A, Counsell SJ, Rueckert D. A review on automatic fetal and neonatal brain MRI segmentation. NeuroImage. 2018;170:231-248.

2. Ciceri T, Squarcina L, Giubergia A, Bertoldo A, Brambilla P, Peruzzo D. Review on deep learning fetal brain segmentation from Magnetic Resonance images. Artificial Intelligence in Medicine. 2023;143:102608.

3. Ebner M, Wang G, Li W, et al. An automated framework for localization, segmentation and super-resolution reconstruction of fetal brain MRI. NeuroImage. 2020;206:116324.

4. Xu X, Sun C, Sun J, et al. Spatiotemporal Atlas of the Fetal Brain Depicts Cortical Developmental Gradient. J Neurosci. 2022;42(50):9435-9449.

5. Isensee F, Jaeger PF, Kohl SAA, Petersen J, Maier-Hein KH. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods. 2021;18(2):203-211.

6. Ronneberger O, Fischer P, Brox T. U-Net: Convolutional Networks for Biomedical Image Segmentation. May 2015. http://arxiv.org/abs/1505.04597.

7. Guha Roy A, Siddiqui S, Pölsterl S, Navab N, Wachinger C. ‘Squeeze & excite’ guided few-shot segmentation of volumetric images. Medical Image Analysis. 2020;59:101587.

8. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely Connected Convolutional Networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI: IEEE; 2017:2261-2269.

9. Zhou Z, Siddiquee MMR, Tajbakhsh N, Liang J. UNet++: A Nested U-Net Architecture for Medical Image Segmentation. July 2018. http://arxiv.org/abs/1807.10165.

10. Oktay O, Schlemper J, Folgoc LL, et al. Attention U-Net: Learning Where to Look for the Pancreas. May 2018. http://arxiv.org/abs/1804.03999.

11. Dou H, Karimi D, Rollins CK, et al. A Deep Attentive Convolutional Neural Network for Automatic Cortical Plate Segmentation in Fetal MRI. IEEE Trans Med Imaging. 2021;40(4):1123-1133.

12. Payette K, Li HB, De Dumast P, et al. Fetal brain tissue annotation and segmentation challenge results. Medical Image Analysis. 2023;88:102833.

Figures

Figure 1. (A) Overview of the AtlasSeg framework. The atlas input is two-channel data consisting of the GA-specific atlas volume and its corresponding tissue label, which is processed by the atlas branch. The segmentation branch uses GA-specific information from the atlas branch through attentive feature fusion and outputs the label. (B) Visual comparison of fetal brains at different GA. (C) Both changing anatomy and data imbalance could result in unbalanced segmentation accuracy, with diminished performance in younger and older fetal brains.

Figure 2. (A) Architecture of AtlasSeg, which is built upon two parallel 3D UNet backbones with an encoder-decoder structure and skip connections. Dense attentive feature fusion modules (MF) interlink these branches to build a seamless flow of features. The network takes three 96×96×96 patches as input. (B) Convolutional blocks used in (A). (C) Proposed deep multi-scale attentive atlas fusion (MA2-Fuse) module. A group of convolutions with kernel sizes of [7,5,3,1] is used to extract multi-scale features. Late concatenation is always used for deep fusion.

Figure 3. Comparison of test results using different methods for (A) the in-distribution experiment on the ZJU fetal MRI data (n=32) and (B) the out-of-distribution experiment on the FeTA dataset (n=20). Both experiments showed significant improvements achieved by the proposed AtlasSeg in terms of multiple metrics including Dice, 95HD and ASD. **p<0.01, ***p<0.001 and ****p<10^-4, by paired t-test between each tested algorithm and AtlasSeg.

Figure 4. The performance of AtlasSeg with (A) varying attention positions and (B) different fusion techniques. (A) demonstrates that denser interactions lead to better performance. (B) shows that the combination of multi-scale spatial attention and late concatenation yields the best results.

Figure 5. (A) Predicted segmentation from AtlasSeg compared to the ground truth. (B) Visualization of feature maps and attention maps from the first encoder and the last decoder. The results illustrate distinct characteristic patterns across different channels. Furthermore, the attention maps highlight the cortical regions, facilitating accurate segmentation.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)

DOI: https://doi.org/10.58530/2024/0880