2122

Automatic Right Ventricular Segmentation of Cardiac Cine Magnetic Resonance Images Based on a Novel Multi-atlas Two-Stage U-net

Lijia Wang¹ and Hanlu Su¹
¹School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, China

Synopsis

Keywords: Segmentation, Heart

Motivation: Right ventricular (RV) segmentation is of great significance for the clinical diagnosis of heart disease. However, it remains challenging because of the complex structure of the RV.

Goal(s): Fully automatic and accurate segmentation of the right ventricle.

Approach: A new deep atlas network that combines atlas prior knowledge with a Deformable Multi-scale Two-Stage U-net (DMTSU-net) is proposed to extract and fuse multi-scale RV features in Cine Cardiac Magnetic Resonance (CCMR) images.

Results: Compared with 8 classical methods, the segmentation results of DMTSU-net are the closest to the gold standard and correlate significantly with it on all evaluation indices in 15 testing datasets.

Impact: The proposed framework integrates prior information of atlases into a deep neural network to achieve accurate segmentation, which is promising for clinical heart disease diagnosis.

Introduction

RV segmentation is necessary for cardiac function analysis in clinical diagnosis. Owing to blurry boundaries, irregular shape, complex structure, and thin myocardium, RV segmentation is challenging¹. Although deep learning methods have made significant progress in biomedical image segmentation in recent years, the results of existing deep learning models are still not satisfactory for clinical application.

Methods

As shown in Figure 1, the registration network DMTSU-net estimates the deformation field parameters (θ) that map the Atlas Images (A) to the Target Images (T). The spatial transformation layer² then uses these parameters to warp the Atlas Images (A) and the Atlas Labels ($$$L_{A}$$$), generating the Registration Images $$$F_{\theta}(A)$$$ and the segmentation Results ($$$L_{Seg}$$$), respectively.
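The warping step performed by a spatial transformation layer can be sketched as follows. This is a minimal illustration only: it assumes the deformation is given as a dense per-pixel displacement field and that sampling locations are clamped at the image border, neither of which is stated in the abstract; `warp_bilinear` and its arguments are hypothetical names.

```python
import math


def warp_bilinear(image, flow):
    """Warp a 2D image with a dense displacement field (backward warping).

    image: H x W list of rows of floats.
    flow:  H x W list of (dy, dx) displacements (assumed parameterization);
           output[y][x] samples image at (y + dy, x + dx) bilinearly.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            # Clamp the sampling location to the image border (assumption).
            sy = min(max(y + dy, 0.0), h - 1.0)
            sx = min(max(x + dx, 0.0), w - 1.0)
            y0, x0 = int(math.floor(sy)), int(math.floor(sx))
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            wy, wx = sy - y0, sx - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = (1 - wx) * image[y0][x0] + wx * image[y0][x1]
            bottom = (1 - wx) * image[y1][x0] + wx * image[y1][x1]
            out[y][x] = (1 - wy) * top + wy * bottom
    return out


# Toy example: sample half a pixel to the right of every location.
image = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
half_shift = [[(0.0, 0.5)] * 3 for _ in range(3)]
warped = warp_bilinear(image, half_shift)
```

The same field would be applied to both the atlas image and its label map, so that the warped label lands in the target's coordinate frame.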
The detailed structure of DMTSU-net is shown in Figure 2. It is a cascade of two U-shaped networks, each with an encoder and a decoder. The encoders compress the image and extract features during down-sampling and pass them to the decoders through skip connections; the decoders then restore image information during up-sampling. In the first U-shaped network, deformable convolution replaces standard convolution in the down-sampling path. Deformable convolution adds a two-dimensional offset to the sampling positions, and this offset is learned from the preceding feature map. Dilated Inception blocks (marked in red) are used in the up-sampling paths of both U-shaped networks and in the down-sampling path of the second U-shaped network. Each block is composed of dilated convolutions with dilation rates of 1, 2, and 5, which extract and fuse multi-scale RV features.
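To illustrate why dilation rates of 1, 2, and 5 give multi-scale features, the sketch below computes the extent covered by a dilated kernel and applies a 1D dilated convolution. The 3-tap kernel size is an assumption (the abstract does not state it), and the function names are hypothetical.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by a kernel whose taps are `dilation` apart."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D dilated convolution (cross-correlation form, stride 1)."""
    span = effective_kernel_size(len(kernel), dilation)
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span + 1)
    ]


# Assumed 3-tap kernels at the dilation rates named in the text.
sizes = {d: effective_kernel_size(3, d) for d in (1, 2, 5)}

# With dilation 5, each output mixes samples that are 5 positions apart.
wide = dilated_conv1d(list(range(12)), [1, 1, 1], dilation=5)
```

Under this assumption the three branches cover 3, 5, and 11 samples per axis with the same number of weights, which is one way to read "multi-scale" here.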
The loss function is as follows: $$L(T,A,L_{T},L_{A})=L_{dif}(T,F_{\theta}(A))+L_{S}(L_{T},L_{Seg})$$
$$$L_{dif}$$$ is the mean squared error (MSE) between T and $$$F_{\theta}(A)$$$, and $$$L_{S}$$$ represents the Dice metric (DM) between the Target Labels $$$L_{T}$$$ and $$$L_{Seg}$$$.
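A minimal sketch of this two-term loss on flattened images is given below. It assumes the Dice term enters as 1 − Dice so that a lower value is better, and it adds a small smoothing constant `eps`; both choices are assumptions, since the abstract does not specify them. All function names are hypothetical.

```python
def mse(target, warped):
    """Mean squared error between two flattened images of equal length."""
    return sum((t - w) ** 2 for t, w in zip(target, warped)) / len(target)


def soft_dice(label_t, label_seg, eps=1e-6):
    """Dice overlap between two flattened (soft or binary) label maps."""
    inter = sum(a * b for a, b in zip(label_t, label_seg))
    return (2.0 * inter + eps) / (sum(label_t) + sum(label_seg) + eps)


def total_loss(target, warped_atlas, label_t, label_seg):
    """L = L_dif(T, F_theta(A)) + L_S(L_T, L_Seg).

    Assumption: L_S is taken as 1 - Dice so that minimizing L
    increases the overlap between L_T and L_Seg.
    """
    return mse(target, warped_atlas) + (1.0 - soft_dice(label_t, label_seg))


# Toy check: a perfect registration and segmentation gives zero loss.
perfect = total_loss([0.0, 1.0], [0.0, 1.0], [0, 1], [0, 1])
partial = total_loss([0.0, 1.0], [0.0, 0.0], [1, 1], [1, 0])
```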
A total of 91 short-axis CCMR datasets are randomly selected for this experiment, of which 20 are used as the atlas set for registration. Typical scan parameters are: field of view (FOV) 360×360 mm, slice thickness 6 to 8 mm, inter-slice gap 2 to 4 mm, and image matrix size 256×256. Of the remaining datasets, 42 are randomly selected as the training set, 14 as the validation set, and 15 as the testing set.
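A reproducible version of such a random split could look like the sketch below. The seed, the integer case identifiers, and the helper name `split_cases` are hypothetical; only the subset sizes come from the text.

```python
import random


def split_cases(case_ids, sizes, seed=0):
    """Shuffle case IDs once and cut them into disjoint named subsets."""
    ids = list(case_ids)
    if sum(sizes.values()) != len(ids):
        raise ValueError("subset sizes must add up to the number of cases")
    random.Random(seed).shuffle(ids)  # fixed seed makes the split repeatable
    subsets, start = {}, 0
    for name, n in sizes.items():
        subsets[name] = ids[start:start + n]
        start += n
    return subsets


# Subset sizes as reported: 20 atlas + 42 training + 14 validation + 15 testing.
SIZES = {"atlas": 20, "train": 42, "val": 14, "test": 15}
splits = split_cases(range(91), SIZES)
```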

Results

DM and Hausdorff Distance (HD) are used for quantitative analysis, where DM is defined as: $$DM(s_{n},g_{n})=\frac{2|s_{n}\cap g_{n}|}{|s_{n}|+|g_{n}|}$$
where $$$s_{n}$$$ and $$$g_{n}$$$ represent the predicted result and the gold standard, respectively. HD is defined as: $$HD(A,B)=\max\left(\max_{a\in A}\min_{b\in B}d(a,b),\ \max_{b\in B}\min_{a\in A}d(a,b)\right)$$
where A and B are the sets of points on the predicted contour and the ground-truth contour, respectively, and d(a,b) is the Euclidean distance between a and b. Correlation and consistency analyses are performed on ejection fraction (EF), stroke volume (SV), end diastolic volume (EDV), and end systolic volume (ESV).
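The two metrics can be written directly from their definitions, as in the sketch below. It represents masks and contours as collections of pixel coordinates and returns HD in pixel units; converting to millimetres would require the pixel spacing, which is an assumption left outside the code. Function names are hypothetical.

```python
import math


def dice(pred, gold):
    """Dice metric between two masks given as sets of pixel coordinates."""
    if not pred and not gold:
        return 1.0  # convention chosen here for two empty masks
    return 2.0 * len(pred & gold) / (len(pred) + len(gold))


def hausdorff(contour_a, contour_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(p, q):
        # Largest distance from a point of p to its nearest point of q.
        return max(min(math.dist(x, y) for y in q) for x in p)
    return max(directed(contour_a, contour_b), directed(contour_b, contour_a))


# Toy masks: two 2x2 squares that overlap in one column.
pred = {(0, 0), (0, 1), (1, 0), (1, 1)}
gold = {(0, 1), (1, 1), (0, 2), (1, 2)}
dm = dice(pred, gold)

# Toy contours on a line.
hd = hausdorff([(0, 0), (0, 1)], [(0, 0), (0, 4)])
```

This brute-force HD is quadratic in the number of contour points, which is acceptable for a single 256×256 slice but would be slow for large point sets.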
The results on different slices for four typical patients are shown in Figure 3(a). On the 15 testing datasets, the average DM is 0.942 at end diastole (ED) and 0.921 at end systole (ES), and the average HD is 2.835 mm at ED and 3.348 mm at ES. The correlation and Bland-Altman plots of EDV, ESV, EF, and SV for the testing set are shown in Figure 4. The correlation coefficients of these four parameters are 0.990, 0.980, 0.990, and 0.995, respectively. Except for one or two outliers, all measured values lie within the limits of agreement, indicating that the proposed method provides acceptable and consistent results.
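The functional indices and agreement statistics can be computed as in the sketch below. It uses the standard definitions SV = EDV − ESV and EF = SV / EDV, Pearson correlation, and Bland-Altman limits of agreement at bias ± 1.96 SD; whether the study used exactly these conventions is an assumption. The numbers in the example are made up for illustration and are not results from this work.

```python
import math
import statistics


def stroke_volume(edv, esv):
    """SV in the same volume unit as the inputs."""
    return edv - esv


def ejection_fraction(edv, esv):
    """EF as a percentage of EDV."""
    return 100.0 * (edv - esv) / edv


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def bland_altman(auto, manual):
    """Return (bias, lower limit, upper limit) of agreement."""
    diffs = [a - m for a, m in zip(auto, manual)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up volumes (mL) for illustration only.
auto_edv = [120.0, 150.0, 98.0, 135.0]
manual_edv = [118.0, 153.0, 100.0, 131.0]
r = pearson_r(auto_edv, manual_edv)
bias, lower, upper = bland_altman(auto_edv, manual_edv)
```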

Discussion

DMTSU-net is compared with classical segmentation methods including DMU-net³, U-net++⁴, U-net⁵, FCN⁶, SegNet⁷, PSPNet⁸, and Multi-atlas⁹. The results, shown in Figure 3(b) and Table 1, indicate that the proposed DMTSU-net achieves the best segmentation results in the middle and bottom slices.

Conclusion

The proposed deep atlas network integrates atlas prior knowledge into the deep learning network, which can accelerate training and improve segmentation accuracy. The experimental results indicate that DMTSU-net has the potential for effective RV segmentation in CCMR images.

Acknowledgements

No acknowledgement found.

References

1. Petitjean C, Zuluaga MA, Bai W, et al. Right ventricle segmentation from cardiac MRI: A collation study. Med Image Anal. 2015;19(1):187–202.

2. Jaderberg M, Simonyan K, Zisserman A. Spatial transformer networks. Proc Adv Neural Inf Process Syst. 2015;2017–2025.

3. Liu P, Yumin Z, Lijia W. Automatic segmentation of right ventricle in CMR image based on Dense and Multi-scale U-net network. Chinese Journal of Magnetic Resonance. 2020;37(04):456-468.

4. Zhou Z W, Siddiquee M M R, Tajbakhsh N, et al. UNet++: A nested U-Net architecture for medical image segmentation. Computer Vision and Pattern Recognition. 2018;11045:3-11.

5. Ronneberger O, Fischer P, Brox T. U-Net: Convolutional networks for biomedical image segmentation. Computer Vision and Pattern Recognition. 2015. arXiv:1505.04597.

6. Tran P V. A fully convolutional neural network for cardiac segmentation in short-axis MRI. Computer Vision and Pattern Recognition. 2016. arXiv:1604.00494.

7. Badrinarayanan V, Kendall A, Cipolla R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE T Pattern Anal. 2017;39(12):2481-2495.

8. Zhao H S, Shi J P, Qi X J, et al. Pyramid scene parsing network. Computer Vision and Pattern Recognition. 2017;6230-6239.

9. Wang L J, Su X Y, Li Y, et al. Segmentation of right ventricle in cardiac cine MRI using COLLATE fusion-based Multi-Atlas. Chinese Journal of Magnetic Resonance. 2018;35(4):407-416.

Figures

Figure 1: The architecture of the proposed deep atlas network

Figure 2: The architecture of the Deformable Multi-scale Two-Stage U-net

Figure 3: (a) Examples of results for four patients on different slices at ES; (b) results of the same patient using different methods on different slices at ES.

Figure 4: Correlation (a) and Bland-Altman (b) plots of EDV, ESV, EF, and SV for 15 patients.

Table 1: Dice and HD values acquired using the proposed and other segmentation methods

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)

DOI: https://doi.org/10.58530/2024/2122