0833

A Cascaded Residual UNET for Fully Automated Segmentation of the Prostate and Peripheral Zone in T2-weighted 3D Fast Spin Echo Images
Lavanya Umapathy1, Wyatt Unger2, Faryal Shareef2, Hina Arif2, Diego Martin2, Maria Altbach2, and Ali Bilgin1,3

1Electrical and Computer Engineering, University of Arizona, Tucson, AZ, United States, 2Department of Medical Imaging, University of Arizona, Tucson, AZ, United States, 3Biomedical Engineering, University of Arizona, Tucson, AZ, United States

Synopsis

Multi-parametric MR images have been shown to be effective in the non-invasive diagnosis of prostate cancer. Automated segmentation of the prostate eliminates the need for time-consuming manual annotation by a radiologist, improving the efficiency of extracting imaging features for the characterization of prostate tissues. In this work, we propose a fully automated cascaded deep learning architecture with residual blocks (Cascaded MRes-UNET) that segments the prostate gland and the peripheral zone in one pass through the network. The network yields high Dice scores (mean = 0.91) against manual annotations from an experienced radiologist. The average difference in volume estimation is around 6% for the prostate and 3% for the peripheral zone.

Motivation

Prostate cancer is one of the leading causes of cancer death among American men1. Multi-parametric MR imaging (MP-MRI)2 has been shown to be effective in the non-invasive diagnosis and staging of clinically significant prostate cancer. Along with anatomical information from T2-weighted imaging, MP-MRI can utilize quantitative parameter maps such as T2, Apparent Diffusion Coefficient (ADC), and T1. Figure 1 shows representative images for a subject from the MP-MRI protocol used in our work. Features obtained from these quantitative maps contain valuable information for the characterization of prostate tissue. The first step in automated MP-MRI processing is segmentation of the prostate, which eliminates the need for time-consuming manual annotation. Deep learning networks, specifically Convolutional Neural Networks (CNNs), have been used in a wide variety of medical image segmentation tasks, including prostate segmentation3,4. In this work, we propose a fully automated cascaded deep learning network architecture with residual blocks, Cascaded MRes-UNET, for segmentation of the prostate gland and the peripheral zone (PZ).

Methods

Figure 2 shows the architecture of the proposed Modified Residual UNET (MRes-UNET). This architecture is a modified version of UNET5 with residual blocks within the analysis and synthesis paths. Furthermore, instead of the feature concatenations used in UNET, the proposed architecture uses feature addition. Finally, the residual blocks6,7 use 1x1 convolutions along the identity paths. Figure 3 shows the proposed fully automated cascaded architecture, which consists of two sequential MRes-UNETs. Given an input image, the first MRes-UNET predicts a mask for the prostate gland. The detected prostate region is extracted from the image and used as input to the second network, which predicts the PZ within the prostate.
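The residual block described above can be sketched in Keras as follows. This is a minimal illustration, not the authors' implementation: the filter counts, kernel sizes, and ReLU activations are our assumptions, while the 1x1 convolution on the identity path and the feature addition follow the description in the text.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters):
    """Residual block with a 1x1 convolution along the identity path.

    The two-branch output is combined by feature addition, as in the
    MRes-UNET description (kernel sizes/activations are illustrative).
    """
    # 1x1 convolution on the identity (shortcut) path
    shortcut = layers.Conv2D(filters, 1, padding="same")(x)
    # Main path: two 3x3 convolutions
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    # Feature addition instead of concatenation
    return layers.Activation("relu")(layers.Add()([shortcut, y]))
```

Blocks like this would replace the plain convolutional blocks along the analysis and synthesis paths of a standard UNET.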

3D T2-weighted Fast Spin Echo (FSE) images with 1 mm isotropic resolution (matrix size: 256x256) were acquired at 3T (Siemens Skyra) on a total of 73 patients screened for prostate biopsy. 65 subjects were used for training the cascaded network and 3 for validation; 5 subjects were held out as test subjects to assess the generalization ability of the proposed technique. The prostate gland and PZ were annotated on axial T2-weighted images by experienced radiologists. Pre-processing involved cropping the images to 192x192 to reduce computational burden and normalizing each subject’s data to zero mean and unit standard deviation. Data augmentation was performed using a combination of random rotations (-10°, 10°), in-plane translations, and horizontal flips, resulting in a four-fold increase to roughly 17,000 training images. The networks were trained in Keras8 with a TensorFlow9 backend using the following parameters: weights = random initialization, loss = categorical cross-entropy, learning rate = 0.0005, batch size = 5, and epochs = 30. The performance of the proposed technique was evaluated using Dice similarity coefficients, average volume difference, precision, and recall.
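The pre-processing steps above (cropping to 192x192 and per-subject normalization) can be sketched as below. The abstract does not state how the crop is positioned; a center crop is our assumption for illustration.

```python
import numpy as np

def preprocess(volume, crop=192):
    """Center-crop each slice to crop x crop and normalize the subject's
    volume to zero mean and unit standard deviation.

    volume: array of shape (slices, H, W); the crop location is an
    illustrative assumption (center crop).
    """
    h, w = volume.shape[-2:]
    y0, x0 = (h - crop) // 2, (w - crop) // 2
    v = volume[..., y0:y0 + crop, x0:x0 + crop].astype(np.float32)
    # Per-subject normalization: zero mean, unit standard deviation
    return (v - v.mean()) / (v.std() + 1e-8)
```

Normalizing per subject (rather than per slice) keeps relative intensity differences between slices intact, which matters when the same tissue spans many slices.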

Results and Discussion

The training and validation loss curves for the two MRes-UNETs are shown in Figures 4A and 4B, respectively. It can be seen that data augmentation helps avoid overfitting. The table in Figure 4C shows the evaluation metrics for prostate and peripheral zone segmentation on the 5 test subjects. The average Dice scores across all test subjects show consistent performance, with high average Dice values (~0.91) and low standard deviation. The average volume difference (AVD) between the ground truth and the predicted masks is around 6% in the prostate and 3% in the peripheral zone. Figure 5 shows T2-weighted FSE images from two subjects with manual annotations for the prostate and PZ (top row) as well as predictions from the cascaded network (bottom row). End-to-end prediction takes approximately 32 seconds per subject.
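For reference, the Dice coefficient and average volume difference used above can be computed from binary masks as follows. This is a generic sketch of the standard definitions, not the authors' evaluation code.

```python
import numpy as np

def dice_coefficient(gt, pred):
    """Dice similarity coefficient between two binary masks:
    2 * |GT intersect Pred| / (|GT| + |Pred|)."""
    inter = np.logical_and(gt, pred).sum()
    return 2.0 * inter / (gt.sum() + pred.sum())

def average_volume_difference(gt, pred):
    """Absolute volume difference relative to the ground-truth
    volume, expressed in percent."""
    return 100.0 * abs(int(pred.sum()) - int(gt.sum())) / gt.sum()
```

With 1 mm isotropic voxels, the mask sums translate directly into volumes in mm^3, so the AVD here is the percent error in estimated prostate (or PZ) volume.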

Conclusion

A cascaded, fully automated MRes-UNET architecture was proposed for segmentation of the prostate and peripheral zone from T2-weighted FSE images in the MP-MRI protocol. We observed high average Dice scores (~0.91) on 5 test subjects. Prostate volume has been shown to be related to prostate cancer detection rates10,11, so accurate estimation of prostate volume is crucial; in our work, we observed less than 6% error in the estimation of prostate volume. The segmentation masks obtained from the proposed approach can be applied to quantitative maps co-registered to the T2-weighted images to extract features of interest for the diagnosis and staging of prostate cancer.

Acknowledgements

The authors would like to acknowledge support from the Technology and Research Initiative Fund (TRIF) Improving Health Initiative.

References

1. Siegel R, Naishadham D, Jemal A. Cancer statistics, 2012. CA Cancer J Clin. 2012;62(1):10–29.

2. Kozlowski P, Chang SD, Jones EC, Berean KW, Chen H, Goldenberg SL. Combined diffusion-weighted and dynamic contrast-enhanced MRI for prostate cancer diagnosis: correlation with biopsy and histopathology. J Magn Reson Imaging. 2006;24(1).

3. MICCAI Grand Challenge: Prostate MR Image Segmentation. 2012.

4. Tian Z, Liu L, Fei B. Deep convolutional neural network for prostate MR segmentation. Int J Comput Assist Radiol Surg. 2018;13(11).

5. Ronneberger O, et al. U-Net: Convolutional Networks for Biomedical Image Segmentation. MICCAI, Springer LNCS. 2015.

6. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. 2015. arXiv:1512.03385v1.

7. Guerrero R, et al. White matter hyperintensity and stroke lesion segmentation and differentiation using convolutional neural networks. NeuroImage: Clinical. 2018;17.

8. Chollet F, et al. Keras. 2015. https://keras.io

9. Abadi M, et al. TensorFlow: Large-scale machine learning on heterogeneous systems. 2015. Software available from tensorflow.org

10. Garvey B, et al. Clinical value of prostate segmentation and volume determination on MRI in benign prostatic hyperplasia. Diagn Interv Radiol. 2014;20(3).

11. Doluoglu OG, et al. The importance of prostate volume in prostate needle biopsy. Turkish Journal of Urology. 2013; 39(2)

Figures

Figure 1: A T2 weighted (T2w) image of the prostate from a subject used in this work, along with the corresponding quantitative maps of T2, Apparent Diffusion Coefficient (ADC) and T1 (left to right) are shown. The parameter maps are registered to the T2w image. Segmentation of the prostate and peripheral zone on the high-resolution anatomical image can provide masks that could be used to extract relevant parameter information for prostate tissue characterization.

Figure 2: A) An illustration of the MRes-UNET, a modified UNET architecture where convolutional blocks are replaced by residual blocks, as shown above. The identity path in the residual block is replaced by 1x1 convolutions. The synthesis path in MRes-UNET consists of feature addition instead of concatenation. A categorical cross-entropy loss function is used for training.

Figure 3: The cascaded MRes-UNET architecture is shown here. The first MRes-UNET (A) takes 2D images of the prostate as input and segments the prostate gland. The second MRes-UNET (B) uses the predicted prostate mask to extract the prostate region and predicts the mask for the peripheral zone. End-to-end prediction for a subject with 192 slices (256x256 matrix size) takes 32 seconds.

Figure 4: The evolution of the categorical cross-entropy loss on training (A) and validation (B) images is shown. MRes-UNET was trained for 30 epochs with a learning rate of 0.0005 and the Adam optimizer. The evaluation metrics for prostate and peripheral zone segmentation on the 5 test subjects are shown in (C). The Dice scores across all test subjects show consistent performance, with high average Dice values (~0.91) and low average volume difference.

Figure 5: Representative T2-weighted images from two subjects (shown in A and B, respectively) with ground truth prostate mask (purple) and peripheral zone mask (orange) overlaid in the first row. The second row shows the predicted prostate masks obtained using the proposed cascaded MRes-UNET.

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)