1619

Developments of Unet, Unet plus Conditional Random Field Insert, and Bayesian Vnet CNNs for Zonal Prostate Segmentation
Peng Cao1, Susan Noworolski1, Sage Kramer1, Valentina Pedoia1, Antonio Westphalen1, and Peder Larson1

1Department of Radiology, University of California at San Francisco, San Francisco, CA, United States

Synopsis

We studied 2D and 3D fully convolutional neural networks for zonal prostate segmentation from T2-weighted MRI data. We also introduced a new methodology that combines a Unet with conditional random field inserts (CRFIs) to improve the accuracy and robustness of the segmentation.

Introduction

Approximately 70% of prostate cancers appear in the peripheral zone. Zonal prostate segmentation, i.e., contouring the peripheral and central zones, could therefore serve as the first processing step in a prostate tumor detection pipeline. Automatic segmentation provides an essential and efficient tool for accurate measurements, primarily because manual segmentation is impractical for massive 2D/3D datasets. Here, we developed and tested several fully convolutional networks, including a Unet 1, a Unet with simplified conditional random field inserts 2, and a Bayesian Vnet 3,4, to efficiently segment the prostate from multi-slice T2-weighted MRI data.

Methods

Unet: The first fully convolutional network was based on a Unet 1 structure (Fig. 1a) with the following parameters: kernel size = 8×8, three decomposition levels, six convolutional layers each for the first and second levels, no convolutional layers for the third level, 32 extracted features, pooling size = 2×2, ReLU activation, and “concatenate”-merging layers.

Unet+CRFI: As illustrated in Fig. 1b, three simplified conditional random field layers were added as inserts to the above Unet structure. The CRFI combines the input and the CNN layer output based on their spatial similarity, as governed by an exponential vector activation and a CNN-based weighted sum and bilateral filter.

Vnet: The Vnet decomposes image structural information into four levels 5. This network was designed to process volumetric MRI and can exploit the shared information in adjacent slices. We also implemented a Bayesian Vnet based on the dropout strategy in reference 3.

Training and testing: As an initial experiment, 29 MRI cases were used for training and 6 cases for testing. For all networks, the loss function was defined as the joint negative logarithm of the Dice coefficient and a weighted cross-entropy loss. Data augmentation included shift, rotation, zoom-in, scaling, and added noise. All networks were implemented in TensorFlow (https://www.tensorflow.org/).
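The joint loss described above (negative log of the soft Dice coefficient plus a weighted cross-entropy) can be sketched in a few lines. This is a minimal NumPy illustration, not the authors' TensorFlow implementation; the foreground weight `pos_weight` and the smoothing constant `eps` are assumed values.

```python
import numpy as np

def soft_dice(pred, target, eps=1e-7):
    """Soft Dice coefficient between a probability map and a binary mask."""
    intersection = np.sum(pred * target)
    return (2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps)

def weighted_cross_entropy(pred, target, pos_weight=2.0, eps=1e-7):
    """Weighted binary cross entropy; pos_weight up-weights foreground voxels."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return -np.mean(pos_weight * target * np.log(pred)
                    + (1.0 - target) * np.log(1.0 - pred))

def joint_loss(pred, target, pos_weight=2.0):
    """Joint loss: negative logarithm of soft Dice plus weighted cross entropy."""
    return -np.log(soft_dice(pred, target)) \
        + weighted_cross_entropy(pred, target, pos_weight)
```

Combining the two terms lets the region-overlap objective (Dice) handle class imbalance while the cross-entropy term keeps per-voxel gradients well behaved early in training.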

Results and discussion

Fig. 2 shows typical neural network segmentation results from 4 subjects. The networks were able to predict the correct zonal boundary in most cases, even when the human labels were inaccurate (Fig. 2, blue arrows). The Unet with CRF inserts (CRFI) also provided slightly better contour detection/interpolation than the plain Unet (Fig. 2, orange arrows). Fig. 3 shows the training and testing soft Dice coefficients at each epoch for the Unet with and without CRFI; the Unet with CRFI converged more rapidly in training. In Fig. 4, the Bayesian network allows one to measure the uncertainty of the estimation; the results from 3 subjects indicate that significant uncertainties appeared on the top surface of the central zone. In Table 1, the Unet+CRFI method achieved the highest Dice scores among these methods: 0.75, 0.82, and 0.89 for the peripheral zone, central zone, and whole gland, respectively. The 2D Unet performed slightly better than the 3D Vnet, likely because the human labeling was performed slice by slice and because of potential under-fitting in the 3D Vnet.
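The per-voxel uncertainty maps in Fig. 4 follow from the Monte Carlo dropout idea behind the Bayesian Vnet: run several stochastic forward passes with dropout active at test time, then take the per-voxel mean as the prediction and the standard deviation as the uncertainty. A generic sketch, with a hypothetical `noisy_net` standing in for a network with dropout enabled at inference:

```python
import numpy as np

def mc_dropout_uncertainty(stochastic_forward, x, n_samples=20):
    """Monte Carlo dropout: run several stochastic forward passes and return
    the per-voxel mean (prediction) and standard deviation (uncertainty)."""
    samples = np.stack([stochastic_forward(x) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.std(axis=0)

# Stand-in for a network with dropout active at test time (hypothetical):
rng = np.random.default_rng(0)
def noisy_net(x):
    # a fixed "probability map" perturbed by dropout-like noise
    return np.clip(x + rng.normal(scale=0.05, size=x.shape), 0.0, 1.0)

prob_map = np.array([[0.1, 0.9], [0.8, 0.2]])
mean, std = mc_dropout_uncertainty(noisy_net, prob_map, n_samples=50)
```

Voxels where the stochastic passes disagree (high `std`) flag regions, such as the ambiguous top surface of the central zone, where the segmentation should be trusted less.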

Conclusion

In summary, three fully convolutional neural networks for zonal prostate segmentation were presented. The Unet with CRFI converged more rapidly in training than the plain Unet and achieved the highest Dice scores. Testing these methods on a broader set of clinical data will be the subject of future work.

Acknowledgements

No acknowledgement found.

References

1. Ronneberger, O., Fischer, P. & Brox, T. U-net: Convolutional networks for biomedical image segmentation. Lect. Notes Comput. Sci. 9351, 234–241 (2015).

2. Teichmann, M. T. T. & Cipolla, R. Convolutional CRFs for Semantic Segmentation. (2018).

3. Zhao, G. et al. Bayesian convolutional neural network based MRI brain extraction on nonhuman primates. Neuroimage 175, 32–44 (2018).

4. Functional Imaging and Modelling of the Heart. Lect. Notes Comput. Sci. 10263 (2017).

5. Milletari, F., Navab, N. & Ahmadi, S. A. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. Proc. - 2016 4th Int. Conf. 3D Vision, 3DV 2016 565–571 (2016). doi:10.1109/3DV.2016.79

Figures

Figure 1 (a) Illustration of a 2D fully convolutional neural network based on a Unet structure that decomposes image information into three compression levels. The kernel size was 8×8, and 32 features were extracted in each convolutional layer. (b) A modified version of the Unet with CRF inserts (CRFIs), which were applied to the encoder side of the Unet. (c) Illustration of the connection between a CNN layer (gray) and the recurrent CRFI (orange). The CRFI combines the input and the CNN layer (gray) output based on their spatial similarity, as governed by the exponential vector activation and the CNN-based weighted sum and bilateral filter (orange).
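The spatial-similarity weighting inside the CRFI is closely related to bilateral filtering in dense CRFs: label evidence is shared between pixels that are both spatially close and similar in intensity. A minimal NumPy sketch of one such message-passing step (an illustration of the principle, not the authors' exact layer; the kernel widths `theta_alpha` and `theta_beta` are assumed):

```python
import numpy as np

def bilateral_refine(logits, image, theta_alpha=3.0, theta_beta=0.1):
    """One bilateral message-passing step: each pixel's score is re-weighted
    by an exponential kernel of spatial distance and intensity difference, so
    pixels that are close and similar in intensity share label evidence."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    feat = image.ravel()
    # dense pairwise exponential similarity (fine for tiny images)
    d_pos = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    d_feat = (feat[:, None] - feat[None, :]) ** 2
    k = np.exp(-d_pos / (2 * theta_alpha ** 2) - d_feat / (2 * theta_beta ** 2))
    k /= k.sum(axis=1, keepdims=True)      # normalize each pixel's weights
    return (k @ logits.ravel()).reshape(h, w)
```

A pixel whose score disagrees with its intensity-similar neighborhood is pulled toward that neighborhood's consensus, which is the behavior that sharpens zone contours in the Unet+CRFI.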

Figure 2 Representative neural network segmentation results from 4 subjects. T2-weighted MR images (T2WIs) were input to the networks. The network-predicted and human-labeled peripheral and central zones were contoured and overlaid on the MR images. Note that the networks were able to predict the correct zonal boundary in most cases, even when the human labels were inaccurate (blue arrows). The Unet with CRF inserts (CRFI) also provided slightly better contour detection/interpolation than the plain Unet (orange arrows).

Figure 3 The training and testing soft Dice coefficients at each epoch for the Unet with and without CRF inserts (CRFI). The Unet with CRFI converged more rapidly in training, indicating that the CRFI may help focus the attention of the Unet on local spatial features.

Figure 4 Uncertainty estimates from the 3D Bayesian Vnet. The Bayesian network allows one to measure the uncertainty of the estimation. The results from 3 subjects indicate that the major segmentation uncertainty appeared on the top surface of the central zone.

Table 1

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)