4732

Fully Automatic Learning-based Multi-Organ Segmentation (ALMO) in Abdominal MRI for Radiotherapy Planning using Deep Neural Networks

Yuhua Chen¹,², Yujin Xie³, Lixia Wang¹,⁴, Jiayu Xiao¹, Zixin Deng¹, Yi Lao⁵, Richard Tuli⁵, Debiao Li¹, Wensha Yang⁵, and Zhaoyang Fan¹

¹Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA, United States, ²Biomedical Engineering, UCLA, Los Angeles, CA, United States, ³Beihang University, Beijing, China, ⁴Beijing Chaoyang Hospital, Capital Medical University, Beijing, China, ⁵Department of Radiation Oncology, Cedars-Sinai Medical Center, Los Angeles, CA, United States

Synopsis

Precise dose measurement is critical in radiotherapy planning and requires accurate, fast segmentation of organs to estimate the regions at risk. Magnetic Resonance Imaging (MRI) is gaining favor over CT in radiotherapy, but multi-organ segmentation on MRI is still a new task. In this work, we propose a fast, accurate, and fully automatic technique (ALMO) that relieves the intensive human labor of manual segmentation in a timely fashion. On our 51-subject dataset, the proposed method achieves an average Dice score of 0.76 on the test set, with segmentation completed in seconds.

Introduction

Magnetic Resonance Imaging (MRI) has recently gained strong interest as an alternative to CT for radiotherapy planning, primarily because its superior soft-tissue contrast enables accurate delineation of lesions and surrounding normal tissues and organs. Precise treatment requires that the target volume and organs at risk be accurately outlined. This task, however, is typically performed manually by radiotherapists and is therefore time-consuming and susceptible to interobserver variation and errors. Computerized segmentation has been extensively investigated on CT images [1] but rarely on MR images. In this work, we propose ALMO, a deep learning-based system for multi-organ segmentation on MR images.

Methods

This work uses a deep convolutional neural network to perform the segmentation task. Our network is based on the U-net [2] design: it takes the MR image voxels as input and outputs a segmentation mask for each organ. We acquired T1 VIBE (Volumetric Interpolated Breath-hold Examination) images from 51 subjects; in-plane spatial resolution varied from 1.1 mm to 1.3 mm, and slice thickness was 3 mm. Segmentation labels for nine dose-sensitive organs (liver, pancreas, right kidney, left kidney, stomach, duodenum, small intestine, spinal cord, and spine) were manually drawn by two radiologists and used as the ground truth for training and testing. The 51 subjects were randomly split into a 41-subject training group and a 10-subject testing group. As preprocessing, we interpolated all images and labels to 1.2 mm isotropic resolution. During training, image patches of 256x160x20 voxels, spanning 20 consecutive slices, were randomly cropped from the training dataset and fed into the network. Random flipping was applied to further augment the dataset.
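The patch sampling and flipping described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names, the synthetic volume, and the choice of flip axes are hypothetical, since the abstract does not state along which axes the flips were applied.

```python
import numpy as np

PATCH_SHAPE = (256, 160, 20)  # in-plane x, in-plane y, slices (from the text)


def random_patch(image, label, rng, patch_shape=PATCH_SHAPE):
    """Crop the same randomly placed patch from an image and its label volume."""
    if image.shape != label.shape:
        raise ValueError("image and label must have the same shape")
    starts = []
    for size, patch in zip(image.shape, patch_shape):
        if size < patch:
            raise ValueError("volume is smaller than the requested patch")
        # integers() excludes the upper bound, so +1 keeps the last valid start
        starts.append(int(rng.integers(0, size - patch + 1)))
    window = tuple(slice(s, s + p) for s, p in zip(starts, patch_shape))
    return image[window], label[window]


def random_flip(image, label, rng, axes=(0, 1)):
    """Flip image and label together along each axis with probability 0.5.

    The axes are an assumption; the abstract only says "random flipping".
    """
    for axis in axes:
        if rng.random() < 0.5:
            image = np.flip(image, axis=axis)
            label = np.flip(label, axis=axis)
    return image, label


rng = np.random.default_rng(0)
# Synthetic stand-ins for one resampled scan and its label map (0 = background).
volume = rng.normal(size=(300, 200, 60)).astype(np.float32)
labels = rng.integers(0, 10, size=volume.shape)

patch_img, patch_lbl = random_patch(volume, labels, rng)
patch_img, patch_lbl = random_flip(patch_img, patch_lbl, rng)
print(patch_img.shape, patch_lbl.shape)  # (256, 160, 20) (256, 160, 20)
```

Cropping and flipping the image and the label with the same window and the same flip decisions keeps each voxel paired with its label.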

We implemented two configurations of the U-net structure in TensorFlow [3]: one uses stacked convolutional layer blocks (plain-Unet), the most popular network design for segmentation, and the other uses densely connected blocks (dense-Unet [4]), which have been reported to be more computationally efficient and less prone to overfitting the training data. Network details are shown in Figure 1. The multi-class cross-entropy between the label and the network prediction is used as the cost function to be optimized.
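For illustration, the multi-class cross-entropy cost can be written as below. This is a NumPy sketch rather than the TensorFlow code used in the study; the function names, the averaging over voxels, and the small `eps` stabilizer are assumptions.

```python
import numpy as np

NUM_CLASSES = 10  # background + 9 organs


def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def multiclass_cross_entropy(logits, labels, eps=1e-7):
    """Mean voxel-wise cross-entropy.

    logits: float array of shape (..., NUM_CLASSES), raw network outputs.
    labels: integer array of shape (...), values in [0, NUM_CLASSES).
    """
    probs = softmax(logits)
    # Select the probability assigned to the true class at every voxel.
    true_prob = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    return float(-np.log(true_prob + eps).mean())


# A network that is equally unsure about all 10 classes scores about ln(10).
logits = np.zeros((4, 4, 2, NUM_CLASSES))
labels = np.random.default_rng(0).integers(0, NUM_CLASSES, size=(4, 4, 2))
print(round(multiclass_cross_entropy(logits, labels), 4))  # 2.3026
```

The loss approaches zero as the predicted probability of the true class approaches one at every voxel.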

In training, we used the Adam [5] optimizer with an initial learning rate of 1e-4, decayed by a factor of 0.9 every 50k iterations. Models were trained for 300k iterations with a batch size of 1. Training and testing were performed on a workstation with Intel Xeon processors and Nvidia GTX 1080 Ti GPUs. The model checkpoint with the best validation loss during training was used for performance assessment. We used the Dice [6] coefficient and the Jaccard [7] index between the predicted and manually labeled segmentations as the quality metrics.
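The two quality metrics can be computed per organ from integer label maps, as in the following sketch. The function name and the handling of organs absent from both maps (skipped here) are assumptions; the abstract does not describe how such cases were treated.

```python
import numpy as np


def dice_and_jaccard(pred, truth, num_classes=10):
    """Per-organ (Dice, Jaccard) for integer label maps; class 0 is background."""
    scores = {}
    for organ in range(1, num_classes):
        p = pred == organ
        t = truth == organ
        intersection = int(np.logical_and(p, t).sum())
        total = int(p.sum()) + int(t.sum())
        if total == 0:
            # Organ absent from both maps: the metrics are undefined, so skip it.
            continue
        union = total - intersection
        dice = 2.0 * intersection / total
        jaccard = intersection / union
        scores[organ] = (dice, jaccard)
    return scores


# Toy 2D example: the manual label covers 4 pixels, the prediction covers 6,
# and the two overlap in 4 pixels.
truth = np.zeros((4, 4), dtype=int)
truth[0:2, 0:2] = 1
pred = np.zeros((4, 4), dtype=int)
pred[0:2, 0:3] = 1

dice, jaccard = dice_and_jaccard(pred, truth)[1]
print(f"{dice:.3f} {jaccard:.3f}")  # 0.800 0.667
```

The two metrics are monotonically related, J = D / (2 - D), so they rank segmentations identically while differing in scale.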

Results

As shown in Table 1, the plain-Unet achieves an average Dice score of 0.76 and the dense-Unet achieves 0.74. Despite this small performance difference, the dense-Unet runs 4 times faster and requires less memory, which makes it preferable in practical use. Sample cases are shown in Figure 2. Although the network was trained on human labels, the proposed method provides more consistent segmentation outputs than the manual labels, as shown in Figure 3.

Discussion and Conclusions

In this work, we proposed ALMO, a fully automatic segmentation technique based on a deep neural network. Computerized segmentation can relieve the intensive human labor of labeling and provide highly reproducible results. Deep learning-based methods may fulfill this goal while substantially reducing the computation time. Our technique produced high-quality results even though it was trained on only a small dataset. The newer dense-Unet structure required far fewer computational resources, making deployment on low-end computers possible. Integrating MR scanning into the radiotherapy planning workflow is currently challenging because manual segmentation is time-consuming. This could be overcome with the proposed deep learning technique, which can accomplish the task within 6 seconds. Adaptive planning may also become possible given the fast and reliable segmentation results from such a method. In the future, we will further generalize the network through transfer learning on a larger dataset and improve the network structure to make it perform better and faster.

Acknowledgements

No acknowledgement found.

References

1. Gibson, Eli, et al. "Automatic multi-organ segmentation on abdominal CT with dense v-networks." IEEE Transactions on Medical Imaging (2018).

2. Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-net: Convolutional networks for biomedical image segmentation." International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, 2015.

3. Abadi, Martín, et al. "Tensorflow: a system for large-scale machine learning." OSDI. Vol. 16. 2016.

4. Huang, Gao, et al. "Densely connected convolutional networks." CVPR. Vol. 1. No. 2. 2017.

5. Kingma, Diederik P., and Jimmy Ba. "Adam: A method for stochastic optimization." arXiv preprint arXiv:1412.6980 (2014).

6. Dice, Lee R. "Measures of the amount of ecologic association between species." Ecology 26.3 (1945): 297-302.

7. Hamers, Lieve. "Similarity measures in scientometric research: The Jaccard index versus Salton's cosine formula." Information Processing and Management 25.3 (1989): 315-18.

Figures

Figure 1. Network structure: (a) the U-net, in which the convolutional blocks are either (b) densely connected blocks or (c) plain connected blocks. The kernel size is 3x3. Batch normalization is applied before the Exponential Linear Unit (ELU) activation. 2x2 average pooling is used in the Transition Down, and a transposed-convolutional layer with a 2x2 kernel and stride is applied in the Transition Up. In the plain-Unet setting, the filter number starts at 64 and doubles after each pooling; in the dense-Unet, the growth rate is K=16. A final layer with a 1x1 kernel and softmax activation outputs the predictions for 10 classes (background + 9 organs).

Figure 2. Segmentation results on a random test case: liver (red), pancreas (green), right kidney (blue), left kidney (yellow), stomach (cyan), duodenum (purple), small intestine (white), spinal cord (light brown), and spine (dark brown).

Figure 3. Sagittal view of the manual label and the automatic segmentation output. It is very difficult for a human labeler to maintain consistency along the through-plane direction, because segmentations are usually drawn in-plane.

Table 1. Performance comparison between the two U-net configurations on the test set (n=10). The mean metric scores over all subjects are listed per organ, together with the average scores over all organs. Although the plain-Unet performs slightly better, the dense-Unet runs much faster. Testing was performed on a single GTX 1080 Ti graphics card.

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)
