0468

Integrating Imaging Prior Knowledge in Deep Convolutional Network - A Novel Approach of Cranial Pseudo-CT Generation

Max W.K. Law¹, Gladys G. Lo², Yihang Zhou¹, Jing Yuan¹, Oilei O.L. Wong¹, and S.K. Yu¹

¹Medical Physics and Research, Hong Kong Sanatorium & Hospital, Hong Kong, ²Department of Diagnostic & Interventional Radiology, Hong Kong Sanatorium & Hospital, Hong Kong

Synopsis

A deep convolutional network for cranial pseudo-CT generation was developed that incorporates prior knowledge of the radiotherapeutic imaging workflow. Such prior knowledge has rarely been studied in combination with deep learning, yet it can greatly reduce the complexity of the image data handled by the network. Evaluated on 14 sets of DIXON-MR and CT images, the proposed model achieved a low generalization gap and offered accurate results regardless of the amount of training data. It achieved an average mean-absolute-difference of 89.77±29.32HU in two-fold cross-validation. The experiments show that the proposed method is well suited to generating clinical pseudo-CT for radiotherapeutic applications.

Introduction

MR-only planning is drawing increasing attention from the radiotherapy community because MR offers superior soft-tissue contrast and is free of ionizing radiation, as compared to conventional CT. Its success relies heavily on pseudo-CT generation - estimating radiation-attenuation information from MR. This study designed a deep convolutional neural network (CNN) that takes imaging prior knowledge into account to generate cranial pseudo-CT. Such prior knowledge has rarely been studied in combination with CNNs but could greatly reduce the complexity of the image data handled by deep networks.

The proposed method exhibited a low generalization gap. It reported correct CT numbers in cortical-bone and air-cavity regions, both of which produce weak MR signals. Based on 14 imagesets, multiple experiments were conducted using 2, 6 and 8 subjects for training, showing insignificant accuracy fluctuations with varying amounts of training data. It achieved an average mean-absolute-difference (MAD) of 89.77±29.32HU in two-fold cross-validation. The experiments show that the proposed method is well suited to generating pseudo-CT for radiotherapeutic applications.

Method - Data Acquisition and Preparation

Fourteen patients (age: 61.5±11.1, 7 female/7 male) received same-day CT and MR simulations for Cyberknife treatments. With external lasers and immobilization devices (Fig.1), their scanning positions were identical to the treatment positions. MR was rigidly registered to CT, and binary head masks were generated by thresholding the smoothed, registered MR (Fig.2). The CT and MR were subsequently normalized to maintain the data dynamic range, without altering the quantitative nature of CT.
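The mask-generation step can be illustrated with a minimal sketch. The abstract does not specify the smoothing filter or the threshold, so the moving-average filter, the `width` parameter and the `threshold` value below are assumptions chosen only for illustration, and `box_smooth` and `head_mask` are hypothetical helper names.

```python
import numpy as np

def box_smooth(volume, width=5):
    """Smooth a 3-D volume with a separable moving-average filter.

    The filter type and width are assumptions; the source only says
    the MR was smoothed before thresholding.
    """
    kernel = np.ones(width) / width
    smoothed = volume.astype(float)
    for axis in range(volume.ndim):
        smoothed = np.apply_along_axis(
            lambda line: np.convolve(line, kernel, mode="same"), axis, smoothed
        )
    return smoothed

def head_mask(mr, threshold, width=5):
    """Binary head mask: True where the smoothed MR exceeds the threshold."""
    return box_smooth(mr, width) > threshold

# Toy volume: a bright cube (the "head") inside a dark background.
mr = np.zeros((20, 20, 20))
mr[5:15, 5:15, 5:15] = 100.0
mask = head_mask(mr, threshold=50.0)  # hypothetical threshold value
```

Smoothing before thresholding suppresses isolated noisy voxels, so the mask tends to form one connected region rather than a speckled one.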

The data were divided into two groups of 6 and 8 imagesets to perform two-fold cross-validation. An additional test randomly selected only 2 imagesets for training and used the remaining 12 imagesets for testing. The variation in accuracy with different amounts of training data was inspected.
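The data partitioning described above can be sketched as follows. The abstract does not say how the groups of 6 and 8 were chosen, so the random shuffling and the `seed` argument are assumptions, and the function names are hypothetical.

```python
import random

def two_fold_split(subject_ids, first_fold_size=6, seed=0):
    """Split subjects into two groups; each group trains once and validates once."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)  # assumption: random grouping
    group_a = shuffled[:first_fold_size]
    group_b = shuffled[first_fold_size:]
    # Each tuple is (training subjects, validation subjects).
    return [(group_a, group_b), (group_b, group_a)]

def small_training_split(subject_ids, n_train=2, seed=0):
    """Randomly pick n_train subjects for training; test on the rest."""
    rng = random.Random(seed)
    train = rng.sample(list(subject_ids), n_train)
    test = [s for s in subject_ids if s not in train]
    return train, test

subjects = list(range(1, 15))  # 14 imagesets
folds = two_fold_split(subjects)
train2, test12 = small_training_split(subjects)
```

With 14 subjects this yields one run trained on 6 and validated on 8, one run trained on 8 and validated on 6, and the additional run trained on 2 and tested on 12.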

Prior to entering the network, the patient-left portions of the images were cropped, and the patient-right portions were cropped and flipped (Fig.3). As a result, the network handled only patient-left intensity patterns, which markedly lowered the data complexity. In addition, the flipped patient-right images served as additional training samples.
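A minimal sketch of the Crop-And-Flip operation and its inverse is given below. Fig.3 indicates that a 256x256x192 volume becomes two 256x136x192 volumes, so the two halves overlap. The choice of `lr_axis=1` as the left-right axis, which end of that axis is patient-left, and the averaging of the overlap when merging are all assumptions; the abstract does not state them.

```python
import numpy as np

def crop_and_flip(volume, half_width, lr_axis=1):
    """Return (left part, mirrored right part), each half_width wide along lr_axis."""
    n = volume.shape[lr_axis]
    left = np.take(volume, np.arange(0, half_width), axis=lr_axis)
    right = np.take(volume, np.arange(n - half_width, n), axis=lr_axis)
    return left, np.flip(right, axis=lr_axis)

def merge_halves(left, flipped_right, full_width, lr_axis=1):
    """Undo crop_and_flip; overlapping voxels are averaged (an assumption)."""
    half_width = left.shape[lr_axis]
    right = np.flip(flipped_right, axis=lr_axis)
    shape = list(left.shape)
    shape[lr_axis] = full_width
    total = np.zeros(shape)
    count = np.zeros(shape)
    left_idx = [slice(None)] * left.ndim
    right_idx = [slice(None)] * left.ndim
    left_idx[lr_axis] = slice(0, half_width)
    right_idx[lr_axis] = slice(full_width - half_width, full_width)
    total[tuple(left_idx)] += left
    count[tuple(left_idx)] += 1
    total[tuple(right_idx)] += right
    count[tuple(right_idx)] += 1
    return total / count  # requires 2 * half_width >= full_width

# Toy example: width 16 split into two overlapping parts of width 9.
volume = np.random.default_rng(0).normal(size=(4, 16, 3))
left, flipped_right = crop_and_flip(volume, half_width=9)
restored = merge_halves(left, flipped_right, full_width=16)
```

Both outputs have the same orientation, so a network trained on them only ever sees patient-left-like anatomy, and each subject contributes two samples.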

Method - Network Design

The proposed network was a simple fully convolutional network with channel counts progressively increasing towards the output (Fig.3). The first eight layers, with 3x3x3 kernels, captured local image patches to distinguish air-cavity from cortical-bone where weak MR signals were observed. The last five layers, with 1x1x1 kernels, increased the model non-linearity [3] to handle the mapping from MR inputs to CT outputs.
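To make the notion of a "local image patch" concrete, the following sketch computes the receptive field of the stacked layers. It assumes stride-1, undilated convolutions with no pooling, which the abstract does not state explicitly; `receptive_field` is a hypothetical helper.

```python
def receptive_field(kernel_sizes):
    """Per-axis receptive field of stacked stride-1, undilated convolutions."""
    field = 1
    for k in kernel_sizes:
        field += k - 1  # each layer widens the field by (kernel size - 1)
    return field

# Eight 3x3x3 layers followed by five 1x1x1 layers.
layers = [3] * 8 + [1] * 5
patch_width = receptive_field(layers)
print(patch_width)  # 17
```

Under these assumptions each output voxel depends on a 17x17x17 neighbourhood of the MR input, and the 1x1x1 layers add non-linearity without enlarging that neighbourhood.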

This model has fewer than 100 thousand free parameters, approximately 0.3% of a recently proposed method [6] and 0.5% of the total masked voxel count of one imageset. This naturally avoided overfitting, yielding a low generalization gap [5] even without regularization.
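The parameter budget can be checked with a short calculation. The channel counts below are hypothetical, since the actual values appear only in Fig.3; they were picked to increase towards the output, with 4 input channels (the four DIXON-MR images of Fig.2) and 1 output channel. The sketch also assumes one bias term per output channel.

```python
def conv3d_params(in_ch, out_ch, kernel):
    """Weights plus biases of one 3-D convolution with a cubic kernel."""
    return kernel ** 3 * in_ch * out_ch + out_ch

def count_params(channels, kernel):
    """Total parameters of consecutive layers defined by a channel sequence."""
    return sum(
        conv3d_params(i, o, kernel) for i, o in zip(channels, channels[1:])
    )

# Hypothetical channel counts, NOT taken from the abstract.
CHANNELS_3X3X3 = [4, 4, 6, 8, 10, 12, 14, 16, 20]  # eight 3x3x3 layers
CHANNELS_1X1X1 = [20, 32, 48, 64, 96, 1]           # five 1x1x1 layers

total_params = count_params(CHANNELS_3X3X3, 3) + count_params(CHANNELS_1X1X1, 1)
print(total_params < 100_000)
```

The calculation illustrates why such a design stays small: 1x1x1 layers cost 27 times fewer weights than 3x3x3 layers with the same channel counts, so the wider layers near the output remain cheap.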

Result

The thresholded pseudo-CT is shown alongside the original CT (Fig.4). The air-cavity and cortical-bone around the nasopharynx and ear canal were correctly detected. The fine details at the zygomatic arch and nasal bone were also successfully recovered. Both air-cavity and cortical-bone yield weak to no MR signal but correspond to extremely different CT numbers. The proposed network used local image patches to recognize such data non-linearity and generated promising pseudo-CT.

The proposed method exhibited 89.77HU MAD (Fig.4) relative to CT, approximately 8% of the average absolute HU inside the CT head mask. Part of this difference stemmed from the imaging workflow - inter/intra-scan variations (in-scan patient movement, across-scan soft-tissue deformation), rigid-registration inaccuracy, MR geometric distortion and thermoplastic immobilization (CT-visible/MR-invisible).
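The two quantities reported above can be computed as in the following sketch. It assumes the MAD is averaged over voxels inside the head mask and that the relative figure divides by the mean absolute HU of the CT inside the same mask; the function names and toy values are hypothetical.

```python
import numpy as np

def masked_mad(pseudo_ct, ct, mask):
    """Mean absolute difference in HU over voxels inside the mask."""
    return float(np.mean(np.abs(pseudo_ct[mask] - ct[mask])))

def relative_mad(pseudo_ct, ct, mask):
    """MAD as a fraction of the mean absolute HU of the CT inside the mask."""
    return masked_mad(pseudo_ct, ct, mask) / float(np.mean(np.abs(ct[mask])))

# Toy 2x2 example; the top-left voxel lies outside the mask and is ignored.
ct = np.array([[-1000.0, 0.0], [40.0, 1000.0]])
pseudo_ct = np.array([[0.0, 30.0], [10.0, 940.0]])
mask = np.array([[False, True], [True, True]])

mad = masked_mad(pseudo_ct, ct, mask)
ratio = relative_mad(pseudo_ct, ct, mask)
```

Restricting the average to the mask keeps the large background region of air from dominating the metric.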

Discussion and Conclusion

The proposed network showed insignificant accuracy variations when different amounts of training data were presented (Fig.4). This shows experimentally that the proposed network can tolerate widely different quantities of training data without adjusting the training procedure. Fig.5 summarizes the imaging information incorporated into the proposed model, which is noticeably dissimilar to deep networks designed for typical image processing applications. In those applications, overfitting networks are encouraged, along with manual fine-tuning of hyper-parameters (red items, Fig.5), to achieve prime performance.

One of the major differences between medical imaging applications and typical ones is the absence of a real groundtruth. For instance, it is impossible to independently acquire two identical CT scans of one subject owing to inter/intra-scan variations, which implies that CT is not equivalent to the real groundtruth of a pseudo-CT generated from independently acquired MR. A set of optimal model hyper-parameters could be invalidated by variations in training, validation and testing samples. Nonetheless, the proposed network comprised a small number of free parameters and minimized the use of techniques involving hyper-parameter tuning. It exhibited a low generalization gap and high accuracy, and is thus well suited to the generation of cranial pseudo-CT for radiotherapy applications. Future work will inspect the impact of MR geometric distortion and perform a dosimetric evaluation comparing radiotherapy treatments planned on pseudo-CT and on conventional CT.

Acknowledgements

No acknowledgement found.

References

[1] E. Hoffer, I. Hubara, D. Soudry, "Train longer, generalize better: closing the generalization gap in large batch training of neural networks", Advances in Neural Information Processing Systems (NIPS 2017)

[2] E. Tryggestad, M. Christian, E. Ford, C. Kut, Y. Le, G. Sanguineti, D.Y. Song, L. Kleinberg, "Inter- and Intrafraction Patient Positioning Uncertainties for Intracranial Radiotherapy: A Study of Four Frameless, Thermoplastic Mask-Based Immobilization Strategies Using Daily Cone-Beam CT", Int J Radiat Oncol Biol Phys. 2011 May 80(1):281-90

[3] J.T. Springenberg, A. Dosovitskiy, T. Brox, M. Riedmiller, "Striving For Simplicity: The All Convolutional Net", ICLR Workshop 2015

[4] K. He, X. Zhang, S. Ren and J. Sun, "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification", Computer Vision and Pattern Recognition (CVPR 2015)

[5] S.H. Hasanpour, M. Rouhani, M. Fayyaz, M. Sabokrou and E. Adeli, "Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet", CoRR 2018, ePrint 1802.06205

[6] X. Han, "MR-based Synthetic CT Generation using a Deep Convolutional Neural Network method", Med Phys. 2017 Apr 44(4):1408-1419

Figures

Fig.1. Summary of data acquisition

Fig.2. An example set of CT, four DIXON-MR images, an automatically generated head mask and a CT after the Crop-And-Flip operation. In the Crop-And-Flip operation, the patient-left CT, MR and masks were only cropped, while the patient-right counterparts were cropped and flipped.

Fig.3. Flowchart of the entire model. Some matrix sizes (which can be found upstream) were omitted for better presentation. The number of images is doubled at transition from 256x256x192 to 256x136x192, and halved from 256x136x192 to 256x256x192. During data augmentation, image rotation happened prior to entering the upper-left MR/CT blocks. “In” and “Out” labels in the convolutional kernels showed the input and output channels respectively.

Fig.4. Top: Quantitative results on the validation imagesets. For comparison, the training mean-absolute-differences of the cases in the 2nd, 3rd and 4th rows (with 8, 6 and 2 training imagesets) were 79.66±12.35, 77.16±10.48 and 67.25±15.42 respectively. Bottom-Left: A coronal slice of pseudo-CT and the corresponding CT, comprising air-cavity, bone and soft-tissues. Bottom-Right: The 250HU isosurface of pseudo-CT and the corresponding CT. An external marker attached to the thermoplastic immobilization device is also visible over the frontal bone in this illustration.

Fig.5. Comparison of network design and training strategy between typical image processing applications and the proposed model. The network hyper-parameters handled based on imaging prior information are highlighted in red. The proposed model was trained using the ADAM optimizer with the learning rate cycling through 2^-10, 2^-11, 2^-12, 2^-13, 2^-14, 2^-15 and 2^-16. Optimization finished when training accuracy stopped lowering over 5000 iterations.
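The learning-rate cycling and stopping rule in the Fig.5 caption could be sketched as below. The caption does not state how many iterations each rate is held or whether the cycle restarts at 2^-10 after 2^-16, so `steps_per_rate` and the wrap-around are assumptions. The stopping check interprets "stopped lowering over 5000 iterations" as no new minimum of the monitored training value within the last 5000 iterations, which is also an assumption.

```python
LEARNING_RATES = [2.0 ** -k for k in range(10, 17)]  # 2^-10 ... 2^-16

def learning_rate_at(iteration, steps_per_rate):
    """Learning rate at an iteration, cycling through LEARNING_RATES.

    steps_per_rate is hypothetical; the source does not give it.
    """
    index = (iteration // steps_per_rate) % len(LEARNING_RATES)
    return LEARNING_RATES[index]

def should_stop(history, patience=5000):
    """True if the last `patience` values contain no new minimum."""
    if len(history) <= patience:
        return False
    best_before = min(history[:-patience])
    return min(history[-patience:]) >= best_before

# Toy usage with a short patience so the behaviour is easy to follow.
improving = [3.0, 2.0, 1.0, 0.5, 0.4, 0.3, 0.2, 0.1]
stalled = [1.0] * 8
print(should_stop(improving, patience=5), should_stop(stalled, patience=5))
```

The rate returned by `learning_rate_at` would be assigned to the optimizer at each iteration, and `should_stop` would be evaluated on the running history of the monitored training value.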

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)
