Wu Zhou1, Hui Huang1, Guangyi Wang2, and Honglai Zhang1
1School of Medical Information Engineering, Guangzhou University of Chinese Medicine, Guangzhou, China, 2Department of Radiology, Guangdong General Hospital, Guangzhou, China
Synopsis
Convolutional neural networks (CNNs) are regarded as powerful tools for lesion characterization in
clinical practice. However, local deep features derived from a CNN have two main
shortcomings for characterization. First, convolutional operations process a local
neighborhood and ignore global dependencies. Second, the features are unstable under small
perturbations in images (e.g., noise or artifacts). We therefore propose a denoised local
and nonlocal deep feature fusion method to alleviate these two problems. The proposed
method is a general module that can be integrated into any CNN-based architecture to
improve the performance of lesion characterization in clinical routine.
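The global-dependency idea can be made concrete with a small sketch. The Python below is an illustration only, not the authors' implementation: it assumes the softmax (embedded-Gaussian) form of the nonlocal operation of reference [5] with identity embeddings and no learned weights, and it assumes that the denoising block simply adds the nonlocal response back to its input. The names `softmax`, `nonlocal_op` and `denoise_residual` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def nonlocal_op(features):
    """Respond at each position with a weighted mean over ALL positions.

    features: list of N feature vectors, one per spatial position.
    The weight between positions i and j is softmax_j(dot(x_i, x_j)),
    so every output depends on the whole map, not a local window.
    """
    out = []
    for x_i in features:
        sims = [sum(a * b for a, b in zip(x_i, x_j)) for x_j in features]
        weights = softmax(sims)
        out.append([
            sum(w * x_j[c] for w, x_j in zip(weights, features))
            for c in range(len(x_i))
        ])
    return out


def denoise_residual(features):
    """Assumed denoising block: input plus its nonlocal response."""
    response = nonlocal_op(features)
    return [[a + b for a, b in zip(x, r)] for x, r in zip(features, response)]


# Toy feature map: 3 spatial positions, 2 channels each (made-up values).
feature_map = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
global_response = nonlocal_op(feature_map)
denoised_local = denoise_residual(feature_map)
```

Because each response is a convex combination of all input positions, its values stay within the range of the input features; a trained network would additionally learn the embeddings and an output projection, which this sketch omits.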
INTRODUCTION
Malignancy of hepatocellular carcinoma (HCC) is an important prognostic factor that
affects recurrence and survival after liver transplantation or surgical
resection in clinical practice [1]. Convolutional neural networks (CNNs) have been
shown to be powerful for characterizing the biological aggressiveness
of HCC [2]. However, convolutional operations process a local neighborhood and
ignore global dependencies, which may lower the capability of CNN-derived deep
features for lesion characterization [3]. Furthermore, given the variations and
complexity of image content, small perturbations to images (e.g., noise or artifacts)
may introduce substantial noise into the deep features of CNN systems and lead to
incorrect predictions [4]. In this work, inspired by the nonlocal neural network in
computer vision [5], we propose a fusion method for denoised local deep features and
nonlocal deep features, based on the nonlocal neural network, to address these two
challenges in lesion characterization.
METHODS
This study was approved by the local institutional review board, and the requirement
for informed consent was waived. A total of 115 pathologically confirmed HCC lesions in
112 patients, collected from October 2012 to October 2018, were included in this
retrospective study. All MR examinations were performed on a 3.0 T MR scanner;
unenhanced, arterial, portal venous, and delayed phase images were acquired using the
breath-hold axial LAVA+C sequence. Of the 115 HCCs, 54 were low-grade tumors (Edmondson
grade I and II) and 61 were high-grade tumors (Edmondson grade III and IV). Figure 1
shows the proposed denoised local and nonlocal deep feature fusion framework, which is
mainly composed of three modules: denoised local feature extraction, nonlocal
feature extraction, and bilinear kernel fusion. For denoised local deep feature
extraction, a nonlocal denoising block is applied to generate the denoised local
features. The nonlocal deep feature is then extracted from the original tumor images by
the nonlocal feature extraction block. Finally, the denoised local deep features and the
nonlocal features are fused by the bilinear kernel method to output the classification
result. The training (75 HCCs) and independent testing (40 HCCs) data were obtained from
the arterial phase, and training and testing were repeated five times to reduce
measurement error. Accuracy, sensitivity, and specificity over the five repeated tests
were calculated and reported as mean ± std.
RESULTS
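The evaluation protocol described in METHODS (accuracy, sensitivity and specificity over five repeated tests, reported as mean ± std) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that high-grade HCC is the positive class (label 1), it uses the sample standard deviation, and the helper names and toy labels are hypothetical.

```python
from statistics import mean, stdev


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def summarize_runs(runs):
    """Aggregate per-run metric dicts into (mean, sample std) per metric."""
    summary = {}
    for name in runs[0]:
        values = [run[name] for run in runs]
        summary[name] = (mean(values), stdev(values))
    return summary


# Toy data only: one fixed set of test labels and three made-up prediction runs.
labels = [1, 1, 1, 0, 0, 0]
predictions = [
    [1, 1, 0, 0, 0, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 0],
]
runs = [classification_metrics(labels, pred) for pred in predictions]
for name, (avg, sd) in summarize_runs(runs).items():
    print(f"{name}: {avg:.3f} ± {sd:.3f}")
```

Whether the reported std is the sample or the population standard deviation is not stated in the text; swapping `stdev` for `statistics.pstdev` gives the population form.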
Table 1 shows the performance of different models for malignancy
characterization of HCC. Compared with the local features generated by the CNN
baseline [2], the GAP CNN with global features [3] and the Nonlocal CNN [5] yield
better results. Specifically, the Nonlocal CNN outperforms the GAP CNN for global
information extraction. The denoised CNN [4] obtains the largest performance gain over
the CNN baseline. As shown in Table 2, all the fusion methods improve on the single CNN,
and the proposed denoised local and nonlocal feature fusion by the bilinear kernel
method achieves the highest performance. Figure 2 shows a set of feature maps from
different models for two representative HCC samples. The local feature maps are visibly
smoothed and highlighted by the nonlocal operations that generate the denoised
local deep features. Figure 3 shows the loss curves and the corresponding accuracy
curves for different models. Our proposed model (NL+FD (Bilinear)) performs
consistently better than the CNN baseline [2], the nonlocal feature (NL) [5], the
denoised local feature (FD+CNN) [4], and the local and nonlocal feature fusion
(NL+CNN(C+I)) [6] throughout the testing procedure.
DISCUSSION
The present study suggests that global information can achieve better performance
than local deep features for lesion characterization. Because the global deep
feature reflects the global dependency of features within the lesion region, it is
expected to be more representative and more robust to large variations and complexity
in image content. We further demonstrate that denoised local deep features outperform
the original local deep features for lesion characterization. To the best of our
knowledge, the present study may be the first to indicate the significance of local
feature denoising in deep learning for lesion characterization. Because lesion
imaging is generally subject to noise and image artifacts, deep features
derived from a CNN may be contaminated by those perturbations in medical
images. More importantly, the visualization of feature maps in
the model indicates that the denoised local feature map exhibits salient
regions and reduced noise while suppressing unnecessary features. In addition,
our study demonstrates that the fusion of the denoised local feature and the
nonlocal feature yields the best results for lesion characterization, supporting the
view that local and global information are complementary for improving the
characterization performance of CNNs [3,6].
CONCLUSION
Our present study demonstrates that both denoising of the local feature and
extraction of global information yield better performance than the
conventional CNN baseline for malignancy characterization of HCC. Furthermore,
fusing the denoised local and nonlocal deep features with the bilinear kernel
method achieves the best results for lesion characterization, outperforming
several recently proposed methods. The proposed method is a general module
that can be integrated into any CNN-based architecture to improve the performance
of lesion characterization in clinical routine.
Acknowledgements
This research was supported by a grant from the National Natural Science Foundation of China (81771920).
References
[1] Haratake J, Takeda S, Kasai T, Nakano S, Tokui N. Predictable factors for
estimation prognosis of patients after resection of hepatocellular carcinoma.
Cancer 1993;72: 1178-1183.
[2] Zhou W, Wang G, Xie G, Zhang L. Grading of hepatocellular carcinoma based on
diffusion weighted images with multiple b-values using convolutional neural
networks. Medical Physics 2019;46(9): 3951-3960.
[3] Qiu Z, Yao T, Ngo C, Tian X, Mei T. Learning spatial-temporal representation
with local and global diffusion. Proc. IEEE Int. Conf. Computer Vision and
Pattern Recognition (CVPR) 2019: 12056-12065.
[4] Xie C, Wu Y, van der Maaten L, Yuille A, He K. Feature denoising for improving
adversarial robustness. Proc. IEEE Int. Conf. Computer Vision and Pattern
Recognition (CVPR) 2019: 501-509.
[5] Wang X, Girshick R, Gupta A, He K. Non-local neural networks. Proc. IEEE Int.
Conf. Computer Vision and Pattern Recognition (CVPR) 2018: 7794-7803.
[6] Dou T, Zhang L, Zheng H, Zhou W. Local and
nonlocal deep feature fusion for malignancy characterization of
hepatocellular carcinoma. Int. Conf. Medical Image Computing and
Computer Assisted Intervention 2018: 472-479.