4304

Radiomic feature-based assessment of deep learning-based compressed sensing reconstruction
Tomoki Miyasaka1, Satoshi Funayama2, Daiki Tamada2, Hiroyuki Morisaka2, Hiroshi Onishi2, and Yasuhiko Terada1
1Graduate School of Science and Technology, University of Tsukuba, Tsukuba, Japan, 2Department of Radiology, University of Yamanashi, Chuo, Japan

Synopsis

Deep learning (DL) has been attracting attention as a new tool for image reconstruction. However, there is a lack of appropriate automatic metrics for evaluating the reconstruction performance for small structures such as lesions, which poses a high hurdle for clinical application. Here, we explored the relationship between radiomic features of tumors and various DL reconstruction conditions, and proposed a new radiomics-based method to evaluate the reconstruction performance of DL for lesions. Based on an analysis using concordance correlation coefficients with respect to ground truth images, we identified several texture features that are sensitive to differences in reconstruction methods and conditions.

INTRODUCTION

Deep learning-based compressed sensing (DL-CS) reconstruction has the potential to outperform existing methods based on sparse regularization. However, its clinical application remains a challenge because DL-CS may be unstable in image reconstruction [1]. For example, small structural changes such as tumors may not be captured in the reconstructed image. It is therefore important to accurately evaluate image restoration quality for individual lesions. This cannot be done with traditional numeric metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), which measure the average accuracy over the entire image.
Here we propose a new strategy for evaluating DL-CS image reconstruction in terms of lesion restoration accuracy using radiomic features. Radiomics is a method to characterize lesions (e.g. tumor phenotypes) using a large number of quantitative image features [2]. In this study, we applied radiomics to texture analysis of lesions in reconstructed images. We assessed the sensitivity of the radiomic features computed from restored lesions by varying CS-deep neural networks (DNNs), acceleration factors (AFs), and sampling patterns, and sought radiomic features suitable for evaluating DL-CS reconstruction performance.

METHODS

Dataset
A GE 3T SIGNA Premier and a 48-channel coil were used to acquire raw data of brain multi-slice 2D FLAIR. The number of phase encodings was 192, acceleration factors (AFs) were 3 and 4, and data were undersampled retrospectively with random and regular patterns. A total of 2536 slices from 97 cases were used for training and 651 slices from 25 cases were used for testing.
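As an illustration of the retrospective undersampling described above, the following minimal Python sketch builds regular and random phase-encoding masks for 192 lines at AF = 3 or 4. The function names, the optional fully sampled center block (`n_center`), and the random seed are assumptions made for illustration; the abstract does not state how the masks were actually generated.

```python
import numpy as np


def regular_mask(n_pe=192, af=3, n_center=0):
    """Keep every af-th phase-encoding line (hypothetical helper).

    n_center optionally adds a fully sampled block around the k-space
    center; this is an assumption, not a detail given in the abstract.
    """
    mask = np.zeros(n_pe, dtype=bool)
    mask[::af] = True
    start = (n_pe - n_center) // 2
    mask[start:start + n_center] = True
    return mask


def random_mask(n_pe=192, af=3, n_center=0, seed=0):
    """Keep n_pe // af lines chosen uniformly at random (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(n_pe, dtype=bool)
    start = (n_pe - n_center) // 2
    mask[start:start + n_center] = True
    # Fill up to the target number of lines from the still-unsampled ones.
    n_extra = max(n_pe // af - n_center, 0)
    remaining = np.flatnonzero(~mask)
    mask[rng.choice(remaining, size=n_extra, replace=False)] = True
    return mask
```

Under these assumptions, a fully sampled k-space array whose first axis is the phase-encoding direction could be undersampled by multiplying it with `mask[:, None]`.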
CS-deep neural networks (DNNs)
We used one non-DL reconstruction (parallel imaging-CS reconstruction (PICS) with an L1 regularization term [3]) and three DL reconstructions (variational network (VN) [4], MoDL [5], and UNet [6]). VN was trained for 50 epochs with a batch size of 2, MoDL for 50 epochs with a batch size of 1, and UNet for 100 epochs with a batch size of 2.
Numeric metrics test
As numerical indices of image quality, the mean PSNR and SSIM values over all cases were calculated.
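For reference, PSNR follows the standard definition 10 log10(MAX^2 / MSE). The sketch below shows one way the per-slice value and its mean could be computed. Using the ground-truth maximum as MAX and the function names `psnr` and `mean_psnr` are assumptions for illustration, since the abstract does not give these details. SSIM is usually taken from an image-processing library and is not reimplemented here.

```python
import numpy as np


def psnr(gt, rec, data_range=None):
    """PSNR in dB between a ground-truth and a reconstructed magnitude image."""
    gt = np.asarray(gt, dtype=float)
    rec = np.asarray(rec, dtype=float)
    if data_range is None:
        # Assumption: the peak value is the ground-truth maximum.
        data_range = gt.max()
    mse = np.mean((gt - rec) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))


def mean_psnr(gt_slices, rec_slices):
    """Average the per-slice PSNR over paired slices."""
    return float(np.mean([psnr(g, r) for g, r in zip(gt_slices, rec_slices)]))
```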
Texture analysis
We manually segmented 34 randomly selected rectangular areas surrounding lesions (2 for each patient) from 17 of the cases used for testing. For each segmented area in each reconstructed image, 93 radiomic features were calculated using pyradiomics [7]. After the radiomic features were z-score transformed, concordance correlation coefficients (CCCs) were used to evaluate the concordance of the radiomic features between each reconstructed image and the ground truth (GT) image. A feature was considered reproducible, indicating good agreement of the reconstructed image with the GT image, when its CCC exceeded a threshold of 0.8, and the number of features above the threshold was counted for each condition.
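The CCC step can be sketched as follows, using Lin's concordance correlation coefficient, CCC = 2 cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2), evaluated per feature across lesions. This is a minimal sketch under stated assumptions: the abstract does not say how the z-score was applied, so here both GT and reconstructed values are standardized with the GT mean and standard deviation of each feature (standardizing each set separately would remove the offset and scale differences that the CCC is meant to penalize). The function names and array layout (lesions x features) are hypothetical.

```python
import numpy as np


def ccc(x, y):
    """Lin's concordance correlation coefficient between two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    cov = np.mean((x - mx) * (y - my))
    denom = x.var() + y.var() + (mx - my) ** 2
    return float(2.0 * cov / denom) if denom > 0 else float("nan")


def count_reproducible(gt_feats, rec_feats, threshold=0.8):
    """Return per-feature CCCs and the number of features above threshold.

    gt_feats, rec_feats: arrays of shape (n_lesions, n_features).
    Assumption: z-scoring uses the GT statistics of each feature.
    """
    gt_feats = np.asarray(gt_feats, dtype=float)
    rec_feats = np.asarray(rec_feats, dtype=float)
    mu = gt_feats.mean(axis=0)
    sd = gt_feats.std(axis=0)
    sd = np.where(sd == 0, 1.0, sd)  # avoid division by zero
    z_gt = (gt_feats - mu) / sd
    z_rec = (rec_feats - mu) / sd
    cccs = np.array([ccc(z_gt[:, j], z_rec[:, j])
                     for j in range(gt_feats.shape[1])])
    return cccs, int(np.sum(cccs > threshold))
```

In the setting described above, the inputs would be 34 x 93 arrays, one pair per reconstruction condition.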

RESULTS

Comparison of reconstructed images (Figs. 1 and 2) revealed the overall perceptual trend of the reconstruction performance across the different DNNs, AFs, and sampling patterns. The average SSIM and PSNR values were lowest for UNet and higher for MoDL and VN, lower for the larger AF, and higher for regular sampling than for random sampling.
The radiomic feature maps computed for the 34 lesions in the GT images are depicted in Fig. 3. The CCC heat maps of radiomic features computed from the reconstructed images are depicted in Figs. 4 and 5. In these heat maps, the brighter the red color, the higher the CCC value and the higher the similarity with the GT images. The sensitivity of radiomic features to the reconstruction conditions varied: some features were more sensitive (first-order and GLCM features) and others less sensitive (GLSZM, GLRLM, and GLDM features). The CCCs of the sensitive features showed the same tendencies across reconstruction conditions as the perceptual trends seen in Figs. 1 and 2. The exception was the UNet reconstruction, for which the CCC values of almost all texture features were low compared with the other reconstruction methods.
The number of reproducible radiomic features was higher for VN and MoDL than for UNet, higher for AF = 3 than for AF = 4, and higher for regular sampling than for random sampling.

DISCUSSION

The CCC heat map results showed that the first-order and GLCM texture features were strongly affected by the reconstruction condition, and that the trend was plausible and consistent with the perceptual trends and with PSNR and SSIM. UNet is a single-coil data-driven network, while VN and MoDL are multicoil model-based networks, and in general the latter have been found to show higher performance than the former. It is important to note that this trend was also observed in the first-order and GLCM radiomic features. The number of reproducible radiomic features could be a good measure of the restoration quality of lesions in DL-CS images.

CONCLUSION

We found that the first-order and GLCM radiomic features were sensitive to differences in the reconstruction condition, and that the CCC analysis can be used to measure reconstruction performance under different reconstruction conditions.

Acknowledgements

No acknowledgement found.

References

[1] Antun, V., et al., On instabilities of deep learning in image reconstruction and the potential costs of AI, Proc. Natl. Acad. Sci. U.S.A., 117(48): 30088-30095, 2020.

[2] Zhao, B., et al., Reproducibility of radiomics for deciphering tumor phenotype with imaging, Sci. Rep., 6(23428), https://doi.org/10.1038/srep23428, 2016.

[3] Uecker, M., et al., BART Toolbox for Computational Magnetic Resonance Imaging, Zenodo, DOI: 10.5281/zenodo.592960.

[4] Hammernik, K., et al., Learning a Variational Network for Reconstruction of Accelerated MRI Data, Magn. Reson. Med., 79(6): 3055-3071, 2018.

[5] Aggarwal, H. K., et al., MoDL: Model-Based Deep Learning Architecture for Inverse Problems, IEEE Trans. Med. Imaging, 38, 394-405, 2019.

[6] Ronneberger, O., et al., U-Net: Convolutional Networks for Biomedical Image Segmentation, MICCAI 2015, 9351: 234-241, 2015.

[7] van Griethuysen, J. J. M., et al., Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Research, 77(21), e104–e107, https://doi.org/10.1158/0008-5472.CAN-17-0339, 2017.

Figures

Fig.1 Comparison of reconstructed images in random sampling. AF: acceleration factor. PICS: parallel imaging-CS reconstruction. VN: variational network.

Fig.2 Comparison of reconstructed images in regular sampling.

Fig.3 The radiomic feature maps computed for 34 lesions in the ground truth images. GLCM: Gray Level Co-occurrence Matrix. GLSZM: Gray Level Size Zone Matrix. GLRLM: Gray Level Run Length Matrix. NGTDM: Neighbouring Gray Tone Difference Matrix. GLDM: Gray Level Dependence Matrix.

Fig.4 CCC heat map of radiomic features in random sampling. (a) CCC heat map at AF = 3. (b) CCC heat map at AF = 4. (c) number (percentage) of reproducible features with CCC > 0.80 at AF = 3. (d) number (percentage) of reproducible features with CCC > 0.80 at AF = 4.

Fig.5 CCC heat map of radiomic features in regular sampling. (a) CCC heat map at AF = 3. (b) CCC heat map at AF = 4. (c) number (percentage) of reproducible features with CCC > 0.80 at AF = 3. (d) number (percentage) of reproducible features with CCC > 0.80 at AF = 4.

Proc. Intl. Soc. Mag. Reson. Med. 30 (2022)
4304
DOI: https://doi.org/10.58530/2022/4304