3812

Evaluating the performance of a deep learning system for detecting focal liver lesions on contrast-enhanced MRI

Haoran Dai¹, Yuyao Xiao¹, Caixia Fu², Robert Grimm³, Heinrich von Busch⁴, Bram Stieltjes⁵, Moon Hyung Choi⁶, Chun Yang¹, and Mengsu Zeng¹,⁷
¹Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai, China, ²MR Application Development, Siemens Shenzhen Magnetic Resonance Ltd, Shenzhen, China, ³MR Predevelopment, Siemens Healthineers AG, Erlangen, Germany, ⁴Digital & Automation Innovation, Siemens Healthineers AG, Erlangen, Germany, ⁵Universitätsspital Basel, Basel, Switzerland, ⁶Eunpyeong St. Mary’s Hospital, Catholic University of Korea, Seoul, Korea, Republic of, ⁷Shanghai Institute of Medical Imaging, Shanghai, China

Synopsis

Keywords: AI/ML Software, Machine Learning/Artificial Intelligence, focal liver lesions, Magnetic resonance imaging

Motivation: The number of focal liver lesions (FLLs) detected by imaging has increased worldwide, highlighting the need to develop a robust, objective system for automatically detecting FLLs.

Goal(s): This study aimed to evaluate the clinical value of deep learning–based artificial intelligence (AI) software in detecting FLLs.

Approach: We compared the performance and agreement of deep learning–based AI software with those of radiologists in detecting and evaluating malignant lesions on contrast-enhanced MRI of patients with FLLs.

Results: AI detected malignant lesions effectively, including those smaller than 10 mm. The AI-measured size of malignant tumors was consistent with the pathologic and manually measured sizes.

Impact: Our results indicate that AI might improve the detection of sub-centimeter malignant liver lesions, providing a reference for selecting clinical treatment schemes.

Introduction

Magnetic resonance imaging (MRI) offers the most comprehensive noninvasive characterization of FLLs. Nonetheless, identifying all lesions can still be challenging and time-consuming because of the broad range of lesion appearances on MRI, and small lesions are particularly difficult to detect. Early detection of malignant tumors is essential to improving the diagnosis and prognosis of patients with FLLs. Accurate tumor measurement and localization are also vital for effective treatment of malignant tumors. Therefore, this study aimed to evaluate the performance of deep learning–based artificial intelligence (AI) software in detecting and measuring lesions on contrast-enhanced MRI in patients with FLLs.

Methods

This retrospective study enrolled 316 patients with 1030 FLLs (360 malignant and 670 benign). Contrast-enhanced liver MRI examinations of these patients were performed on a 1.5-T MR scanner (MAGNETOM Aera, Siemens Healthineers, Erlangen, Germany). The flowchart of patient inclusion is depicted in Figure 1.
The liver lesions were automatically detected and measured using research liver AI software (MR Liver Analysis v2.0.0). For automatic lesion detection, pre-contrast, arterial-phase, portal-venous-phase, and delayed-phase T1-weighted images were uploaded to the software. After intra-study registration, the liver was automatically segmented into 8 subsegments, and the software automatically located the lesions, measured their diameters, and calculated their volumes on the portal-venous phase.
For comparison, 2 radiologists with 5 and 7 years of experience manually detected and measured the liver lesions on the pre-contrast and the three contrast-enhanced T1-weighted image series in the PACS. The average diameter was used to minimize measurement bias. When the two radiologists disagreed, they reviewed the case jointly until a consensus was reached. The consensus of another 2 senior abdominal radiologists (with 21 and 24 years of experience) was used as the reference standard. The lesions were divided into 2 groups (<20 and ≥20 mm) and further subdivided into 4 subgroups of <10, 10-20, 20-40, and ≥40 mm to verify the detection rates for lesions of various sizes. We selected 122 surgically resected lesions and compared their pathologic sizes with those measured by AI and radiologists. We calculated sensitivity to evaluate the efficacy of AI and radiologists in detecting lesions and applied the McNemar test to determine whether the differences were significant. The tumor sizes measured by AI, radiologists, and pathology were compared using Bland–Altman analyses, the Friedman test, and intraclass correlation coefficients (ICCs).
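The per-lesion comparison described above can be illustrated with a minimal sketch. It assumes each reference-standard lesion carries a detected/missed flag for AI and for the radiologists, and it uses the exact (binomial) form of the McNemar test; the abstract does not state which variant was applied, and the function names and toy data below are hypothetical, not taken from the study. The half-open size intervals (10 mm falls in 10-20, 20 mm in 20-40) are also an assumption chosen to agree with the <20 and ≥20 mm grouping.

```python
from math import comb


def sensitivity(detected):
    """Fraction of reference-standard lesions that a reader detected."""
    return sum(detected) / len(detected)


def size_subgroup(diameter_mm):
    """Map a lesion diameter to one of the four size subgroups.

    Boundary handling is an assumption: intervals are treated as
    [0, 10), [10, 20), [20, 40), and [40, inf).
    """
    if diameter_mm < 10:
        return "<10"
    if diameter_mm < 20:
        return "10-20"
    if diameter_mm < 40:
        return "20-40"
    return ">=40"


def mcnemar_exact(reader_a, reader_b):
    """Two-sided exact McNemar p-value for paired detection flags.

    Only discordant pairs are informative:
    b = lesions found by A but missed by B, c = the reverse.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    b = sum(1 for x, y in zip(reader_a, reader_b) if x and not y)
    c = sum(1 for x, y in zip(reader_a, reader_b) if y and not x)
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy data: 1 = lesion detected, 0 = lesion missed.
ai_flags = [1, 1, 1, 1, 0, 1, 1, 1, 1, 1]
radiologist_flags = [1, 0, 1, 1, 1, 0, 1, 0, 1, 0]

print("AI sensitivity:", sensitivity(ai_flags))
print("Radiologist sensitivity:", sensitivity(radiologist_flags))
print("McNemar exact p:", mcnemar_exact(ai_flags, radiologist_flags))
```

Because both readers are scored on the same lesions, a paired test such as McNemar is appropriate where an unpaired comparison of two proportions would not be.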

Results

As illustrated in Figure 2, AI had higher sensitivity than radiologists in detecting benign lesions (0.854 vs 0.807, P =.024) and all lesions (0.851 vs 0.813, P =.018). AI also displayed higher sensitivity than radiologists in detecting all lesions <20 mm (0.848 vs 0.796, P =.006), malignant lesions <20 mm (0.843 vs 0.742, P =.019), and malignant lesions <10 mm (0.919 vs 0.676, P =.009). For lesions ≥20 mm, both AI and radiologists achieved excellent detection performance. The patients were divided into 3 groups according to the number of lesions: 1 lesion (71/316, 22.5%), 2-4 lesions (174/316, 55.1%), and ≥5 lesions (71/316, 22.5%) (Fig. 3). AI had a detection ratio similar to that of radiologists for patients with a single tumor (0.915 vs 0.901, P =.771), but a higher detection ratio for patients with 2-4 tumors (0.948 vs 0.908, P =.031) and ≥5 tumors (0.901 vs 0.676, P =.001). AI had a higher detection rate than radiologists for lesions located in segments IV (95.89% vs 80.82%, P <.001) and V (96.61% vs 81.36%, P <.001); however, its ability to identify lesions in other liver segments was comparable to that of radiologists. The average tumor sizes from the 3 measurements (AI, radiologists, and pathology) did not differ significantly (P =.174) (Fig. 4 and Table 1).
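For readers unfamiliar with the Bland–Altman analysis behind Fig. 4, the following is a minimal sketch of how the bias and 95% limits of agreement are conventionally computed for paired size measurements. The function name and the diameters are hypothetical illustrations, not study data, and the 1.96 multiplier assumes approximately normally distributed differences.

```python
from statistics import mean, stdev


def bland_altman(method_mm, reference_mm):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias   = mean of the paired differences (method - reference)
    limits = bias +/- 1.96 * sample standard deviation of the differences
    """
    diffs = [m - r for m, r in zip(method_mm, reference_mm)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical tumor diameters in mm (not taken from the study).
ai_size = [12.0, 25.0, 31.0, 48.0]
pathology_size = [11.0, 26.0, 30.0, 46.0]

bias, lower, upper = bland_altman(ai_size, pathology_size)
print(f"bias = {bias:.2f} mm, limits of agreement = [{lower:.2f}, {upper:.2f}] mm")
```

A bias near 0 with narrow limits indicates that the two methods can be used interchangeably for sizing; the same function would be applied to the radiologist measurements against pathology.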

Discussion/Conclusion

AI performed well in detecting malignant lesions smaller than 10 mm and multiple lesions, improving the detection of sub-centimeter malignant liver lesions and helping to avoid missed diagnoses. For malignant lesions ≥40 mm, AI had a numerically lower detection rate than radiologists (0.791 vs 0.930, P =.062), possibly because such lesions often contain multiple mixed components, such as hemorrhage, necrosis, and fat, which AI might have difficulty recognizing with high sensitivity. Moreover, AI could accurately locate liver lesions and assist doctors in their assessments. The lesion sizes measured by both AI and radiologists strongly correlated with the pathologic result, which is relevant to clinical practice in determining the optimal treatment choice. Further development of the AI software to support nonenhanced images, such as T2W and DWI, and to characterize lesions holds clinical significance and promising prospects.
In conclusion, the deep learning–based AI software has practical value in automatically detecting liver lesions.

Acknowledgements

None declared.

Figures

Figure 1. Flow chart of patient inclusion and exclusion.

Figure 2. Comparison of detection sensitivity between AI and radiologists for lesions of various sizes. The lesions were divided into <20 (a) and ≥20 mm (e), and further subdivided into 4 subgroups of <10 (c), 10-20 (d), 20-40 (f), and ≥40 mm (g). Benign: all types of benign lesions; malignancy: all types of malignant lesions; total: all types of lesions. ** and * indicate P <.01 and P <.05, respectively.

Figure 3. Comparison of detection ratio per patient between AI and radiologists. Patients were divided into 3 groups based on the number of lesions: 1 lesion, 2-4 lesions, and ≥5 lesions. Points above 0 indicate that AI had a higher detection ratio than radiologists. Points at 0 indicate that the detection ratios of the 2 methods were the same for these patients. Points below 0 indicate that radiologists had a higher detection ratio than AI.

Table 1. Comparisons of tumor sizes among AI, radiologists, and pathology. AI, artificial intelligence; CI, confidence interval; ICC, intraclass correlation coefficient; SD, standard deviation.

Figure 4. Bland–Altman plots for tumor size measured by AI or radiologists compared with the size of pathologic specimens. Panels A and B show the variability between the pathologic sizes and the tumor sizes measured by AI and by radiologists, respectively. AI, artificial intelligence.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)

DOI: https://doi.org/10.58530/2024/3812