Zhaonan Sun1, Xiaoying Wang2, and Kexin Wang3
1Department of Radiology, Peking University First Hospital, Beijing, China, 2Department of Radiology, Peking University First Hospital, Beijing, China, 3School of Basic Medical Sciences, Capital Medical University, Beijing, China
Synopsis
Keywords: Machine Learning/Artificial Intelligence, Prostate
A total of 557 prostate mpMRI examinations were retrospectively collected from three hospitals to build an external validation dataset comprising 245 csPCa cases and 312 non-csPCa cases. Two experienced radiologists annotated the csPCa lesions based on pathology records. AI algorithms were used to automatically detect and localize suspicious csPCa areas on T2WI and ADC maps. Sensitivity, specificity, and accuracy were used to evaluate the diagnostic efficacy of the AI algorithms at the lesion, sextant, and patient levels.
Abstract Body
Background: An increasing number of patients with suspected clinically significant prostate cancer (csPCa) are undergoing prostate multiparametric MRI (mpMRI). The role of AI algorithms in interpreting prostate mpMRI needs to be tested with multicenter external data.
Objectives: To evaluate the diagnostic efficacy of AI algorithms in the detection and localization of csPCa on prostate mpMRI.
Methods: A total of 557 mpMRI examinations were retrospectively collected from three hospitals to build an external validation dataset comprising 245 csPCa cases and 312 non-csPCa cases. All diagnoses were confirmed pathologically by image-guided biopsy, transurethral prostatectomy, or radical prostatectomy, and the patients had not received any therapy. Two experienced radiologists annotated the csPCa lesions based on pathology records. Previously trained AI algorithms were used for (a) MRI sequence selection, (b) prostate gland segmentation, (c) prostate zonal anatomy segmentation, and (d) csPCa foci segmentation. From the detected suspicious areas, the number of suspicious lesions and their three-dimensional diameter, volume, and average ADC value were calculated. At the lesion level, each connected component predicted by the AI algorithms was considered a predicted lesion, and the four largest lesions for each patient were studied. The lesions predicted by the AI algorithms were compared with the reference standard. The spatial overlap between a predicted lesion and the reference standard was characterized using the mean Dice similarity coefficient. If the two areas overlapped (Dice similarity coefficient > 0), the lesion was considered a true positive (TP) lesion. If an AI-predicted lesion did not overlap with the reference standard, it was considered a false positive (FP) lesion. If a reference lesion did not overlap with any AI-predicted lesion, it was considered a false negative (FN) lesion.
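The lesion-level rules above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each lesion has already been extracted as a set of voxel coordinates, and the names `dice` and `match_lesions` are hypothetical. TP and FN are counted over reference lesions and FP over predicted lesions, which is one reading of the definitions in the text.

```python
from typing import Dict, FrozenSet, List, Tuple

Voxel = Tuple[int, int, int]
Lesion = FrozenSet[Voxel]  # assumption: one lesion = one set of voxel indices


def dice(a: Lesion, b: Lesion) -> float:
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def match_lesions(predicted: List[Lesion], reference: List[Lesion],
                  max_predicted: int = 4) -> Dict[str, int]:
    """Count lesion-level TP, FP, and FN for a single patient."""
    # Keep only the largest predicted lesions (four per patient in the text).
    kept = sorted(predicted, key=len, reverse=True)[:max_predicted]
    # A reference lesion is detected if any kept prediction overlaps it.
    tp = sum(1 for ref in reference if any(dice(ref, p) > 0 for p in kept))
    fn = len(reference) - tp
    # A kept prediction is a false positive if it overlaps no reference lesion.
    fp = sum(1 for p in kept if all(dice(p, ref) == 0 for ref in reference))
    return {"TP": tp, "FP": fp, "FN": fn}


# Toy example: one reference lesion is hit, one is missed, one prediction is spurious.
r1 = frozenset({(0, 0, 0), (0, 0, 1)})
r2 = frozenset({(5, 5, 5)})
p1 = frozenset({(0, 0, 1), (0, 0, 2)})
p2 = frozenset({(9, 9, 9)})
print(dice(r1, p1))                        # 0.5
print(match_lesions([p1, p2], [r1, r2]))   # {'TP': 1, 'FP': 1, 'FN': 1}
```

Because any nonzero Dice value counts as a match, this criterion rewards partial overlap as much as near-perfect overlap.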
At the sextant level, the right-upper, right-middle, right-lower, left-upper, left-middle, and left-lower areas of the prostate gland were studied separately. A sextant was considered positive for a lesion if the lesion overlapped with it. Thus, if a sextant overlapped with both the reference lesion and the AI-predicted lesion, it was considered a TP sextant. If a sextant overlapped only with the reference lesion, it was considered an FN sextant. If a sextant overlapped only with the AI-predicted lesion, it was considered an FP sextant. If a sextant overlapped with neither the reference lesion nor the AI-predicted lesion, it was considered a true negative (TN) sextant. At the patient level, a patient with any TP sextant was considered a TP case. If all sextants were TN, the patient was considered a TN case. If the patient had TN and FP sextants without any TP sextant, the patient was considered an FP case. If the patient had TN and FN sextants without any TP sextant, the patient was considered an FN case. Sensitivity was calculated at the lesion level to evaluate the AI's ability to detect lesions. Sensitivity, specificity, and accuracy were calculated at the sextant level to evaluate the AI's ability to localize lesions, and the metrics were compared among the six sextants. Sensitivity, specificity, and accuracy were calculated at the patient level to evaluate the AI's ability to identify candidates for biopsy. The relationship between patient-level AI accuracy and tumor number, tumor volume, and average ADC value was also studied.
Results: There were 434 cancer foci in the 245 csPCa patients, of which 284 foci in 231 patients were correctly detected by the AI algorithms. Thus, the sensitivity of the AI algorithms at the lesion level was 0.654. There were 3342 sextants in the 557 patients, of which 881 were positive for csPCa.
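As a sketch of how the sextant-level and patient-level rules from the Methods combine, and of the arithmetic behind the counts just reported, the following may help. The function names are hypothetical. The text does not say how a patient with both FP and FN sextants but no TP sextant is labelled, so this sketch returns `None` for that case as an explicit assumption.

```python
from typing import Dict, List, Optional


def classify_sextant(has_reference: bool, has_prediction: bool) -> str:
    """Label one sextant from its overlap with reference and AI-predicted lesions."""
    if has_reference and has_prediction:
        return "TP"
    if has_reference:
        return "FN"
    if has_prediction:
        return "FP"
    return "TN"


def classify_patient(sextants: List[str]) -> Optional[str]:
    """Aggregate six sextant labels into one patient-level label."""
    labels = set(sextants)
    if "TP" in labels:
        return "TP"
    if "FN" in labels and "FP" in labels:
        return None  # assumption: this combination is not defined in the text
    if "FN" in labels:
        return "FN"
    if "FP" in labels:
        return "FP"
    return "TN"


def _ratio(num: int, den: int) -> float:
    """Safe division that yields NaN when the denominator is zero."""
    return num / den if den else float("nan")


def metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
    }


# Consistency checks against the counts reported in the text.
print(round(284 / 434, 3))  # 0.654, lesion-level sensitivity
print(6 * 557)              # 3342, six sextants per patient
print(classify_patient(["TN", "FP", "TN", "TN", "TN", "TN"]))  # FP
```

Under these rules a patient counts as TP only when the AI and the reference agree on at least one sextant, so the patient-level result also reflects localization.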
The overall sensitivity, specificity, and accuracy of the AI algorithms at the sextant level were 0.846, 0.884, and 0.874, respectively. Across the six sextants (right-upper, right-middle, right-lower, left-upper, left-middle, and left-lower), the sensitivity, specificity, and accuracy were 0.772-0.902 (P = 0.013), 0.851-0.915 (P = 0.016), and 0.858-0.894 (P = 0.342), respectively. At the patient level, the sensitivity, specificity, and accuracy for the detection of csPCa patients were 0.943, 0.776, and 0.849, respectively. The AI-predicted accuracy for csPCa patients (231/245, 0.943) was significantly higher than that for non-csPCa patients (242/312, 0.776) (P < 0.001). In correctly diagnosed patients, the lesion number and the tumor volume were greater than those in incorrectly diagnosed patients (0 [0, 1] vs. 0 [0, 0]; 0.00 [0.00, 0.00] vs. 0.00 [0.00, 2.355] cm³, both P < 0.001). Among the positive patients, those with lower average ADC values were more often correctly diagnosed than those with higher average ADC values (0.750 [0.643, 0.866] vs. 0.884 [0.765, 0.966] × 10⁻³ cm²/s, P = 0.011).
Conclusion: The AI algorithms achieved acceptable accuracy for the detection and localization of csPCa at the patient and sextant levels. However, the sensitivity at the lesion level needs improvement.
Acknowledgements
No
References
No