A novel lesion-wise metric was developed to evaluate the quality of predictive models of prostate cancer that use quantitative multiparametric MR data to perform prediction on a voxel-wise basis. The metric is based on the Jaccard similarity coefficient and emphasizes overlap and co-localization of ground truth and predicted lesions. Experiments to characterize the metric demonstrated that it qualitatively reflected the goodness of predictions and was more accurate and informative than voxel-wise measures of sensitivity and specificity. We propose that the metric may be customized to select the best predictive models for specific clinical applications such as performing targeted prostate biopsies.
First, cancer voxels in the ground truth (TR) and prediction (PRED) maps were grouped into discrete lesions. This was accomplished by performing binary dilation, labeling connected voxels, and then applying the masks of the original maps. For PRED, median filtering was performed beforehand (Fig. 1a-d). A size threshold was then applied to eliminate small lesions, on the rationale that they are likely to represent benign, clinically insignificant disease (Fig. 1e) [7].
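The grouping step can be sketched without any dependencies, as shown here. This is only an illustration under stated assumptions: a slice is treated as a 2D set of `(row, col)` cancer voxels, dilation uses a single pass with an 8-connected neighbourhood, and the function names are hypothetical. The median filtering applied to PRED is not included.

```python
from collections import deque

# 8-connected neighbourhood offsets (assumption: the source does not state the connectivity).
NEIGHBOURS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

def dilate(voxels):
    """Binary dilation by one voxel: add every neighbour of every cancer voxel."""
    grown = set(voxels)
    for r, c in voxels:
        for dr, dc in NEIGHBOURS:
            grown.add((r + dr, c + dc))
    return grown

def label_components(voxels):
    """Group a set of voxels into connected components with a breadth-first search."""
    unvisited = set(voxels)
    components = []
    while unvisited:
        seed = unvisited.pop()
        component, queue = {seed}, deque([seed])
        while queue:
            r, c = queue.popleft()
            for dr, dc in NEIGHBOURS:
                nb = (r + dr, c + dc)
                if nb in unvisited:
                    unvisited.remove(nb)
                    component.add(nb)
                    queue.append(nb)
        components.append(component)
    return components

def group_lesions(cancer_voxels, min_size=1):
    """Dilate, label, mask with the original map, then drop lesions below min_size."""
    cancer_voxels = set(cancer_voxels)
    lesions = [comp & cancer_voxels for comp in label_components(dilate(cancer_voxels))]
    return [lesion for lesion in lesions if len(lesion) >= min_size]

# Two fragments separated by a one-voxel gap merge into one lesion;
# the isolated voxel is removed by the size threshold.
voxels = {(0, 0), (0, 1), (0, 3), (0, 4), (10, 10)}
print(len(group_lesions(voxels, min_size=2)))  # prints 1
```

Masking with the original map after labeling means dilation only decides which voxels belong together; it does not enlarge the lesions themselves.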
Next, associations between lesions in TR ($$$\ell_{tr}$$$) and lesions in PRED ($$$\ell_p$$$) were determined. For a given $$$\ell_{tr}$$$, an $$$\ell_p$$$ is associated with $$$\ell_{tr}$$$ if the two are sufficiently close to each other (e.g., separated by <5 voxels), and it overlaps $$$\ell_{tr}$$$ if any voxel is labeled cancer in both (Fig. 1f). For each $$$\ell_{tr}$$$, all associated $$$\ell_p$$$s were found, subject to the condition that each $$$\ell_p$$$ is associated with at most one $$$\ell_{tr}$$$. When an $$$\ell_p$$$ overlaps with $$$n>1$$$ lesions in TR, it is divided into $$$n$$$ lesions such that the voxels of partition $$$i$$$ are closer to lesion $$$i$$$ in TR than to any other.
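A minimal sketch of the association step follows, with lesions represented as sets of `(row, col)` voxels. Several details are assumptions rather than the authors' procedure: separation is taken as the smallest voxel-to-voxel Euclidean distance, a predicted lesion that overlaps at most one ground-truth lesion is assigned to the closest one, and the names `associate`, `split_by_nearest`, and `max_gap` are hypothetical.

```python
def min_distance(lesion_a, lesion_b):
    """Smallest Euclidean distance between any voxel of one lesion and any voxel of the other."""
    return min(
        ((ra - rb) ** 2 + (ca - cb) ** 2) ** 0.5
        for ra, ca in lesion_a
        for rb, cb in lesion_b
    )

def split_by_nearest(pred_lesion, tr_lesions):
    """Divide one predicted lesion into parts, one per ground-truth lesion it overlaps."""
    parts = [set() for _ in tr_lesions]
    for voxel in pred_lesion:
        nearest = min(
            range(len(tr_lesions)),
            key=lambda i: min_distance({voxel}, tr_lesions[i]),
        )
        parts[nearest].add(voxel)
    return parts

def associate(tr_lesions, pred_lesions, max_gap=5):
    """Return, for each ground-truth lesion, the predicted lesions (or parts) assigned to it."""
    if not tr_lesions:
        return []
    assigned = [[] for _ in tr_lesions]
    for pred in pred_lesions:
        overlapping = [i for i, tr in enumerate(tr_lesions) if pred & tr]
        if len(overlapping) > 1:
            # Overlaps several ground-truth lesions: partition voxels by nearest lesion.
            parts = split_by_nearest(pred, [tr_lesions[i] for i in overlapping])
            for i, part in zip(overlapping, parts):
                assigned[i].append(part)
        else:
            # Assign to the single closest ground-truth lesion, if it is close enough.
            distances = [min_distance(pred, tr) for tr in tr_lesions]
            closest = min(range(len(tr_lesions)), key=distances.__getitem__)
            if distances[closest] < max_gap:
                assigned[closest].append(pred)
    return assigned
```

Because ground-truth lesions are disjoint, every voxel in an overlap lies at distance zero from exactly one of them, so each partition produced by `split_by_nearest` is non-empty.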
After these pre-processing steps, a lesion-wise score $$$s_\ell$$$ was calculated for each $$$\ell_{tr}$$$ (Fig. 1g). $$$s_\ell$$$ was designed to satisfy the following:
1) $$$0 \leq s_\ell \leq 1$$$, with $$$s_\ell=0$$$ when no $$$\ell_p$$$s overlap and $$$s_\ell=1$$$ when $$$\ell_p=\ell_{tr}$$$.
2) $$$s_\ell$$$ increases as overlap and co-localization between $$$\ell_{tr}$$$ and associated $$$\ell_p$$$s improve, where co-localization is quantified by $$$d$$$, the distance between their centroids.
$$$s_\ell$$$ is based on the Jaccard similarity coefficient $$$J_c$$$ [8] (Fig. 2, Eq. 1-2), with two modifications that account for co-localization. The first is a weighting function $$$\omega$$$ (Fig. 2, Eq. 3) that weights the voxels of $$$\ell_{tr}$$$ such that voxels closer to the centroid contribute more heavily to $$$s_\ell$$$ than those at the periphery, which rewards co-localization of true positives (TPs) (Fig. 3a). The second is a distance penalty function $$$g(d)$$$ that penalizes poor co-localization of both TPs and false positives (FPs) (Fig. 3b).
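For orientation, the two ingredients that the score builds on can be sketched as follows: the standard, unmodified Jaccard coefficient between a ground-truth lesion and its associated predicted voxels, and the centroid distance $$$d$$$. The weighting function $$$\omega$$$ and the penalty $$$g(d)$$$ are defined in Fig. 2 and are deliberately not reproduced here, so this is the baseline only and not $$$s_\ell$$$ itself. The function names are hypothetical.

```python
def jaccard(tr_lesion, pred_voxels):
    """Standard Jaccard coefficient |A ∩ B| / |A ∪ B| for two voxel sets."""
    union = tr_lesion | pred_voxels
    return len(tr_lesion & pred_voxels) / len(union) if union else 0.0

def centroid(voxels):
    """Mean (row, col) position of a non-empty voxel set."""
    n = len(voxels)
    return (sum(r for r, _ in voxels) / n, sum(c for _, c in voxels) / n)

def centroid_distance(lesion_a, lesion_b):
    """Euclidean distance d between the centroids of two lesions."""
    (ra, ca), (rb, cb) = centroid(lesion_a), centroid(lesion_b)
    return ((ra - rb) ** 2 + (ca - cb) ** 2) ** 0.5

# A 2x2 ground-truth lesion and a prediction shifted by one column.
tr_lesion = {(0, 0), (0, 1), (1, 0), (1, 1)}
pred = {(0, 1), (0, 2), (1, 1), (1, 2)}
print(jaccard(tr_lesion, pred))            # 2 shared voxels / 6 in the union
print(centroid_distance(tr_lesion, pred))  # prints 1.0
```

The unmodified coefficient already satisfies the two boundary conditions (0 with no overlap, 1 for identical lesions); the modifications add sensitivity to where the overlap occurs.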
$$$s_\ell$$$ may be used in multiple ways. For example, by thresholding $$$s_\ell$$$, the number of lesions detected can be calculated. Additionally, a slice-wise score $$$s_s$$$ can be obtained by averaging all the $$$s_\ell$$$s (Fig. 2, Eq. 4) for a given slice (Fig. 1g).
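Both uses reduce to simple operations on the list of per-lesion scores for a slice, sketched below. The inclusive comparison (`>=`) and the handling of a slice with no ground-truth lesions are assumptions; the exact form of Eq. 4 is given in Fig. 2.

```python
def count_detected(lesion_scores, threshold=0.5):
    """Number of ground-truth lesions whose score reaches the detection threshold."""
    return sum(1 for s in lesion_scores if s >= threshold)

def slice_score(lesion_scores):
    """Slice-wise score: the mean of the lesion-wise scores, or None if there are no lesions."""
    if not lesion_scores:
        return None
    return sum(lesion_scores) / len(lesion_scores)

scores = [1.0, 0.5, 0.0]       # hypothetical lesion-wise scores for one slice
print(count_detected(scores))  # prints 2
print(slice_score(scores))     # prints 0.5
```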
To characterize the proposed metrics and compare them to voxel-wise metrics, PREDs were synthesized that achieve either target sensitivity and specificity, or target $$$s_\ell$$$ and/or $$$s_s$$$, on 46 TRs (obtained from our previous work [6]). In the pre-processing step, a size threshold of 50 voxels was applied. For $$$s_\ell$$$, constants $$$a_\omega=1.2$$$, $$$a_1=7$$$, and $$$a_2=1.05$$$ were chosen.
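The synthesis procedure itself is not described here. One simple way to obtain a PRED with a target voxel-wise sensitivity and specificity is to keep a fixed fraction of the cancer voxels and flip a fixed fraction of the non-cancer voxels at random, as in the sketch below. The spatially uncorrelated sampling, the function names, and the `seed` parameter are all assumptions, and the sketch does not cover synthesis toward a target $$$s_\ell$$$ or $$$s_s$$$.

```python
import random

def synthesize_pred(tr_voxels, all_voxels, sensitivity, specificity, seed=0):
    """Build a predicted cancer set that hits the target rates up to rounding."""
    rng = random.Random(seed)
    positives = sorted(tr_voxels)
    negatives = sorted(set(all_voxels) - set(tr_voxels))
    n_tp = round(sensitivity * len(positives))          # cancer voxels kept
    n_fp = round((1.0 - specificity) * len(negatives))  # non-cancer voxels flipped
    return set(rng.sample(positives, n_tp)) | set(rng.sample(negatives, n_fp))

def sensitivity_specificity(tr_voxels, pred_voxels, all_voxels):
    """Voxel-wise sensitivity and specificity (assumes both classes are present)."""
    tr, pred, everything = set(tr_voxels), set(pred_voxels), set(all_voxels)
    tp = len(tr & pred)
    fn = len(tr - pred)
    fp = len(pred - tr)
    tn = len(everything - tr - pred)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical 20x20 slice containing a 5x4 cancer region.
grid = {(r, c) for r in range(20) for c in range(20)}
tr = {(r, c) for r in range(5) for c in range(4)}
pred = synthesize_pred(tr, grid, sensitivity=0.8, specificity=0.9)
print(sensitivity_specificity(tr, pred, grid))
```

Fixing the counts rather than flipping each voxel independently makes the achieved rates match the targets exactly, apart from rounding to whole voxels.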
Figure 4a demonstrates how changing the overlap between $$$\ell_{tr}$$$ and $$$\ell_p$$$ affects $$$s_\ell$$$. Figure 4b shows representative PREDs and $$$s_\ell$$$s for a given TR.
$$$s_s$$$ was calculated over the 46 TRs and averaged across 100 synthetically-generated PREDs for varying voxel-wise sensitivity/specificity pairs (Table 1). Lesion detection statistics were calculated using a threshold of $$$s_\ell=0.5$$$ (Table 2). In general, improvements in specificity increased $$$s_s$$$ more than improvements in sensitivity. This is because non-cancer voxels typically far outnumber cancer voxels, so a given gain in specificity removes many more FPs than the same gain in sensitivity removes false negatives (FNs), and $$$s_\ell$$$ and $$$s_s$$$ penalize FNs and FPs equally.
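The effect of class imbalance can be illustrated with a short calculation. The sketch below uses the unmodified Jaccard coefficient over a whole slice as a stand-in for the proposed scores, and the voxel counts (200 cancer, 2000 non-cancer) are hypothetical and not taken from the data set.

```python
def jaccard_from_rates(n_cancer, n_benign, sensitivity, specificity):
    """Jaccard coefficient TP / (TP + FN + FP) implied by voxel-wise rates."""
    tp = sensitivity * n_cancer
    fn = n_cancer - tp
    fp = (1.0 - specificity) * n_benign
    return tp / (tp + fn + fp)

# Losing 0.1 in sensitivity adds 20 FNs; losing 0.1 in specificity adds 200 FPs.
print(round(jaccard_from_rates(200, 2000, 0.9, 1.0), 3))  # prints 0.9
print(round(jaccard_from_rates(200, 2000, 1.0, 0.9), 3))  # prints 0.5
```

With a tenfold imbalance, the same 0.1 change costs five times as much overlap when it comes from specificity as when it comes from sensitivity.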
1. Hricak H. MR imaging and MR spectroscopic imaging in the pre-treatment evaluation of prostate cancer. The British journal of radiology. 2005;78 Spec No 2:S103-11. doi: 10.1259/bjr/11253478.
2. Xu S, Kruecker J, Turkbey B, Glossop N, Singh AK, Choyke P, Pinto P, Wood BJ. Real-time MRI-TRUS fusion for guidance of targeted prostate biopsies. Computer aided surgery: official journal of the International Society for Computer Aided Surgery. 2008;13(5):255-64. doi: 10.3109/10929080802364645.
3. Chan I, Wells W, 3rd, Mulkern RV, Haker S, Zhang J, Zou KH, Maier SE, Tempany CM. Detection of prostate cancer by integration of line-scan diffusion, T2-mapping and T2-weighted magnetic resonance imaging; a multichannel statistical classifier. Med Phys. 2003;30(9):2390-8.
4. Vos PC, Hambrock T, Hulsbergen-van de Kaa CA, Futterer JJ, Barentsz JO, Huisman HJ. Computerized analysis of prostate lesions in the peripheral zone using dynamic contrast enhanced MRI. Med Phys. 2008;35(3):888-99.
5. Tiwari P, Kurhanewicz J, Madabhushi A. Multi-kernel graph embedding for detection, Gleason grading of prostate cancer via MRI/MRS. Medical image analysis. 2013;17(2):219-35. doi: 10.1016/j.media.2012.10.004.
6. Metzger GJ, Kalavagunta C, Spilseth B, Bolan PJ, Li X, Hutter D, Nam JW, Johnson AD, Henriksen JC, Moench L, Konety B, Warlick CA, Schmechel SC, Koopmeiners JS. Detection of Prostate Cancer: Quantitative Multiparametric MR Imaging Models Developed Using Registered Correlative Histopathology. Radiology. 2016;279(3):805-16. doi: 10.1148/radiol.2015151089.
7. Rais-Bahrami S, Turkbey B, Rastinehad AR, Walton-Diaz A, Hoang AN, Siddiqui MM, Stamatakis L, Truong H, Nix JW, Vourganti S, Grant KB, Merino MJ, Choyke PL, Pinto PA. Natural history of small index lesions suspicious for prostate cancer on multiparametric MRI: recommendations for interval imaging follow-up. Diagn Interv Radiol. 2014;20(4):293-8. doi: 10.5152/dir.2014.13319.
8. Levandowsky M, Winter D. Distance between Sets. Nature. 1971;234(5323):34-35. doi: 10.1038/234034a0.