4963

Spatially-sensitive model for detection of prostate cancer on multiparametric MRI

Ethan Leng¹, Jin Jin², Lin Zhang², Christopher A. Warlick³, Benjamin Spilseth⁴, Joseph S. Koopmeiners², and Gregory J. Metzger¹

¹Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, MN, United States, ²Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, United States, ³Department of Urologic Surgery, Institute of Prostate and Urologic Cancers, University of Minnesota, Minneapolis, MN, United States, ⁴Department of Radiology, University of Minnesota, Minneapolis, MN, United States

Synopsis

A novel predictive model of prostate cancer (PCa) on multiparametric MRI (mpMRI) was developed that takes into account the spatial distribution of PCa within the prostate and the spatially-autocorrelated nature of mpMRI data. The performance of the proposed model was compared to that of the LASSO-based model we previously described on 34 PCa cases using both a voxel-wise metric (AUC) and a slice-wise metric ($$$s_s$$$) we recently developed. The proposed model achieved superior predictive performance in terms of both AUC (0.81 vs. 0.77) and $$$s_s$$$ (0.45 vs. 0.35) over the 34 cases, with significant improvements for the majority of cases.

Rationale

There has been interest recently in using quantitative multiparametric MRI (mpMRI) for detecting prostate cancer (PCa) and assessing the clinical significance of disease. One major focus is the development of computational models that use mpMR data to make predictions about PCa status.¹⁻⁴ Previously, we described a LASSO-based 5-parameter model that demonstrated superior voxel-wise classification performance as compared to any single MR parameter.⁵ Here, we describe a novel model that improves upon our existing work by employing a machine-learning framework with corrections that address the spatial effects present in the mpMR data.

Methods

Data used for model development were acquired following protocols we previously described.⁵ Briefly, patients received mpMR scans at 3T under an approved IRB protocol. Histopathology data were obtained through processing of post-surgical ex vivo prostate specimens from patients who underwent radical prostatectomy. MR slices containing index lesions were then co-registered to the pathology data. In total, 34 patient cases containing 46 slices of interest and more than 100,000 voxels were included for modeling. For each voxel, its mpMR values were used as modeling inputs (features), while its cancer status (label) was used as the ground truth. The five mpMR parameters used for modeling were the tissue T2 value, the ADC value, and three DCE-MRI parameters derived from pharmacokinetic curve fitting ($$$K^{trans}$$$, $$$k_{ep}$$$, AUGC).

In our previous work, it was implicitly assumed that voxels could be treated as if they were drawn from the same joint probability distribution, meaning that the expected parameter values and PCa likelihood for a given voxel are the same as those of any other voxel. However, this assumption does not hold, because the expected parameter values and PCa likelihood for a given voxel depend on those of its neighbors and also vary among the anatomical regions of the prostate (with the highest PCa likelihood in the peripheral zone).⁶

To account for these spatial dependencies, the features were augmented in two ways. First, the relative xy-position of each voxel was included as two additional features. This locates the voxel within the prostate and serves as a surrogate for anatomical location without the need to perform segmentation. Second, the mpMR parameters of all immediately adjacent neighbors of a voxel (the eight surrounding voxels in a 3×3 square) were included as additional features. This is a simple way to incorporate spatial autocorrelation into the model. Altogether, each voxel was associated with 47 features: 5 × 9 = 45 parameter values plus the two position features (Fig. 1).
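The feature assembly described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used in this work: the function names, the order of the parameter maps, the use of a binary capsule mask, and the clamping of neighbor indices at the image border are all assumptions made for the example.

```python
def relative_position(index, lo, hi):
    """Map an absolute index to [-1, 1] given the prostate extent [lo, hi]."""
    if hi == lo:
        return 0.0
    return 2.0 * (index - lo) / (hi - lo) - 1.0


def augmented_features(maps, mask, row, col):
    """Build the 47-element feature vector for the voxel at (row, col).

    maps: list of 5 parameter maps (2D lists); the order is hypothetical,
          e.g. [T2, ADC, Ktrans, kep, AUGC].
    mask: 2D list of 0/1 marking voxels inside the prostate capsule.
    Neighbor indices are clamped at the image border (an assumption; the
    abstract does not state how border voxels were handled).
    """
    n_rows, n_cols = len(mask), len(mask[0])
    # Extent of the capsule in y (rows) and x (columns).
    rows = [r for r in range(n_rows) if any(mask[r])]
    cols = [c for c in range(n_cols) if any(mask[r][c] for r in range(n_rows))]
    features = []
    for pmap in maps:  # 5 parameters x 9 voxels = 45 values
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                r = min(max(row + dr, 0), n_rows - 1)
                c = min(max(col + dc, 0), n_cols - 1)
                features.append(pmap[r][c])
    features.append(relative_position(col, cols[0], cols[-1]))  # relative x
    features.append(relative_position(row, rows[0], rows[-1]))  # relative y
    return features


# Toy 4x4 slice with synthetic parameter values (not real data).
size = 4
maps = [[[100 * p + size * r + c for c in range(size)] for r in range(size)]
        for p in range(5)]
mask = [[0, 0, 0, 0],
        [0, 1, 1, 1],
        [0, 1, 1, 1],
        [0, 1, 1, 1]]
vec = augmented_features(maps, mask, 2, 2)
print(len(vec))  # 47
```

Because the position features are computed from the capsule extent alone, no zonal segmentation is needed to produce them.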

Classification was performed with gradient-boosted decision trees (GBDTs), implemented in Python using the XGBoost package.⁷ GBDTs are an ensemble learning approach that starts with a weak decision tree and iteratively adds trees to improve the fit to the training data (boosting). The final prediction is then a weighted sum of the predictions of all of the decision trees (Fig. 2). This algorithm was selected primarily for its fast training times and strong results in many machine-learning competitions.⁷
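To make the boosting idea concrete, the following is a toy, standard-library-only sketch of gradient boosting with one-split decision trees (stumps) on the logistic loss. It is not XGBoost and not the model trained in this work: the function names are hypothetical, a fixed learning rate stands in for the per-iteration weight, and the four training points are synthetic.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Least-squares fit of a single-split tree to the residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for thr in values[:-1]:  # both sides of the split stay non-empty
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    if best is None:  # every feature is constant: predict the mean residual
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]


def stump_predict(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value


def fit_gbdt(X, y, n_trees=20, learning_rate=0.5):
    """Boost stumps on the log-loss; assumes both classes occur in y."""
    pos = sum(y) / len(y)
    base = math.log(pos / (1.0 - pos))  # initial score: class log-odds
    scores = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        # Negative gradient of the log-loss with respect to the score.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_predict(stump, row)
                  for s, row in zip(scores, X)]
    return base, stumps, learning_rate


def predict_proba(model, row):
    """Final prediction: sigmoid of the weighted sum over all trees."""
    base, stumps, learning_rate = model
    total = sum(stump_predict(s, row) for s in stumps)
    return sigmoid(base + learning_rate * total)


# Synthetic example: the first feature separates the two classes.
X = [[0.1, 5.0], [0.2, 1.0], [0.8, 4.0], [0.9, 2.0]]
y = [0, 0, 1, 1]
model = fit_gbdt(X, y)
low = predict_proba(model, [0.15, 3.0])
high = predict_proba(model, [0.85, 3.0])
```

In this toy example `low` falls below 0.5 and `high` above it, since every stump splits on the first feature. Production libraries add regularization, deeper trees, and second-order loss information that this sketch omits.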

The performance of the proposed GBDTs-based model was compared with that of the LASSO-based model we previously described in two ways. The area under the receiver operating characteristic curve (AUC), a balanced measure of voxel-wise sensitivity and specificity, was calculated for all cases. A slice-wise metric that we recently developed ($$$s_s$$$), based on the overlap between distinct foci of PCa (or lesions) in the prediction and ground truth (Fig. 3), was also calculated for all cases.
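As a reminder of what the voxel-wise AUC measures, the sketch below computes it from ranks, that is, as the probability that a randomly chosen cancer voxel receives a higher score than a randomly chosen benign voxel, with ties counted as one half. The function name and the example values are illustrative assumptions, and the sketch assumes both classes are present.

```python
def roc_auc(labels, scores):
    """Rank-based AUC (Mann-Whitney U statistic divided by n_pos * n_neg)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average 1-based rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Synthetic labels (1 = cancer voxel) and predicted scores.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Because AUC depends only on the ordering of voxel scores, it is insensitive to whether the high-scoring voxels form contiguous lesions, which is one motivation for also reporting a lesion-overlap metric.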

Results

For both models, a case-based leave-one-out cross-validation scheme was used to train and assess performance; for a given PCa case, the models were first trained using the data from the other 33 cases, then evaluated on the case itself. Table 1 shows the cumulative AUC and $$$s_s$$$ across all cases for both models.
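A case-based split of this kind can be sketched as follows, assuming each voxel carries the identifier of the case it came from. The function name and the toy identifiers are hypothetical; the point is that all voxels of the held-out case are excluded from training together.

```python
def leave_one_case_out(case_ids):
    """Yield (held_out_case, train_indices, test_indices) for each case."""
    for held_out in sorted(set(case_ids)):
        train = [i for i, c in enumerate(case_ids) if c != held_out]
        test = [i for i, c in enumerate(case_ids) if c == held_out]
        yield held_out, train, test


# Toy example: six voxels drawn from three cases.
case_ids = ["A", "A", "B", "C", "C", "C"]
folds = list(leave_one_case_out(case_ids))
for held_out, train, test in folds:
    print(held_out, train, test)
```

Splitting by case rather than by voxel keeps neighboring, highly correlated voxels from the same patient out of the training set when that patient is being evaluated.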

In terms of AUC, the proposed model performed significantly better (ΔAUC ≥ 0.05) on 19/34 cases. Similarly, in terms of $$$s_s$$$, the proposed model performed significantly better (Δ$$$s_s$$$ ≥ 0.1) on 25/34 cases. Although the two metrics were generally in agreement, Figure 4 shows representative prediction maps for specific slices in which the differences between the two models as measured by AUC and by $$$s_s$$$ disagreed. In these slices, the new evaluation metric $$$s_s$$$ appeared to reflect the differences in prediction quality more accurately than AUC.

Discussion

Incorporating features that account for the spatial characteristics of mpMRI data improved predictive performance as measured by both voxel-wise and slice-wise metrics. However, as Figure 4 illustrates, model performance varies substantially among cases. This is most likely explained by differences in the joint probability distribution of mpMR parameters and PCa likelihood among different patients and different grades of PCa, which we plan to address in future work.

Acknowledgements

Supported by: NCI R01 CA155268, NIBIB P41 EB015894, DOD/PCRP W81XWH-15-1-0477, MN-REACH.

References

1. Chan I, Wells W 3rd, Mulkern RV, Haker S, Zhang J, Zou KH, Maier SE, Tempany CM. Detection of prostate cancer by integration of line-scan diffusion, T2-mapping and T2-weighted magnetic resonance imaging; a multichannel statistical classifier. Med Phys. 2003;30(9):2390-8.

2. Artan Y, Haider MA, Langer DL, van der Kwast TH, Evans AJ, Yang Y, Wernick MN, Trachtenberg J, Yetik IS. Prostate cancer localization with multispectral MRI using cost-sensitive support vector machines and conditional random fields. IEEE Trans Image Process. 2010;19(9):2444-55. doi: 10.1109/tip.2010.2048612.

3. Tiwari P, Kurhanewicz J, Madabhushi A. Multi-kernel graph embedding for detection, Gleason grading of prostate cancer via MRI/MRS. Med Image Anal. 2013;17(2):219-35. doi: 10.1016/j.media.2012.10.004.

4. Litjens G, Debats O, Barentsz J, Karssemeijer N, Huisman H. Computer-aided detection of prostate cancer in MRI. IEEE Trans Med Imaging. 2014;33(5):1083-92. doi: 10.1109/tmi.2014.2303821.

5. Metzger GJ, Kalavagunta C, Spilseth B, Bolan PJ, Li X, Hutter D, Nam JW, Johnson AD, Henriksen JC, Moench L, Konety B, Warlick CA, Schmechel SC, Koopmeiners JS. Detection of Prostate Cancer: Quantitative Multiparametric MR Imaging Models Developed Using Registered Correlative Histopathology. Radiology. 2016;279(3):805-16. doi: 10.1148/radiol.2015151089.

6. Kumar V, Abbas AK, Aster JC. Robbins & Cotran Pathologic Basis of Disease. 9th ed. Philadelphia: Elsevier Saunders; c2015. 1408 p.

7. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016:785-94. doi: 10.1145/2939672.2939785.

8. Levandowsky M, Winter D. Distance between Sets. Nature. 1971;234(5323):34-35. doi: 10.1038/234034a0.

Figures

Figure 1. Schematic demonstrating the assembly of the augmented feature vector. For each voxel, all of the mpMR parameter values in a 3×3 square centered on the voxel are obtained (5 × 9 = 45 parameter values). The relative xy-position of the voxel is calculated by scaling the extent of the prostate capsule in x and y to the range $$$[-1, 1]$$$ and then applying the scaling to the absolute xy-position of the voxel. These are then assembled into a single feature vector with 47 features.

Figure 2. Brief overview of gradient-boosted decision trees. The model is initialized with a weak decision tree $$$m_0$$$. At each iteration $$$i$$$, a new decision tree $$$t_i$$$ is fit to the residual of $$$L(y, m_{i-1})$$$, the error function calculated for $$$m_{i-1}$$$. With gradient boosting, $$$t_i$$$ approximates the negative gradient $$$-\nabla L(y, m_{i-1})$$$ evaluated at the points of the training set. The improved model is then given by $$$m_i=m_{i-1}+\gamma_i t_i$$$, where the weight $$$\gamma_i$$$ is chosen to minimize $$$L(y, m_{i-1}+\gamma_i t_i)$$$. At the end of $$$N$$$ iterations, the final model can be viewed as a weighted sum of all of the decision trees.

Figure 3. Brief description of the slice-wise metric $$$s_s$$$ that we recently developed. The following are involved in calculating $$$s_s$$$: (a) Identification of discrete lesions in both ground truth (TR) and prediction (PRED) maps. (b) Calculation of $$$s_\ell$$$, a lesion-wise score we developed based on the Jaccard similarity coefficient⁸ ($$$J_c$$$) that quantifies the degree of overlap and co-localization of predicted lesions ($$$\ell_p$$$) as a measure of prediction quality for each ground truth lesion ($$$\ell_{tr}$$$). (c) Calculation of $$$s_s$$$ as a weighted average of the $$$s_\ell$$$s for a given slice (or across multiple slices). (d) Summary for the example in (a).
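The two building blocks named in this caption, identification of discrete lesions and the Jaccard similarity coefficient, can be illustrated with the sketch below. The exact definitions of $$$s_\ell$$$ and of the weights used for $$$s_s$$$ are not given in this abstract, so `illustrative_slice_score` is only a hypothetical stand-in (Jaccard overlap per ground truth lesion, averaged with lesion-area weights), and the choice of 4-connectivity is likewise an assumption.

```python
from collections import deque


def find_lesions(binary_map):
    """Return discrete lesions as sets of (row, col), using 4-connectivity."""
    n_rows, n_cols = len(binary_map), len(binary_map[0])
    seen = set()
    lesions = []
    for r in range(n_rows):
        for c in range(n_cols):
            if binary_map[r][c] and (r, c) not in seen:
                lesion = set()
                queue = deque([(r, c)])
                seen.add((r, c))
                while queue:  # breadth-first flood fill
                    cr, cc = queue.popleft()
                    lesion.add((cr, cc))
                    for nr, nc in ((cr - 1, cc), (cr + 1, cc),
                                   (cr, cc - 1), (cr, cc + 1)):
                        if (0 <= nr < n_rows and 0 <= nc < n_cols
                                and binary_map[nr][nc]
                                and (nr, nc) not in seen):
                            seen.add((nr, nc))
                            queue.append((nr, nc))
                lesions.append(lesion)
    return lesions


def jaccard(a, b):
    """Jaccard similarity coefficient |A & B| / |A | B| of two voxel sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def illustrative_slice_score(truth_map, pred_map):
    """Hypothetical stand-in for s_s, NOT the authors' definition."""
    true_lesions = find_lesions(truth_map)
    pred_lesions = find_lesions(pred_map)
    total_area = sum(len(t) for t in true_lesions)
    if total_area == 0:
        return 0.0
    score = 0.0
    for t in true_lesions:
        # Union of all predicted lesions that touch this true lesion.
        overlapping = set()
        for p in pred_lesions:
            if t & p:
                overlapping |= p
        score += len(t) * jaccard(t, overlapping)
    return score / total_area


# Synthetic maps: two true lesions, one partially overlapping prediction.
truth = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 1]]
pred = [[1, 1, 1, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
score = illustrative_slice_score(truth, pred)
```

Unlike a voxel-wise metric, a score of this kind penalizes a missed lesion explicitly: the single-voxel lesion in the lower right contributes zero regardless of how well the larger lesion is predicted.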

Table 1. Comparison of the (a) cumulative AUC and (b) cumulative $$$s_s$$$ of PCa detection for our recently-published LASSO model vs. GBDTs with/without the described corrections for spatial effects. For calculation of $$$s_s$$$, a sensitivity/specificity pair was chosen from the receiver operating characteristic curve of each model to maximize $$$s_s$$$ (48% sensitivity, 88% specificity for LASSO; 56% sensitivity, 90% specificity for GBDTs).

Figure 4. Comparisons of prediction maps for the LASSO-based model vs. the proposed GBDTs-based model illustrating disagreements between the voxel-wise AUC and our slice-wise $$$s_s$$$. (a) and (b) Cases in which the models achieved similar AUCs but significantly different $$$s_s$$$. (c) Case in which the proposed model was significantly worse in terms of AUC but achieved the same $$$s_s$$$. (d) Case in which the superior performance of the proposed model appears more significant by AUC than by $$$s_s$$$. In each of the four cases, it appears that $$$s_s$$$ more accurately reflects the quality of the predictions.

Proc. Intl. Soc. Mag. Reson. Med. 25 (2017)
