Shiwen Shen1,2, Xinran Zhong1,3, William Hsu1, Alex Bui1, Holden Wu1, Michael Kuo1, Steven Raman1, Daniel Margolis1, and Kyunghyun Sung1
1Department of Radiological Sciences, University of California, Los Angeles, Los Angeles, CA, United States, 2Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, United States, 3Physics and Biology in Medicine IDP, University of California, Los Angeles, Los Angeles, CA, United States
Synopsis
We present a novel automatic classification method to distinguish between indolent and clinically significant prostatic carcinoma using multi-parametric MRI (mp-MRI). The main contributions are 1) using a state-of-the-art deep learning method, the pre-trained convolutional neural network OverFeat, to characterize lesions in mp-MRI, 2) building a hybrid two-stage classification model that combines deep learning and conventional statistical features, and 3) avoiding annotation of lesion boundaries and anatomical-location-specific training. The proposed method was evaluated on 102 prostate cancer lesions and achieved significantly higher accuracy than a method using traditional statistical features alone.
Purpose
Multi-parametric MRI (mp-MRI) is a promising imaging modality for the
detection and grading of prostatic carcinoma (PCa) [1], but current mp-MRI
scoring systems, such as PI-RADS v2 [1], are generally subjective and have a limited
ability to distinguish between indolent and clinically significant (CS) PCa. Automatic
classification algorithms to improve the current scoring systems are
an active research area [2] but typically require precise suspicious lesion
boundaries, anatomical information, and carefully designed handcrafted features.
Deep learning, a novel machine learning method, has recently garnered attention
because of its superior performance in image recognition. However, applying
deep learning to medical imaging diagnosis is non-trivial because it typically
requires massive clinical datasets for training. In this work, we propose an
automatic classification method that can overcome the limitation of small
clinical datasets by combining deep features extracted from a pre-trained
convolutional neural network (CNN), known as OverFeat [3], and conventional
statistical features [4].
Methods
With IRB approval, a study cohort of 68 consecutive men who underwent
3.0T mp-MRI (Skyra and Trio, Siemens Healthcare) prior to radical prostatectomy
was included (6/2010–9/2014). Each mp-MRI study, including T2-weighted
(T2w), DWI, and DCE images, was correlated with whole-mount histopathology by experienced
GU pathologists, and lesions were matched with respect to location, size, and
Gleason Score (GS). Indolent PCa was defined as GS ≤ 6 and
CS PCa as GS ≥ 7. A total of 102 lesions were identified, comprising 48 indolent
and 54 CS lesions.
Figure 1 illustrates our proposed method. The middle slices of the regions of
interest (ROIs) suspicious for PCa (annotated by squares) in the T2w, ADC,
and DCE (Ktrans) images are interpolated and rescaled to 512×512 pixels
(“Pre-processing”). Two training stages are used to obtain the final decision. In the first stage, the pre-trained OverFeat network [3] is used to overcome the small
sample size [5]. Deep features from the last convolutional layer (layer 21 in
OverFeat) are extracted for each T2w (f_T2), ADC (f_ADC),
and Ktrans (f_K) image separately. Three linear SVM
classifiers are then trained on f_T2, f_ADC, and f_K,
respectively. In the second stage, the decision values from the three
classifiers are combined with six statistical features to train a Gaussian
radial basis function (RBF) kernel SVM classifier, which outputs the final
decision (indolent vs. CS). The statistical features (f_s) include the skewness of the intensity
histogram in T2w images, the average ADC value, the lowest 10th-percentile
ADC value, the average Ktrans, the highest 10th-percentile Ktrans
value, and the ROI size in T2w images [1].
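As an illustration of how f_s might be computed, the sketch below derives the six statistical features from NumPy arrays holding the ROI voxel values. The function name and the reading of "highest 10th percentile" as the 90th percentile are our assumptions, not details taken from the original implementation.

import numpy as np
from scipy.stats import skew

def statistical_features(t2w_roi, adc_roi, ktrans_roi):
    """Compute the six statistical features f_s described above.

    Each argument is a NumPy array holding the ROI voxel values in
    the corresponding parameter map (illustrative sketch only).
    """
    return np.array([
        skew(t2w_roi.ravel()),           # skewness of the T2w intensity histogram
        adc_roi.mean(),                  # average ADC value
        np.percentile(adc_roi, 10),      # lowest 10th-percentile ADC value
        ktrans_roi.mean(),               # average Ktrans
        np.percentile(ktrans_roi, 90),   # highest 10th-percentile Ktrans (read here as the 90th percentile)
        t2w_roi.size,                    # ROI size (pixel count) in the T2w image
    ])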
The training process is designed as follows. First, the whole dataset is
randomly divided into five folds of similar size. One fold is selected as
the test set (IMAGE_test) and the other four folds form the training set (IMAGE_train).
IMAGE_train is then equally and randomly divided into two
subsets, IMAGE_train1 and IMAGE_train2. IMAGE_train1
is used to train the three linear SVMs in stage 1, with leave-one-out
cross-validation for selecting the optimal parameters. The three
trained classifiers are then applied to IMAGE_train2, generating
prediction score vectors. With these prediction scores and f_s, IMAGE_train2
is used to train the RBF SVM in stage 2, and the performance of the prediction is
measured on IMAGE_test. The whole procedure is repeated five times
(five-fold cross-validation), with each fold used as the test set
once. The final classification results are the average
performance over the five folds.
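A minimal sketch of this training procedure, using scikit-learn, is shown below. The deep-feature matrices stand in for the OverFeat layer-21 activations (their extraction is omitted here), the leave-one-out hyperparameter search is elided for brevity, and all function and variable names are ours rather than the authors' code.

import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

def two_stage_cv(F_t2, F_adc, F_k, F_s, y, seed=0):
    """Five-fold evaluation of the two-stage model (illustrative sketch).

    F_t2, F_adc, F_k: deep-feature matrices (n_lesions x n_features),
    F_s: the six statistical features (n_lesions x 6),
    y: NumPy label array (0 = indolent, 1 = CS).
    """
    deep = (F_t2, F_adc, F_k)
    aucs = []
    outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    for train_idx, test_idx in outer.split(F_s, y):
        # Split the training folds in half: IMAGE_train1 and IMAGE_train2.
        idx1, idx2 = train_test_split(train_idx, test_size=0.5,
                                      random_state=seed,
                                      stratify=y[train_idx])
        # Stage 1: one linear SVM per parameter map, trained on IMAGE_train1.
        stage1 = [SVC(kernel="linear").fit(F[idx1], y[idx1]) for F in deep]

        def scores(idx):
            # Decision values of the three stage-1 classifiers.
            return np.column_stack([clf.decision_function(F[idx])
                                    for clf, F in zip(stage1, deep)])

        # Stage 2: RBF-kernel SVM on the three decision values plus f_s,
        # trained on IMAGE_train2.
        stage2 = SVC(kernel="rbf").fit(
            np.hstack([scores(idx2), F_s[idx2]]), y[idx2])
        # Evaluate on the held-out fold IMAGE_test.
        test_scores = stage2.decision_function(
            np.hstack([scores(test_idx), F_s[test_idx]]))
        aucs.append(roc_auc_score(y[test_idx], test_scores))
    return float(np.mean(aucs))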
Results and Discussion
To evaluate the effectiveness of the proposed system, we built four
comparison classification models: four separate SVMs trained using only f_s,
f_T2, f_ADC, or f_K, respectively. The performance of these
models is also evaluated with five-fold cross-validation on the whole
dataset. The results are measured using the mean area under the ROC curve (AUC), mean
accuracy, mean sensitivity, and mean specificity (Table 1). Figure 2 shows the
receiver operating characteristic (ROC) curves. The proposed model achieves the
highest performance of all models. The baseline model using the six statistical
features alone achieves the lowest performance, mainly due to the lack of accurate lesion contours
and anatomical-location-specific training. The results also suggest that deep
features contribute significantly to the improvement in performance.
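For concreteness, the per-fold metrics can be derived from the stage-2 decision values along the following lines (an illustrative sketch, assuming CS is the positive class; this is not the authors' evaluation code):

import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def fold_metrics(y_true, scores, threshold=0.0):
    """Accuracy, sensitivity, specificity, and AUC for one test fold.

    scores: SVM decision values; lesions with score > threshold are called CS.
    """
    y_pred = (np.asarray(scores) > threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # CS lesions correctly called CS
        "specificity": tn / (tn + fp),   # indolent lesions correctly called indolent
        "auc": roc_auc_score(y_true, scores),
    }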
Conclusion
We present a novel and effective framework for improved mp-MRI-driven classification
of indolent vs. clinically significant PCa, combining deep learning and
conventional statistical features. The proposed model achieves significantly
higher accuracy in distinguishing indolent from clinically significant PCa
without requiring precise segmentation of lesion boundaries or location-specific training. Our
method has the potential to improve upon subjective, radiologist-based assessment in
the detection and grading of suspicious areas on mp-MRI.
Acknowledgements
This study was supported in part by the National Science Foundation (NSF) under Grant No. NSF CCF-1436827.
References
[1] Weinreb JC, Barentsz JO, Choyke PL, Cornud F, Haider MA, Macura KJ, Margolis D, Schnall MD, Shtern F, Tempany CM, Thoeny HC, Verma S. PI-RADS Prostate Imaging - Reporting and Data System: 2015, Version 2. Eur Urol. 2015 Sep 28. pii: S0302-2838(15)00848-9.
[2] Wang S, Burtt K, Turkbey B, Choyke P, Summers RM. Computer aided-diagnosis of prostate cancer on multiparametric MRI: a technical review of current research. BioMed Research International. 2014.
[3] Sermanet P, Eigen D, Zhang X, Mathieu M, Fergus R, LeCun Y. OverFeat: integrated recognition, localization and detection using convolutional networks. arXiv preprint arXiv:1312.6229. 2013.
[4] Peng Y, et al. Quantitative analysis of multiparametric prostate MR images: differentiation between prostate cancer and normal tissue and correlation with Gleason score—a computer-aided diagnosis development study. Radiology. 2013;267(3):787-796.
[5] Ciompi F, de Hoop B, van Riel SJ, Chung K, Scholten ET, Oudkerk M, de Jong PA, Prokop M, van Ginneken B. Automatic classification of pulmonary peri-fissural nodules in computed tomography using an ensemble of 2D views and a convolutional neural network out-of-the-box. Medical Image Analysis. 2015.