Feature selection is a key aspect of radiomics analyses. An approach is presented to remove features that are not stable with respect to small variations of the segmentation mask. The rejection is agnostic to the target class and can be combined with target-class-based selections. When the proposed approach is used in a simple machine learning setup on MRI of prostate cancer patients, classification performance increases by about 5 percentage points.
Introduction
Radiomics tools can easily produce more than 1000 features; feature selection is therefore a key ingredient for a stable classification algorithm [1]. Features that are not stable with respect to small variations of the segmentation mask of the region of interest are undesirable and should be removed. The approach proposed here identifies the unstable features while being agnostic to the target class of the classification problem.
A dataset of 86 patients with histologically proven, untreated prostate cancer was used in a radiomics analysis. T2-weighted images and ADC maps were obtained for all patients using 3T MRI scanners. Expert radiologists manually segmented the tumors on the T2-weighted images and on the ADC maps individually. Features were calculated on both the T2-weighted image and the ADC map following the prescriptions of the "image biomarker standardisation initiative" (IBSI). They include shape features of the segmentation and first-order features of the voxel intensities in the segmented region of interest. In addition, texture features were calculated from discretized gray-level matrices. In total, 1482 features were calculated with the pyradiomics package [2], using different filters such as wavelets, Laplacian of Gaussian and local binary patterns.
To assess the influence of the exact delineation of the segmentation, the features were recalculated for variations of the segmentation masks. For each patient, the masks were dilated and eroded by one pixel in the plane of high resolution. For each feature, the intraclass correlation coefficient (ICC) was used to compare the change of a patient's feature value under these variations with the distribution of that feature's values across all patients. Figure 1 shows the ICC for all features, together with the window of ICC between 0.6 and 0.8.
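The stability check described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the study: the in-plane neighbourhood is taken to be 4-connected, axis 0 is assumed to be the slice axis, the ICC form is assumed to be the one-way random-effects ICC(1,1) because the text does not name the variant, and the threshold of 0.8 is only an example taken from the upper edge of the window mentioned above. The helper names `dilate_in_plane`, `erode_in_plane` and `icc_oneway` are hypothetical, and the feature values are simulated rather than extracted from images.

```python
import numpy as np


def dilate_in_plane(mask):
    """Grow a (slice, row, col) boolean mask by one pixel within each slice."""
    m = np.asarray(mask, dtype=bool)
    p = np.pad(m, ((0, 0), (1, 1), (1, 1)))  # pad only the in-plane axes
    # Union of the mask with its four in-plane neighbours (assumed 4-connectivity).
    return (p[:, 1:-1, 1:-1] | p[:, :-2, 1:-1] | p[:, 2:, 1:-1]
            | p[:, 1:-1, :-2] | p[:, 1:-1, 2:])


def erode_in_plane(mask):
    """Shrink the mask by one pixel per slice by dilating the background.

    Pixels outside the image are treated as foreground, so a mask touching
    the image border is not eroded from that side.
    """
    return ~dilate_in_plane(~np.asarray(mask, dtype=bool))


def icc_oneway(table):
    """ICC(1,1) per feature for an array of shape (patients, variants, features).

    Constant features give 0/0 and therefore NaN.
    """
    x = np.asarray(table, dtype=float)
    n, k = x.shape[:2]
    patient_mean = x.mean(axis=1, keepdims=True)
    grand_mean = x.mean(axis=(0, 1), keepdims=True)
    ms_between = k * ((patient_mean - grand_mean) ** 2).sum(axis=(0, 1)) / (n - 1)
    ms_within = ((x - patient_mean) ** 2).sum(axis=(0, 1)) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Mask variants for a toy 3x3 lesion in a single 7x7 slice.
mask = np.zeros((1, 7, 7), dtype=bool)
mask[0, 2:5, 2:5] = True
n_dilated = int(dilate_in_plane(mask).sum())
n_eroded = int(erode_in_plane(mask).sum())

# Simulated feature values for 86 patients and 3 mask variants
# (original, dilated, eroded). In practice these would come from
# re-running the feature extraction on each mask variant.
rng = np.random.default_rng(0)
n_patients = 86
true_value = rng.normal(size=(n_patients, 1))
stable = true_value + 0.05 * rng.normal(size=(n_patients, 3))  # barely changes
unstable = rng.normal(size=(n_patients, 3))                    # mask-dominated
table = np.stack([stable, unstable], axis=-1)                  # (86, 3, 2)

icc = icc_oneway(table)
threshold = 0.8          # illustrative value only
keep = icc >= threshold  # boolean selector for stable features
```

Because the ICC is computed from the feature values alone, the resulting `keep` selector never sees the Gleason labels, which is what makes the rejection agnostic to the target class.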
The impact on classification performance was tested using a simple machine learning setup implemented in scikit-learn [3]. After stability-based feature selection, a fixed number of 25 features was selected using the minimum-redundancy maximum-relevance (mRMR) algorithm [4] to remove effects arising from reducing the number of features. To check the influence of restricting the number of features, a random selection of features matching the number of features at each ICC threshold was evaluated for comparison. In both cases, the classification performance was evaluated with a random forest in cross-validation. The target of the radiomics study was the discrimination of high- and low-grade Gleason scores based on the MR images. The setup was therefore trained to discriminate patients with a Gleason score of at most 6 from patients with a Gleason score of at least 7.
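The comparison between stability-based and random pre-selection can be sketched with scikit-learn as follows. Several parts are assumptions rather than details from the study: the data and ICC values are synthetic, scikit-learn has no mRMR implementation, so a plain mutual-information ranking (`SelectKBest` with `mutual_info_classif`, relevance only, no redundancy term) stands in for it, and the 5-fold stratified split, the ROC AUC metric, the forest size and the three thresholds are illustrative choices. The helper `cv_score` is hypothetical, and applying the same 25-feature selection to the random subset is one possible reading of the setup.

```python
from functools import partial

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the cohort: 86 patients, 200 features.
rng = np.random.default_rng(0)
n_patients, n_features = 86, 200
y = rng.integers(0, 2, size=n_patients)       # 1: Gleason >= 7, 0: Gleason <= 6
X = rng.normal(size=(n_patients, n_features))
X[:, :5] += y[:, None]                        # five weakly informative features
icc = rng.uniform(0.0, 1.0, size=n_features)  # placeholder ICC values
icc[:5] = 0.95                                # assume the informative ones are stable


def cv_score(X_sub, y, k=25, seed=0):
    """Cross-validated ROC AUC of 'select k features, then random forest'."""
    k = min(k, X_sub.shape[1])
    model = make_pipeline(
        # Stand-in for mRMR: rank by mutual information with the target.
        # Placing it inside the pipeline keeps the selection within each fold.
        SelectKBest(partial(mutual_info_classif, random_state=seed), k=k),
        RandomForestClassifier(n_estimators=100, random_state=seed),
    )
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    return cross_val_score(model, X_sub, y, cv=cv, scoring="roc_auc").mean()


results = {}
for threshold in (0.6, 0.7, 0.8):
    stable_idx = np.flatnonzero(icc >= threshold)
    # Random baseline with the same number of features as the stable set.
    random_idx = rng.choice(n_features, size=stable_idx.size, replace=False)
    results[threshold] = (cv_score(X[:, stable_idx], y),
                          cv_score(X[:, random_idx], y))

for threshold, (stable_auc, random_auc) in results.items():
    print(f"ICC >= {threshold}: stable {stable_auc:.2f}, random {random_auc:.2f}")
```

Matching the size of the random subset to the size of the stable subset separates the effect of stability from the effect of simply having fewer candidate features.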
1. Ingrisch M et al. Radiomic Analysis Reveals Prognostic Information in T1-Weighted Baseline Magnetic Resonance Imaging in Patients With Glioblastoma. Invest Radiol. 2017 Jun; 52(6):360-36
2. van Griethuysen, J. J. M., Fedorov, A., Parmar, C., Hosny, A., Aucoin, N., Narayan, V., Beets-Tan, R. G. H., Fillion-Robin, J. C., Pieper, S., Aerts, H. J. W. L. (2017). Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Research, 77(21), e104–e107.
3. Pedregosa et al., Scikit-learn: Machine Learning in Python. JMLR 12, pp. 2825-2830, 2011.
4. Brown, Gavin et al. Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection. JMLR 2012.