A method is proposed for predicting long-term treatment failure using “eigentumors”: principal components computed from volumes surrounding breast tumors in contrast-enhanced MR images. The dataset contains pre-treatment scans of 563 consecutively included patients with early-stage breast cancer, with a median follow-up of 86 months. Principal components of washin and washout in box-shaped regions surrounding the tumors were computed, and LASSO and logistic regression were used to construct a model for predicting the probability of treatment failure. ROC analysis yields a bootstrapped AUC of 0.73, and bootstrapped Kaplan-Meier survival curves based on the model’s outcome show significant separation (χ² = 32.89, P < 0.0001).
The dataset consisted of 1.5 T pre-treatment dynamic contrast-enhanced (DCE) MR images of consecutively included patients diagnosed with early-stage breast cancer. These patients were eligible for breast-conserving therapy and were subsequently treated according to the national health guidelines. The outcome under investigation was overall survival (OS). Here, events included death from breast cancer as well as death from other causes [1].
Bounding boxes were placed around the tumors. The washin and washout intensities were computed for the voxels inside these bounding boxes. These volumes were then rescaled to a uniform size and formatted into feature vectors containing the washin and washout intensities. Principal components were computed and, from the components that together explained 90% of the data’s variance, candidate components (i.e., candidate eigentumors) for predicting treatment failure at 140 months follow-up were selected by the least absolute shrinkage and selection operator (LASSO). The selected components were used in a logistic regression model to predict the probability of an OS event. Performance was evaluated with receiver operating characteristic (ROC) analysis by determining area under the curve (AUC) values. Internal validation was performed by bootstrapping with 1000 cycles and a sample size equal to the total number of cases in the dataset. Kaplan-Meier survival curves were computed over the bootstrap cycles. The median and 95% confidence interval curves were determined, and the log-rank test statistic was used to determine whether the groups separated by the prediction model differed significantly.
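The bootstrapped ROC evaluation described above can be illustrated with a minimal sketch. It assumes the predicted event probabilities and the observed OS events are already available as plain lists; the function names `auc` and `bootstrap_auc` are hypothetical and not part of the original analysis. The sketch only resamples fixed predictions with replacement; a full internal validation would presumably repeat the component selection and model fitting in every cycle, which is not shown here.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of events and
    non-events (equivalent to the Mann-Whitney U statistic; ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc(scores, labels, cycles=1000, seed=0):
    """Resample cases with replacement (sample size = number of cases) and
    return the median AUC with a 95% percentile interval.

    Assumption: the scores are fixed; the model is NOT refitted per cycle.
    """
    rng = random.Random(seed)
    n = len(scores)
    values = []
    while len(values) < cycles:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the two classes
            values.append(auc([scores[i] for i in idx], ys))
    values.sort()
    lower = values[int(0.025 * (cycles - 1))]
    upper = values[int(0.975 * (cycles - 1))]
    median = values[cycles // 2]
    return median, (lower, upper)
```

With rare events, as in a dataset where only a small fraction of patients has an OS event, some resamples may contain no events at all; the sketch redraws those rather than counting them, which is one of several reasonable design choices.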
A total of 563 patients were included. The age at diagnosis ranged from 26 to 84 years, with median age of 57. Tumor diameters ranged between 5 and 90 mm. The median follow-up time was 86 months, and a total of 53 OS events were recorded.
Ninety percent of the variance in the tumor data was explained by 322 principal components. From these components, 28 were selected by the LASSO for prediction of treatment failure (Figure 1). ROC analysis yielded training and bootstrapped AUC values of 0.88 and 0.73, respectively (Figure 2). Stratification of the cases by the model’s bootstrapped outcomes (survival of the patient, “yes” or “no”) showed significant separation in the Kaplan-Meier survival curves (median values: χ² = 32.89, P < 0.0001) (Figure 3).
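For readers unfamiliar with the log-rank statistic used to compare the two predicted groups, the following is a minimal sketch of a standard two-group log-rank test. It assumes follow-up times, event indicators (1 = OS event, 0 = censored), and binary group labels are given as lists; the function name `logrank_chi2` is hypothetical, and the sketch is not the code used to produce the reported values.

```python
import math


def logrank_chi2(times, events, groups):
    """Two-group log-rank test.

    times  : follow-up time per patient
    events : 1 if an event occurred at that time, 0 if censored
    groups : 0 or 1 (e.g. the model's predicted outcome)
    Returns (chi2, p) for 1 degree of freedom.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, overall and in group 1
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == 1)
        # Events occurring exactly at time t, overall and in group 1
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e == 1 and g == 1)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:  # hypergeometric variance is undefined for a single patient
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("groups cannot be compared (zero variance)")
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p
```

In the bootstrap setting described in this abstract, such a statistic would be computed per cycle and then summarized, for example by its median.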
1. Hudis CA, Barlow WE, Costantino JP, Gray RJ, Pritchard KI, Chapman JAW, Sparano JA, Hunsberger S, Enos RA, Gelber RD, Zujewski JA. Proposal for standardized definitions for efficacy end points in adjuvant breast cancer trials: the STEEP system. J Clin Oncol. 2007;25(15):2127-2132.
2. Morris EA, Comstock CE, Lee CH. ACR BI-RADS® Magnetic Resonance Imaging. In: ACR BI-RADS® Atlas, Breast Imaging Reporting and Data System. Reston, Virginia: American College of Radiology; 2013.