Factors influencing Data Quality in a Multi-Center Breast MR Spectroscopy Trial (ACRIN 6657 Extension)
Patrick J Bolan1, Benjamin A Herman2, Gregory J Metzger1, Eunhee Kim3, David C Newitt4, Savannah Partridge5, Michael Garwood1, Mark A Rosen6, and Nola M Hylton4

1Radiology, University of Minnesota, Minneapolis, MN, United States, 2Center for Statistical Sciences, Brown University, Providence, RI, United States, 3National Institute of Neurological Disorders and Stroke, NIH, Bethesda, MD, United States, 4Radiology, University of California, San Francisco, CA, United States, 5Radiology, University of Washington, Seattle, WA, United States, 6Radiology, Hospital of the University of Pennsylvania, Philadelphia, PA, United States


The ACRIN 6657-extension trial, the first multi-center trial using magnetic resonance spectroscopy (MRS) in breast cancer, has been completed and its initial results were recently published. This study reports on the quality of the MRS data and identifies technical and logistic factors that contributed to a lower-than-anticipated data yield.


The ACRIN 6657-extension trial was the first multi-center trial using magnetic resonance spectroscopy (MRS) in breast cancer. The primary aim was to estimate how accurately MRS measurements of the total choline concentration ([tCho]), made very early in treatment, predict response to neoadjuvant chemotherapy in patients with locally advanced breast cancer. The primary results of this study were recently published [1,2]. A major limitation of this trial was the low yield of analyzable datasets: of the 119 subjects enrolled, only 29 cases were usable for addressing the primary aim. The purpose of this study is to evaluate the quality of the MRS data and identify factors that contributed to the lower-than-expected yield.


Nine sites, using 1.5 T and 3 T systems from multiple vendors, participated in the trial. Details of the trial design and MRS measurement procedures are presented in Ref [2]. Briefly, patients received MRI/MRS scans at baseline (TP1), 1-4 days after the first treatment (TP2), between chemotherapy regimens (TP3), and prior to surgery (TP4). Single-voxel MRS was acquired with and without water suppression to measure the tCho and T2-corrected water resonances, as previously described [3]. Spectra were processed and fit to estimate peak amplitudes. Detection of a tCho peak was determined using criteria based on SNR, Cramér-Rao bounds, and fit quality. The tCho concentration was calculated in molal units (mmol/kg-water) using water as an internal reference. Raw and processed data were subjectively scored for quality on a good/fair/poor scale, and data scored as poor were excluded from the primary analyses.
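The internal water referencing step can be illustrated with a minimal sketch. This is not the trial's processing code: the proton counts (9 for the tCho trimethylamine resonance, 2 for water), the function name, and the optional relaxation-correction factors are assumptions introduced here for illustration, and the actual corrections used are those described in Ref [3].

```python
# Minimal sketch of internal water referencing (assumptions noted inline).
WATER_MOLAR_MASS_G_PER_MOL = 18.015
# mmol of water per kg of water (about 55,509 mmol/kg)
WATER_MOLALITY_MMOL_PER_KG = 1000.0 / WATER_MOLAR_MASS_G_PER_MOL * 1000.0

N_PROTONS_TCHO = 9   # assumed: trimethylamine protons contributing to the tCho peak
N_PROTONS_WATER = 2


def tcho_molal(amp_tcho, amp_water, corr_tcho=1.0, corr_water=1.0):
    """Estimate [tCho] in mmol/kg-water from fitted peak amplitudes.

    corr_tcho and corr_water are hypothetical multiplicative relaxation
    corrections (1.0 means no correction is applied).
    """
    if amp_water <= 0:
        raise ValueError("water amplitude must be positive")
    per_proton_tcho = amp_tcho * corr_tcho / N_PROTONS_TCHO
    per_proton_water = amp_water * corr_water / N_PROTONS_WATER
    return per_proton_tcho / per_proton_water * WATER_MOLALITY_MMOL_PER_KG


if __name__ == "__main__":
    # Made-up amplitudes in arbitrary units, for illustration only.
    print(f"{tcho_molal(amp_tcho=1.2, amp_water=20000.0):.2f} mmol/kg-water")
```

Normalizing each amplitude by its proton count before taking the ratio keeps the result independent of the arbitrary signal scale shared by the water-suppressed and unsuppressed acquisitions.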

A quality control (QC) program was used to help control MRS data quality [4]. Prior to enrolling patients, each site was required to qualify by submitting MRS measurements with acceptable accuracy and spectral quality from a trial-specific spectroscopy phantom (shown in Figure 1). Qualification was on a per-system basis, where a system was defined as the unique combination of MR scanner, breast coil, system software, and pulse sequence. A change in any of these required a new system qualification. Phantom QC measurements were acquired biweekly for each system throughout patient acquisition.


Overall Data Attrition: Figure 2 shows the breakdown of data attrition for the primary analyses. A total of 119 subjects were enrolled in the trial. Seven subjects withdrew before completing the study, 3 were ineligible due to development of metastases, 2 received a non-compliant treatment regimen, and 5 were imaged at incorrect timepoints, leaving 102 cases. In 35 of these, MRS data at TP1 or TP2 were either not acquired (n=20) or lost (n=15), leaving 67 cases with MRS data available at both TP1 and TP2. Of these, 16 had poor-quality MRS and 18 had no measurable tCho at either TP1 or TP2, leaving 33 cases with analyzable MRS.
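The attrition arithmetic can be checked with a short tally. The counts below are taken from the text; the function and variable names are illustrative only.

```python
# Running tally of the attrition counts reported in the text.
ENROLLED = 119

EXCLUSIONS = [
    ("withdrew before completing the study", 7),
    ("ineligible (development of metastases)", 3),
    ("non-compliant treatment regimen", 2),
    ("imaged at incorrect timepoints", 5),
    ("MRS not acquired at TP1 or TP2", 20),
    ("MRS lost at TP1 or TP2", 15),
    ("poor-quality MRS", 16),
    ("no measurable tCho at TP1 or TP2", 18),
]


def attrition_table(start, steps):
    """Return (reason, excluded, remaining) for each exclusion step."""
    remaining = start
    rows = []
    for reason, n in steps:
        remaining -= n
        rows.append((reason, n, remaining))
    return rows


rows = attrition_table(ENROLLED, EXCLUSIONS)
for reason, n, remaining in rows:
    print(f"{reason:<42} -{n:>2}  remaining: {remaining}")
```

The running total passes through 102 cases after the eligibility exclusions and 67 after the missing-data exclusions, and ends at 33 cases with analyzable MRS.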

Undetectable Pre-treatment tCho: To determine what contributed to spectra with unmeasurable tCho, 14 potential explanatory factors, both technical and biological, were identified and independently evaluated using logistic regression models with chi-square tests (Table 1). Only four factors were associated with a higher likelihood of tCho detection: low lipid fraction (p<0.001), small voxel size (p=0.04), narrow water linewidth (p=0.002), and higher field strength (p=0.006).
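A single-factor analysis of this kind can be sketched as follows. This is an illustration, not the trial's statistical code: the data are synthetic, the predictor name is hypothetical, and the likelihood-ratio form of the chi-square test (1 degree of freedom) is an assumption, since the text does not state which chi-square test was used.

```python
import math


def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson.

    Returns (b0, b1, log_likelihood).
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Gradient of the log-likelihood
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian)
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
    ll = sum(yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
             for yi, pi in zip(y, p))
    return b0, b1, ll


def likelihood_ratio_test(x, y):
    """Return (odds_ratio_per_unit_x, chi-square statistic, p-value), 1 df."""
    _, b1, ll_full = fit_logistic(x, y)
    p_bar = sum(y) / len(y)
    ll_null = sum(yi * math.log(p_bar) + (1 - yi) * math.log(1.0 - p_bar)
                  for yi in y)
    stat = 2.0 * (ll_full - ll_null)
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return math.exp(b1), stat, p_value


# Synthetic data: hypothetical lipid fraction vs. tCho NOT detected (1 = not detected)
lipid_fraction = [0.05, 0.10, 0.12, 0.20, 0.25, 0.30,
                  0.35, 0.40, 0.50, 0.55, 0.60, 0.70]
not_detected = [0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

odds_ratio, lr_stat, p_value = likelihood_ratio_test(lipid_fraction, not_detected)
print(f"odds ratio per unit x = {odds_ratio:.3g}, "
      f"chi-square = {lr_stat:.2f}, p = {p_value:.3f}")
```

Coding the outcome as non-detection follows the convention of Table 1, in which higher odds ratios indicate a greater likelihood of not detecting tCho. In practice a statistics package would normally be used rather than a hand-written fit.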

Spectral Quality: A total of 165 in vivo MRS datasets from 91 patients were submitted for TP1 and TP2. The subjective quality scores were good (47/165, 28%), fair (85/165, 52%), and poor (33/165, 20%). The poor cases were attributed to system calibration failure (9), a single unstable system (12), poor shim/baseline (3), inconsistency between water and metabolite scans (5), incomplete acquisition (3), and patient motion (1).

Phantom Reproducibility: Repeated phantom measurements of acceptable quality were available for 15 systems (scanner/coil/software/sequence) used in the trial. Figure 3 shows the mean [tCho] and within-system coefficient of variation (wCV = standard deviation / mean) by system, site, vendor, and field strength. The mean [tCho] measured across systems was 0.85 ± 0.11 mmol/kg-water, which is lower than the prepared concentration of 1.0 mmol/kg. The wCV (lower plot) ranged from 8.0% to 31%, which indicates high variability both within and between systems.
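The wCV definition above can be sketched in a few lines. The phantom values and system labels below are hypothetical, and the use of the sample (n-1) standard deviation is an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def within_system_cv(measurements):
    """wCV = standard deviation / mean of repeated measurements on one system.

    Uses the sample (n-1) standard deviation, which is an assumption here.
    """
    if len(measurements) < 2:
        raise ValueError("need at least two repeated measurements")
    return stdev(measurements) / mean(measurements)


# Hypothetical repeated phantom [tCho] measurements (mmol/kg-water) per system
phantom_tcho = {
    "system_A": [0.80, 0.90, 0.85, 0.95],
    "system_B": [0.60, 1.00, 0.80, 1.10],
}

for system, values in phantom_tcho.items():
    print(f"{system}: mean = {mean(values):.2f}, "
          f"wCV = {100 * within_system_cv(values):.1f}%")
```

Because wCV is normalized by each system's own mean, it separates reproducibility from any systematic offset between the measured and prepared concentrations.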

Discussion and Conclusions

The experience of this trial provides guidance for improving data quality in future studies using breast MRS. Stronger phantom qualification, specifically focusing on measurement reproducibility, would improve the ability to detect longitudinal changes in vivo. Restricting participation to 3 T systems and to sites/systems with good performance would improve tCho detection rates and measurement consistency. The association of water linewidth, lipid fraction, and voxel size with tCho detection suggests that improved voxel placement, either by training or automation, could further improve data quality.


This trial was supported by the National Cancer Institute’s grants to ACRIN (U01 CA079778, U01 CA080098), ECOG-ACRIN (U10 CA180794), and CALGB/ISPY (CA31964, CA33601, CA58207). Additional support was provided by NCI R01 CA120509, NCRR P41 RR00879, and NIBIB P41 EB015894.


1. Bolan PJ, Kim E, Herman BA, et al. Magnetic Resonance Spectroscopy of Breast Cancer for Assessing Early Treatment Response: Results from the ACRIN 6657 MRS Trial. In: Proceedings of the 102nd RSNA, Chicago; 2016. p. 16002277.

2. Bolan PJ, Kim E, Herman BA, et al. Magnetic Resonance Spectroscopy of Breast Cancer for Assessing Early Treatment Response: Results from the ACRIN 6657 MRS Trial. J. Magn. Reson. Imaging 2016.

3. Meisamy S, Bolan PJ, Baker EH, et al. Neoadjuvant chemotherapy of locally advanced breast cancer: predicting response with in vivo (1)H MR spectroscopy--a pilot study at 4 T. Radiology 2004;233:424–431. doi: 10.1148/radiol.2332031285.

4. Bolan PJ, Garwood M, Rosen MA, Levering A, Blume JD, Gimpel J, Esserman LJ, Hylton NM. Design of Quality Control Measures for a Multi-Site Clinical Trial of Breast MRS - ACRIN 6657. In: Proceedings of the 16th Annual Meeting ISMRM. Toronto; 2008. p. 1588.


Figure 1 – Trial-specific phantom, showing A) illustration of phantom design, B) placement within a breast coil for scanning, C) voxel placement and spectra, and D) spectra from normal and control (no PCho) phantoms. Qualification required scans of both the normal and control phantoms; reproducibility measurements were performed only in the normal (+PCho) phantoms.

Figure 2 – Breakdown of the 119 subjects enrolled and the eligibility of their datasets for use in the primary study analyses.

Figure 3 – Phantom reproducibility results. The top plot shows the mean and standard deviation of phantom measurements by system number. The lower plot shows the within-system coefficient of variation for each system, and the number of repeated measurements available for each (in parentheses). The site, field strength, and vendor (green=Siemens, blue=GE, magenta=Philips) for each system are indicated below the abscissa.

Table 1 – Results of logistic regression analysis to assess which factors were associated with non-detection of tCho. Higher odds ratios indicate a greater likelihood of not detecting tCho.

Proc. Intl. Soc. Mag. Reson. Med. 25 (2017)