Automatic quality assessment of short and long-TE brain tumour MRSI data using novel Spectral Features
Nuno Miguel Pedrosa de Barros1,2, Urspeter Knecht1, Richard McKinley1, Jonathan Giezendanner1, Roland Wiest1, and Johannes Slotboom1

1Institute for Diagnostic and Interventional Neuroradiology, Inselspital, Bern, Switzerland, 2University of Bern, Bern, Switzerland

Synopsis

MRSI data frequently contains bad-quality spectra, which strongly limits its clinical use. Current clinical practice in our institute is that these bad-quality spectra are filtered out by an MRS expert, at the expense of long processing times. In this work we present a new method for automatic quality assessment of both long- and short-TE MRSI brain tumour data. This method is based upon a novel set of spectral features, and it is as accurate as an expert but considerably faster (3-4 minutes vs 3 seconds).

Purpose

To obtain accurate classifiers for assessing spectral quality of short and long-TE MRSI data from brain tumour patients to be used in clinical routine.

Methods

Data was acquired at 1.5T (Siemens Aera, Avanto) using PRESS, CHESS water-suppression, and a 32x32 grid (interpolated from 12x12). In each imaging study, short (30ms) and long (135ms) TE MRSI were recorded sequentially at the same location. A total of 78 MRSI-recordings (19032 spectra) from 12 different brain-tumour patients, acquired pre- and post-operatively, were included in the study. The measurements were performed in conformance with local and national ethical regulations. Only spectra from within the PRESS-box were considered. Residual-water-peak-removal was performed prior to feature extraction (jMRUI’s HLSVD1).

The spectra were manually labelled by two expert spectroscopists as either acceptable or non-acceptable, using jMRUI’s SpectrIm plug-in1,2. The criteria for rejecting spectra were: “ghosting” artifacts, bad shimming, low SNR, lipid contamination, strongly deviating phase, and post-operative artifacts. First, the experts labelled the spectra independently. Then, they jointly reviewed the spectra on which they disagreed, reaching a consensus labelling. The resulting consensus labels constituted the ground-truth used for training and testing the automatic classifiers.

A total of 47 features were extracted from the magnitude time-domain (TD) and frequency-domain (FD) signals. The following types of features were used:

1. Maximum-peak-SNR in given range (FD)

2. Mean-SNR in given range (FD, TD)

3. Relative-change in given range (TD)

4. Global features (maximum, time-point/ppm-value maximum, mean, standard-deviation, skewness, kurtosis) (FD, TD)
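
The global features and the range-based SNR features listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the noise estimate `noise_std` is assumed to be supplied by the caller (the abstract does not define how noise was measured), kurtosis is taken here as the non-excess fourth standardised moment, and the "relative-change" feature is omitted because its definition is not given.

```python
import statistics


def global_features(signal, axis):
    """Global descriptors of a magnitude signal (TD or FD).

    `signal` holds magnitude values; `axis` holds the matching
    time points (TD) or ppm values (FD).
    """
    n = len(signal)
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)  # population standard deviation
    peak_index = max(range(n), key=lambda i: signal[i])
    # Third and fourth central moments, standardised below.
    m3 = sum((x - mean) ** 3 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    return {
        "maximum": signal[peak_index],
        "position_of_maximum": axis[peak_index],
        "mean": mean,
        "standard_deviation": std,
        "skewness": m3 / std ** 3 if std > 0 else 0.0,
        "kurtosis": m4 / std ** 4 if std > 0 else 0.0,
    }


def values_in_range(signal, axis, low, high):
    """Signal values whose axis position lies in [low, high]."""
    return [s for s, a in zip(signal, axis) if low <= a <= high]


def max_peak_snr(signal, axis, low, high, noise_std):
    """Highest value in the range divided by an assumed noise level."""
    return max(values_in_range(signal, axis, low, high)) / noise_std


def mean_snr(signal, axis, low, high, noise_std):
    """Mean value in the range divided by an assumed noise level."""
    return statistics.fmean(values_in_range(signal, axis, low, high)) / noise_std


# Toy signal (made-up numbers, for illustration only).
signal = [1.0, 2.0, 5.0, 2.0, 1.0]
axis = [0, 1, 2, 3, 4]
features = global_features(signal, axis)
peak_snr = max_peak_snr(signal, axis, 1, 3, noise_std=0.5)
avg_snr = mean_snr(signal, axis, 1, 3, noise_std=0.5)
```

Applying such functions over several ranges of the TD and FD signals would yield one feature vector per spectrum; the specific ranges used to reach 47 features are not stated in this abstract.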

The strategy used for validation was Leave-Patient-Out-Cross-Validation (LPOCV): in each iteration, the complete dataset of one patient was excluded from the training dataset and used as the testing dataset. Short- and long-TE spectra were handled separately.
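
The splitting logic of LPOCV can be sketched as follows. This is an illustrative sketch only: the function name and the patient identifiers are hypothetical, and each spectrum is assumed to carry the identifier of the patient it came from.

```python
from collections import defaultdict


def leave_patient_out_splits(patient_ids):
    """Yield (held_out_patient, train_indices, test_indices) per patient.

    All spectra of the held-out patient go to the test set, so no
    patient contributes to both training and testing in the same fold.
    """
    by_patient = defaultdict(list)
    for index, patient in enumerate(patient_ids):
        by_patient[patient].append(index)
    for held_out in sorted(by_patient):
        test = by_patient[held_out]
        train = sorted(
            i for patient, indices in by_patient.items()
            if patient != held_out
            for i in indices
        )
        yield held_out, train, test


# Six spectra from three hypothetical patients.
patient_ids = ["P01", "P01", "P02", "P03", "P03", "P03"]
splits = list(leave_patient_out_splits(patient_ids))
```

With 12 patients this scheme produces 12 folds, and therefore 12 trained classifiers per echo time. Grouping by patient rather than by spectrum keeps neighbouring voxels of the same recording out of the training set when that patient is being tested.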

A random-forest3 (RF) classifier (“R” implementation, 500 trees, maximum depth) was used for the automatic assessment. To evaluate the relative-importance of the input-features in the classification task of both short- and long-TE spectra, the mean-decrease-in-accuracy3 after feature-permutation was measured.
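
The idea behind the permutation-based importance measure can be sketched in a few lines. This is a simplified illustration, not the "R" implementation used in the study: Breiman's measure is computed on the out-of-bag samples of each tree, whereas the sketch below shuffles feature columns of a held-out set and uses a toy stand-in (`toy_predict`, hypothetical) in place of a trained random forest.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which the prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with feature j replaced by a shuffled value.
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_predict(row):
    """Toy stand-in for a trained classifier: it only looks at feature 0."""
    return "acceptable" if row[0] > 0.5 else "non-acceptable"


# Made-up feature rows; labels are generated by the toy classifier itself.
rows = [[0.9, 0.1], [0.8, 0.7], [0.2, 0.3], [0.1, 0.9], [0.7, 0.5], [0.3, 0.2]]
labels = [toy_predict(r) for r in rows]
importances = permutation_importance(toy_predict, rows, labels)
```

A feature the classifier ignores shows no drop in accuracy when shuffled, while a feature it relies on shows a clear drop; ranking features by this drop gives the kind of importance plot described in the text.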

Results

The agreement between the initial labels of the experts (before reaching a consensus) was 88.78% for long-TE and 85.04% for short-TE. On average, the experts required 3 min 36 s to label each MRSI grid (~245 spectra).
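
As a minimal sketch, the agreement figure can be read as simple percentage agreement between the two sets of initial labels (an assumption; the abstract does not name the agreement statistic). The labels below are made up for illustration, and the last lines reproduce the per-spectrum labelling time implied by the reported averages.

```python
def percent_agreement(labels_a, labels_b):
    """Percentage of spectra on which two raters assigned the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both raters must label the same spectra")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


# Hypothetical initial labels (1 = acceptable, 0 = non-acceptable).
expert_1 = [1, 1, 0, 1, 0, 0, 1, 1]
expert_2 = [1, 1, 0, 0, 0, 0, 1, 1]
agreement = percent_agreement(expert_1, expert_2)

# Labelling time per spectrum implied by the reported averages:
# 3 min 36 s per grid of about 245 spectra.
seconds_per_grid = 3 * 60 + 36
seconds_per_spectrum = seconds_per_grid / 245
```

Under these reported averages an expert spends a little under one second per spectrum.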

The performance indicators for both short- and long-TE data are shown in Figure 1. Tukey boxplots of the error-rate per MRSI-grid are presented in Figure 2. The feature-importance plots for both short- (blue) and long-TE (green) are shown in Figure 3. Finally, Figure 4 shows several maps for two example-cases: ground-truth and classifier’s prediction (probability of acceptable) for short- and long-TE, the most important feature for short-TE (TD Mean SNR in the range between 75 and 100 ms) and the two most important features for long-TE (FD-skewness and kurtosis). To select representative cases, the two examples presented here were chosen such that their error-rates were close to the median error-rates of both classifiers.
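
The per-grid error-rate and the quantities drawn in a Tukey boxplot can be sketched as below. The error-rates are made-up values, not results from this study, and the quartile convention (`method="inclusive"`) is an assumption, since plotting packages differ on this point.

```python
import statistics


def error_rate(predicted, truth):
    """Fraction of spectra in one grid whose predicted label is wrong."""
    return sum(p != t for p, t in zip(predicted, truth)) / len(truth)


def tukey_summary(values):
    """Quartiles, whiskers and outliers as drawn in a Tukey boxplot.

    Whiskers extend to the most extreme values lying within
    1.5 x IQR of the quartiles; values beyond that are outliers.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Hypothetical error-rates of six MRSI grids.
grid_error_rates = [0.05, 0.08, 0.10, 0.12, 0.15, 0.40]
summary = tukey_summary(grid_error_rates)
```

Choosing example cases near the median of such a distribution, as done for Figure 4, avoids showing only the best or the worst grids.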

Discussion

The results show that the error of the classifiers is at the same level as the average disagreement between the experts.

Regarding short- and long-TE data, the results show that it is more challenging to assess the quality of short-TE than of long-TE spectra. This is supported by the higher disagreement between the experts in the assessment of short-TE spectra, as well as by the higher error-rate observed in the automatic classification of short-TE data. A possible reason is that, although short-TE data has a higher SNR, artifacts also have a higher SNR at short TE, which leads to a smaller number of acceptable spectra (Figure 1). These broad artifacts relax faster than metabolite signals and therefore have a smaller impact on long-TE spectra. Moreover, the variance added by the characteristic short-TE macromolecular baseline (affecting almost every spectral feature) makes the automatic classification of this data more difficult. This might also explain why features such as TD-skewness and kurtosis, which are the two most important features for the classification of long-TE data, drastically lose their importance in short-TE.

Finally, the maps presented for the two cases of Figure 4 show the high agreement between the classifier’s prediction and the ground-truth. The same figure also shows that the values of the presented features correlate strongly with spectral quality.


Conclusion

A novel method for automatic assessment of both short- and long-TE MRSI-spectra was presented. The method shows a level of accuracy comparable to that of an expert and uses a new set of spectral features that correlate strongly with spectral quality. The method minimizes the expert time needed for clinical routine MRSI-analysis.

Acknowledgements

This work was funded by the EU Marie Curie FP7-PEOPLE-2012-ITN project TRANSACT (PITN-GA-2012-316679) and the Swiss National Science Foundation (project number 140958).

References

1. jMRUI website: http://www.jmrui.eu/.

2. Barros N, Jablonski M, Pica A, Starcukova J, Knecht U, Wiest R, Slotboom J. Unifying clinical routine brain tumor MR-Spectroscopy and MR-Image analysis: novel jMRUI plug-ins for brain tumor analysis. Neuro-Oncology 2014;16(suppl 2):ii78-ii78.

3. Breiman L. Random forests. Machine learning 2001;45(1):5-32.

4. Wright AJ, Kobus T, Selnaes KM, Gribbestad IS, Weiland E, Scheenen TW, Heerschap A. Quality control of prostate 1H MRSI data. NMR Biomed 2013;26(2):193-203.

5. Menze BH, Kelm BM, Weber MA, Bachert P, Hamprecht FA. Mimicking the human expert: pattern recognition for an automated assessment of data quality in MR spectroscopic images. Magn Reson Med 2008;59(6):1457-1466.

Figures

Figure 1 - Performance results for both short- and long-TE data. %Good represents the rate of spectra with acceptable quality in each dataset. The results were obtained using the described cross-validation scheme (LPOCV).

Figure 2 - Tukey boxplots of the error-rate obtained for the several MRSI-grids, for both short and long-TE data.

Figure 3 - Feature-importance-plot for short- and long-TE. For each dataset, the mean and standard-deviation of the error-increase were calculated from the 12 different random-forests that are trained in the described cross-validation scheme (LPOCV). The error bars show the region of the mean ±1 standard-deviation.

Figure 4 - Long- and short-TE ground-truth and classifier’s prediction (probability of acceptable) maps for two example-cases. The maps of the most-important features for short-TE data, as well as the two most-important features for the classification of long-TE data are also shown. All maps were created using jMRUI’s SpectrIm plugin2.



Proc. Intl. Soc. Mag. Reson. Med. 24 (2016)
0022