Sjoerd B Vos1,2, M Jorge Cardoso1, Marzena Wylezinska-Arridge3, David L Thomas1,3, Enrico De Vita3,4, Marios C Yiannakas5, David Carmichael6, John S Thornton3,4, Olga Ciccarelli5, John S Duncan2,7, and Sebastien Ourselin1
1Translational Imaging Group, University College London, London, United Kingdom, 2MRI Unit, Epilepsy Society, Chalfont St Peter, United Kingdom, 3Neuroradiological Academic Unit, Department of Brain Repair and Rehabilitation, UCL Institute of Neurology, London, United Kingdom, 4Lysholm Department of Neuroradiology, National Hospital for Neurology and Neurosurgery, London, United Kingdom, 5Department of Neuroinflammation, UCL Institute of Neurology, London, United Kingdom, 6Institute of Child Health, University College London, London, United Kingdom, 7Department of Clinical and Experimental Epilepsy, UCL Institute of Neurology, London, United Kingdom
Synopsis
Volumetric analysis of 3D T1-weighted images has become an integral part of the clinical work-up and
research studies. Variation between scanners, in both vendors and models, is a
major confound in combining imaging-derived biomarkers across sites. In this
work, we analyse test-retest data from different days on six 3 T scanners from
three vendors to quantify this inter-scanner variability compared to
intra-scanner variability. Contrast-to-noise ratio and volumetric
analyses show between-scanner variation in total brain volumes
– indicating different scanner calibrations – but also tissue-specific differences
– possibly arising from different effective contrasts.
Introduction
Volumetric analysis using 3D T1-weighted acquisitions (3D-T1) has become
an integral part of the clinical work-up for neurodegenerative diseases [1,2], epilepsy [3,
http://hipposeg.cs.ucl.ac.uk], and many clinical trials with both cross-sectional
and longitudinal designs [4]. A major confound in volumetric analysis is acquisition
protocol variation – most notably between vendors but also between specific
models, pulse sequences, and
parameter choices – compromising the integration of imaging-derived biomarkers across sites and their use for the
monitoring and management of patients scanned with different machines. In this
work we evaluate the variations in T1 image quality and volumetry across six 3 T MRI
scanners – from three main vendors – at a single site. This is a crucial step
towards the standardisation of these sequences to improve their translation to
clinical use and their harmonisation between centres.
Methods
Two healthy subjects (a 28-year-old male and a 46-year-old female) were scanned on six 3 T
scanners: a Philips Achieva, GE MR750, and Siemens Skyra, Trio, Prisma, and
Biograph mMR (PET-MR). Each subject was scanned three times on each scanner. To
discriminate potentially larger day-to-day variation from same-day scan-rescan changes
[5], each subject underwent two scans on one day with a five-minute interval between
scans and a further single scan on a separate day on each scanner. All
protocols are the established standard sequences used in routine clinical practice
and/or research studies, optimised for different dedicated applications and
patient populations. Acquisition details for each scanner are given in Table 1.
Scans were registered to an intermediate space to visualise contrast
differences between protocols. Tissue
segmentation was performed in native space using the Geodesic Image Flows
framework (GIF [6]), which includes bias-field correction. From these
segmentations, total intracranial volume (TIV), WM, cortical (cGM) and deep
(dGM) grey matter, and ventricular and non-ventricular CSF volumes were extracted.
Signal-to-noise ratios were quantified within each tissue type by
calculating mean and variance over the tissue, and the CNR between two tissue
types was defined as the absolute difference in mean values between those two
tissues divided by the weighted mean of the variances.
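The CNR definition above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the weighting of the variances is not specified in the text, so weighting by voxel count is an assumption, and the function names and the `use_sqrt` option (dividing by the standard deviation instead, as is common elsewhere) are hypothetical.

```python
import numpy as np


def tissue_stats(image, mask):
    """Return (mean, sample variance, voxel count) of intensities inside a boolean mask."""
    values = image[mask]
    return values.mean(), values.var(ddof=1), values.size


def cnr(image, mask_a, mask_b, use_sqrt=False):
    """Contrast-to-noise ratio between two tissues.

    Numerator: absolute difference of the two tissue means.
    Denominator: mean of the two variances, weighted by voxel count
    (ASSUMPTION: the weighting scheme is not given in the text).
    With use_sqrt=True the square root of that weighted variance is used
    instead (ASSUMPTION: an alternative convention, not from the text).
    """
    mean_a, var_a, n_a = tissue_stats(image, mask_a)
    mean_b, var_b, n_b = tissue_stats(image, mask_b)
    weighted_var = (n_a * var_a + n_b * var_b) / (n_a + n_b)
    denominator = np.sqrt(weighted_var) if use_sqrt else weighted_var
    return abs(mean_a - mean_b) / denominator
```

In practice `image` would be the bias-corrected T1 volume and the masks would come from the tissue segmentation, for example WM and cGM labels.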
Results
Fig. 1 shows equivalent axial and coronal slices for all six scanners,
in which contrast differences between the protocols are evident. The
clearest difference is in the GM-WM contrast. The protocol
on the Achieva has the highest visual GM-WM contrast, arguably at the cost of lower
cGM-CSF contrast – as confirmed by the estimated CNR (Table 2). By contrast,
images from the Siemens scanners tend to have higher GM-CSF contrast.
Volumetric
analyses are shown in Figs. 2 and 3. The maximum difference in TIV between scanners
is around 60 ml, or just under 4%, whereas the maximum intra-scanner TIV variation is
well under 1%. Variation in TIV across scanners is consistent across subjects.
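One way to compute the inter- and intra-scanner percentages quoted above is sketched below. The exact definition used in the study is not stated, so expressing the range (maximum minus minimum) as a percentage of the mean is an assumption, and the function names are hypothetical.

```python
def percent_range(values):
    """Range (max - min) of the values as a percentage of their mean."""
    mean = sum(values) / len(values)
    return 100.0 * (max(values) - min(values)) / mean


def scanner_variation(volumes_by_scanner):
    """Summarise variation for one subject.

    volumes_by_scanner maps a scanner name to its list of repeat volumes (ml).
    Returns (largest intra-scanner % range, inter-scanner % range), where the
    inter-scanner value is computed on the per-scanner mean volumes.
    ASSUMPTION: this range-over-mean definition is illustrative only.
    """
    intra = max(percent_range(v) for v in volumes_by_scanner.values())
    scanner_means = [sum(v) / len(v) for v in volumes_by_scanner.values()]
    inter = percent_range(scanner_means)
    return intra, inter


# Hypothetical TIV values in ml, NOT data from this study.
example = {
    "scanner_A": [1500.0, 1505.0, 1495.0],
    "scanner_B": [1560.0, 1558.0, 1562.0],
}
intra_pct, inter_pct = scanner_variation(example)
```

Applying the same function to TIV-normalised tissue volumes would give the corresponding tissue-specific comparison.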
The differences in contrast (Fig. 1) result in pronounced variations in
segmented tissue volume – even when normalised by TIV. Most notably, the
Biograph mMR yields WM volumes 1% higher than the other scanners, and the MR750
yields cGM estimates around 1-1.5% higher. In both cases there is no corresponding,
equal decrease in adjacent tissues, indicating that the difference does not arise from
a single tissue interface being classified differently. Intra-scanner variation is low on all
scanners for WM, cGM, dGM, and non-ventricular CSF, with the lateral ventricles
showing the largest inter-scan variability.
Discussion
Inter-scanner variation in volumetric brain analysis
is larger than intra-scanner variation. Differences in TIV are possibly
indicative of slight gradient calibration variations, leading to minor scaling differences
between the true head dimensions and each scanner’s representation thereof. However,
estimated tissue volumes, normalised for TIV, still vary more across scanners
than within scanners. Slight differences in imaging parameters between scanners
may be the cause. Sequence implementation differences across vendors might also
contribute to between-scanner variation, and the relatively close clustering of
volume estimates across the Siemens Skyra, Trio, and Prisma systems, relative
to those from the other scanners, suggests vendor-based differences – although
the Siemens Biograph mMR shows a large deviation in WM volumes compared to all
other scanners. With respect to CNR estimates, the vendors employ different inline
or optional image reconstruction filters which might bias the comparison across
vendors. Future work will extend this study to a larger subject group and investigate
further metrics including localised analyses and cortical thickness.
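To give a sense of scale for the gradient-calibration interpretation of the TIV differences discussed above, the sketch below converts a fractional volume difference into the equivalent linear scaling error. It assumes the scaling is identical along all three axes, which is an illustrative simplification not stated in the text; the function name is hypothetical.

```python
def linear_scale_error(volume_fraction_diff):
    """Isotropic linear scaling error equivalent to a fractional volume difference.

    ASSUMPTION: the same scale factor s applies along all three axes, so the
    volume scales as s**3 and s = (1 + volume_fraction_diff) ** (1/3).
    """
    return (1.0 + volume_fraction_diff) ** (1.0 / 3.0) - 1.0


# A TIV difference of 4% corresponds to a linear scaling error of about 1.3%
# under the isotropic assumption.
scale_error = linear_scale_error(0.04)
```

Under this assumption, a volume difference of just under 4% would require only slightly more than a 1% discrepancy in gradient scaling per axis.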
Acknowledgements
SBV is funded by the UK National Institute for
Health Research UCLH Biomedical Research Centre High Impact Initiative. Part of
this work was undertaken at UCLH/UCL who received a proportion of funding from
the Department of Health’s NIHR Biomedical Research Centres funding scheme. We
are grateful to the Wolfson Foundation and the Epilepsy Society for supporting
the Epilepsy Society MRI scanner, and the UK MS Society for supporting the NMR
Unit scanner. DLT is supported by the UCL Leonard Wolfson Experimental
Neurology Centre (PR/ylr/18575).
References
[1] Heijer et al., Brain 2010.
[2] Rohrer et al., The Lancet Neurology 2015.
[3] Winston et al., Epilepsia 2013.
[4] Salloway et al., N Engl J Med 2014.
[5] Maclaren et al., Nature Scientific Data 2014.
[6] Cardoso et al., IEEE TMI 2015.